[Algorithm Notes] Sorting Algorithms

Sorting Algorithms: Selection Sort

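Basic Concepts

Selection sort is a simple sorting algorithm. On each pass it selects the smallest (or largest) element from the unsorted part of the list and swaps it into the first position of that unsorted part, then repeats on the remaining elements until the whole list is sorted. Its time complexity is O(n^2), so it is not an efficient algorithm, and the usual swap-based implementation is not stable. It is still used in some situations because it is simple, sorts in place, and performs at most n-1 swaps.

Basic Idea

The sorted part grows from the front of the list: after pass i, the first i positions hold the i smallest elements in their final order, and the rest of the list is still unsorted.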

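Algorithm Steps

1. Treat the first element of the unsorted part as the current minimum (or maximum).
2. Compare every other unsorted element with the current minimum (or maximum); whenever a smaller (or larger) value is found, update the index of the minimum (or maximum).
3. Swap the minimum (or maximum) with the first element of the unsorted part.
4. Repeat steps 1-3 on the remaining unsorted elements until the whole list is sorted.
5. Return the sorted list.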
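
The steps above can be traced pass by pass. The following is a minimal sketch, assuming a helper named `selection_sort_passes` (a name introduced here for illustration) that works on a copy of the input and records the list after each pass.

```python
def selection_sort_passes(items):
    """Return the state of the list after each pass of selection sort."""
    arr = list(items)  # work on a copy so the caller's list is untouched
    states = []
    for i in range(len(arr) - 1):
        # steps 1-2: find the index of the smallest unsorted element
        min_index = i
        for j in range(i + 1, len(arr)):
            if arr[j] < arr[min_index]:
                min_index = j
        # step 3: swap it into the first unsorted position
        arr[i], arr[min_index] = arr[min_index], arr[i]
        states.append(list(arr))
    return states


for state in selection_sort_passes([64, 25, 12, 22, 11]):
    print(state)
# [11, 25, 12, 22, 64]
# [11, 12, 25, 22, 64]
# [11, 12, 22, 25, 64]
# [11, 12, 22, 25, 64]
```

The last pass changes nothing here because 25 is already the smaller of the two remaining elements.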

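Pseudocode

function selectionSort(arr):
    for i from 0 to length(arr) - 2:           # bounds are inclusive
        minIndex = i
        for j from i + 1 to length(arr) - 1:
            if arr[j] < arr[minIndex]:         # look for the smallest element
                minIndex = j                   # update the index of the smallest element
        swap arr[minIndex] and arr[i]          # move the smallest element to the first unsorted position
    return arr                                 # return the sorted list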

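Advantages and Disadvantages

Advantages

Simple to understand and easy to implement.
Suitable for small datasets.
Performs at most n-1 swaps, which helps when writing an element is expensive.
Sorts in place with O(1) extra memory, so it fits memory-constrained environments.
Its running time does not depend on the initial order of the data, so its cost is predictable.
Acceptable when stability is not required.
Acceptable when running time is not critical.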

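Disadvantages

Inefficient: the time complexity is O(n^2).
Always performs about n^2/2 comparisons, even on already sorted input, so it is unsuitable for large datasets.
Not stable in its usual swap-based form, so it is unsuitable when equal elements must keep their relative order.
Unsuitable when running time is critical.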
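
The stability drawback listed above is easy to miss, so here is a small sketch of it. It assumes a helper `selection_sort_by` with a `key` argument (both introduced here for illustration) and compares its output with Python's built-in `sorted`, which is stable.

```python
def selection_sort_by(items, key):
    """Swap-based selection sort on a copy of items, ordered by key(item)."""
    arr = list(items)
    for i in range(len(arr) - 1):
        min_index = i
        for j in range(i + 1, len(arr)):
            if key(arr[j]) < key(arr[min_index]):
                min_index = j
        if min_index != i:
            # this long-distance swap is what can reorder equal keys
            arr[i], arr[min_index] = arr[min_index], arr[i]
    return arr


def by_score(record):
    return record[0]


records = [(2, "a"), (2, "b"), (1, "c")]
print(selection_sort_by(records, by_score))  # [(1, 'c'), (2, 'b'), (2, 'a')]
print(sorted(records, key=by_score))         # [(1, 'c'), (2, 'a'), (2, 'b')]
```

The first pass swaps `(1, "c")` with `(2, "a")`, which moves `(2, "a")` behind `(2, "b")` even though both have the same key.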

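Application Scenarios

When the dataset is small and running time is not critical, selection sort is a reasonable choice.
When stability is not required and the number of writes (swaps) should be kept low, selection sort is a reasonable choice.
When memory is tightly constrained, selection sort is a reasonable choice.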

Time Complexity

Best case: O(n^2), even when the list is already sorted.
Worst case: O(n^2), for example when the list is in reverse order.
Average case: O(n^2).

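Why is the time complexity O(n^2)?

Time complexity analysis:

The outer loop runs n-1 times.
On pass i (counting from 0) the inner loop runs n-1-i times, so it runs n-1 times on the first pass and once on the last.
The total number of comparisons is (n-1) + (n-2) + ... + 1 = n(n-1)/2, which is O(n^2).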
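
The count can also be checked empirically. The sketch below assumes a helper `count_comparisons` (a name introduced here) that runs selection sort on a copy of the input and counts how often two elements are compared. The count comes out the same for sorted and reversed input.

```python
def count_comparisons(items):
    """Run selection sort on a copy of items and return the number of comparisons."""
    arr = list(items)
    n = len(arr)
    comparisons = 0
    for i in range(n - 1):
        min_index = i
        for j in range(i + 1, n):
            comparisons += 1  # one element-to-element comparison
            if arr[j] < arr[min_index]:
                min_index = j
        arr[i], arr[min_index] = arr[min_index], arr[i]
    return comparisons


n = 10
print(count_comparisons(range(n)))         # already sorted: 45
print(count_comparisons(range(n, 0, -1)))  # reversed: 45
print(n * (n - 1) // 2)                    # closed form: 45
```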

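How can the worst-case time complexity be avoided?

Within selection sort itself, it cannot. The algorithm performs the same n(n-1)/2 comparisons whatever the initial order of the data, so its best, average, and worst cases are all O(n^2). Randomizing which element is examined first, or shuffling the input, does not help, because every pass must still scan the entire unsorted part to be sure it has found the minimum. Getting below O(n^2) requires a different algorithm, such as quick sort, merge sort, or heap sort.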

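Space Complexity

Space complexity: O(1); no extra storage proportional to the input is needed.
In-place: yes.

Why is the space complexity O(1)?

Space complexity analysis:

Only a constant number of temporary variables is needed (the loop indices, the index of the minimum, and the temporary used by the swap).
The total space complexity is therefore O(1).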

Code Implementation

template <typename T>
void selectionSort(vector<T> &arr)
{
    // n is the size of the array
    int n = arr.size();

    // loop through the array n-1 times
    for (int i = 0; i < n - 1; i++)
    {
        // set the minimum index to i
        int min_index = i;

        // loop through the array from i+1 to n
        for (int j = i + 1; j < n; j++)
        {
            // if the current element is less than the element at the minimum index
            if (arr[j] < arr[min_index])
            {
                // set the minimum index to the current index
                min_index = j;
            }
        }

        // if the minimum index is not equal to i
        if (min_index != i)
        {
            // swap the elements at index i and min_index
            swap(arr[i], arr[min_index]);
        }
    }
}

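This code is a C++ implementation of selection sort. Selection sort is a simple sorting algorithm that repeatedly selects the smallest (or largest) of the remaining elements and places it at the end of the sorted part. A detailed explanation follows:

Template parameter:
```
template <typename T>
```
This lets selectionSort accept a vector of any element type that supports operator<, such as integers, floating-point numbers, characters, or user-defined objects.
Function parameter:
```
void selectionSort(vector<T> &arr)
```
The function takes a reference to a vector of T, so the vector passed in is modified in place by the sort.
Getting the array size:
```
int n = arr.size();
```
Here we read the length of the array so that the loops below can use it.
Outer loop:
```
for (int i = 0; i < n - 1; i++)
```
The outer loop runs from the first element to the second-to-last element. The last element needs no pass of its own: once the n-1 smallest elements are in place, the one remaining element must be the largest.
Setting the minimum index:
```
int min_index = i;
```
We start by assuming that the element at the current index i is the smallest element of the unsorted part.
Inner loop:
```
for (int j = i + 1; j < n; j++)
```
The inner loop starts at i+1 and runs to the end of the array. It compares every element of the unsorted part to find the actual minimum.
Finding the minimum:
```
if (arr[j] < arr[min_index])
{
    min_index = j;
}
```
If an element is smaller than the current candidate minimum, we update min_index to point at the new minimum.
Swapping elements:
```
if (min_index != i)
{
    swap(arr[i], arr[min_index]);
}
```
If the minimum of the unsorted part is not already at index i, we swap the two elements. This places the minimum at the end of the sorted part, and the check avoids a pointless self-swap.
Repeat:
The outer loop continues and repeats this process until the array is fully sorted.
The time complexity of selection sort is $O(n^2)$ because of the nested loops. This makes it inefficient for large datasets. It is acceptable for small datasets, but it gains nothing from nearly sorted input, because every pass still scans the whole unsorted part.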

Python version of the C++ function above:

def selection_sort(arr):
    n = len(arr)
    for i in range(n - 1):
        min_index = i
        for j in range(i + 1, n):
            if arr[j] < arr[min_index]:
                min_index = j
        if min_index != i:
            arr[i], arr[min_index] = arr[min_index], arr[i]
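
As a quick usage sketch, the Python function can be checked against the built-in `sorted` on random inputs. The function is repeated so the block runs on its own; the fixed seed and the input sizes are arbitrary choices made here for illustration.

```python
import random


def selection_sort(arr):
    n = len(arr)
    for i in range(n - 1):
        min_index = i
        for j in range(i + 1, n):
            if arr[j] < arr[min_index]:
                min_index = j
        if min_index != i:
            arr[i], arr[min_index] = arr[min_index], arr[i]


random.seed(0)  # fixed seed so the run is repeatable
for _ in range(100):
    size = random.randint(0, 20)
    data = [random.randint(-50, 50) for _ in range(size)]
    expected = sorted(data)
    result = selection_sort(data)
    assert result is None    # the function sorts in place and returns nothing
    assert data == expected  # the caller's list is now sorted
print("all random checks passed")
```

Because the function sorts in place, callers should read the sorted data from the list they passed in rather than from the return value.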

Complete Code

#include <iostream>
#include <vector>
#include <algorithm>
#include <string>
#include <cassert>

using namespace std;

class Person
{
public:
    Person(string name, int age, int score)
    {
        this->name = name;
        this->age = age;
        this->score = score;
    }

    // Ordering operators compare by score only, so that they agree with each other.
    bool operator>(const Person &other) const { return this->score > other.score; }
    bool operator<(const Person &other) const { return this->score < other.score; }
    bool operator<=(const Person &other) const { return this->score <= other.score; }
    bool operator>=(const Person &other) const { return this->score >= other.score; }

    // Equality operators compare the score, age and name of two Person objects.
    bool operator==(const Person &other) const
    {
        return this->score == other.score &&
               this->age == other.age &&
               this->name == other.name;
    }
    bool operator!=(const Person &other) const { return !(*this == other); }

    // Accessors for this class.
    const string &getName() const { return this->name; }
    int getAge() const { return this->age; }
    int getScore() const { return this->score; }

private:
    string name;
    int age;
    int score;
};

template <typename T>
void selectionSort(vector<T> &arr)
{
    // n is the size of the array
    int n = arr.size();

    // loop through the array n-1 times
    for (int i = 0; i < n - 1; i++)
    {
        // set the minimum index to i
        int min_index = i;

        // loop through the array from i+1 to n
        for (int j = i + 1; j < n; j++)
        {
            // if the current element is less than the element at the minimum index
            if (arr[j] < arr[min_index])
            {
                min_index = j;
            }
        }

        // swap only when the minimum is not already at index i
        if (min_index != i)
        {
            swap(arr[i], arr[min_index]);
        }
    }
}

void basicTypeSelectionSortCase()
{
    vector<int> intArr = {5, 2, 8, 1, 3};
    vector<double> doubleArr = {5.5, 2.2, 8.8, 1.1, 3.3};
    vector<char> charArr = {'g', 'e', 'o', 'r', 'g'};

    selectionSort<int>(intArr);
    selectionSort<double>(doubleArr);
    selectionSort<char>(charArr);

    cout << "Sorted int array: ";
    for (int i : intArr)
    {
        cout << i << " ";
    }
    cout << endl;

    cout << "Sorted double array: ";
    for (double i : doubleArr)
    {
        cout << i << " ";
    }
    cout << endl;

    cout << "Sorted char array: ";
    for (char i : charArr)
    {
        cout << i << " ";
    }
    cout << endl;
}

void personSelecttionSortCase()
{
    // First selection sort example with Person objects.
    vector<Person> personArr = {Person("John", 25, 88), Person("Alice", 30, 77), Person("Bob", 20, 66)};
    selectionSort<Person>(personArr);
    cout << "Sorted Person array: " << endl;
    const size_t personSize = personArr.size();
    for (size_t i = 0; i < personSize; i++)
    {
        const auto &person = personArr[i];
        cout << person.getName() << " " << person.getAge() << " " << person.getScore() << endl;
    }
    cout << endl;

    // Second selection sort example with Person objects.
    vector<Person> personArrNew = {Person("Tom", 35, 77), Person("Panda", 22, 88), Person("Alex", 50, 99)};
    const size_t personSizeNew = personArrNew.size();
    selectionSort<Person>(personArrNew);
    cout << "Sorted Person array: " << endl;
    for (size_t i = 0; i < personSizeNew; i++)
    {
        const auto &person = personArrNew[i];
        cout << person.getName() << " " << person.getAge() << " " << person.getScore() << endl;
    }
    cout << endl;
}

void testSelectionSort()
{
    vector<int> arr1 = {64, 25, 12, 22, 11};
    selectionSort(arr1);
    assert(arr1 == vector<int>({11, 12, 22, 25, 64}));

    vector<int> arr2 = {5, 2, 8, 12, 4};
    selectionSort(arr2);
    assert(arr2 == vector<int>({2, 4, 5, 8, 12}));

    vector<int> arr3 = {9, 3, 1, 6, 5, 2};
    selectionSort(arr3);
    assert(arr3 == vector<int>({1, 2, 3, 5, 6, 9}));

    // already sorted input
    vector<int> arr4 = {1, 2, 3, 4, 5, 6};
    selectionSort(arr4);
    assert(arr4 == vector<int>({1, 2, 3, 4, 5, 6}));

    // reversed input
    vector<int> arr5 = {6, 5, 4, 3, 2, 1};
    selectionSort(arr5);
    assert(arr5 == vector<int>({1, 2, 3, 4, 5, 6}));
}

int main()
{
    testSelectionSort();
    basicTypeSelectionSortCase();
    personSelecttionSortCase();
    return 0;
}

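This C++ code defines a Person class with custom comparison operators. The class has three member variables: name, age, and score. It overloads the comparison operators (>, <, ==, !=, <=, and >=); selectionSort uses only operator<, which compares two Person objects by score.

The selectionSort function is a generic sorting function that takes a vector and sorts it with the selection sort algorithm. It walks through the vector, finds the smallest element of the remaining unsorted part, and swaps it with the first element of the unsorted part. This repeats until the whole vector is sorted.

The basicTypeSelectionSortCase function shows how to use selectionSort on different data types (int, double, and char). It creates vectors of integers, double-precision floating-point numbers, and characters, sorts them with selectionSort, and prints the sorted arrays.

The personSelecttionSortCase function shows how to use selectionSort on the Person class. It creates two vectors of Person objects, sorts each with selectionSort, and prints the sorted arrays.

The testSelectionSort function is the test for selectionSort. It builds five integer vectors (including one that is already sorted and one that is reversed), sorts each with selectionSort, and asserts that the result equals the expected sorted vector. If any assertion fails, assert aborts the program.

The main function calls testSelectionSort first, then basicTypeSelectionSortCase and personSelecttionSortCase, which print the sorted int, double, char, and Person arrays. The program prints no explicit success message: if every assertion holds, it simply continues and prints the demo output.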

Here is the Python version of this code:

class Person:
    def __init__(self, name: str, age: int, score: int):
        self.name = name
        self.age = age
        self.score = score

    def __lt__(self, other):
        return self.score < other.score

    def __le__(self, other):
        return self.score <= other.score

    def __eq__(self, other):
        return self.score == other.score and self.age == other.age and self.name == other.name

    def __ne__(self, other):
        return self.score != other.score or self.age != other.age or self.name != other.name

    def __gt__(self, other):
        return self.score > other.score

    def __ge__(self, other):
        return self.score >= other.score

    def get_name(self):
        return self.name

    def get_age(self):
        return self.age

    def get_score(self):
        return self.score


def selection_sort(arr):
    n = len(arr)
    for i in range(n - 1):
        min_index = i
        for j in range(i + 1, n):
            if arr[j] < arr[min_index]:
                min_index = j
        if min_index != i:
            arr[i], arr[min_index] = arr[min_index], arr[i]


def test_selection_sort():
    arr1 = [64, 25, 12, 22, 11]
    selection_sort(arr1)
    assert arr1 == [11, 12, 22, 25, 64]

    arr2 = [5, 2, 8, 12, 4]
    selection_sort(arr2)
    assert arr2 == [2, 4, 5, 8, 12]

    arr3 = [9, 3, 1, 6, 5, 2]
    selection_sort(arr3)
    assert arr3 == [1, 2, 3, 5, 6, 9]

    arr4 = [1, 2, 3, 4, 5, 6]
    selection_sort(arr4)
    assert arr4 == [1, 2, 3, 4, 5, 6]

    arr5 = [6, 5, 4, 3, 2, 1]
    selection_sort(arr5)
    assert arr5 == [1, 2, 3, 4, 5, 6]


def basic_type_selection_sort_case():
    int_arr = [5, 2, 8, 1, 3]
    double_arr = [5.5, 2.2, 8.8, 1.1, 3.3]
    char_arr = ['g', 'e', 'o', 'r', 'g']

    selection_sort(int_arr)
    selection_sort(double_arr)
    selection_sort(char_arr)

    print("Sorted int array:", int_arr)
    print("Sorted double array:", double_arr)
    print("Sorted char array:", char_arr)


def person_selection_sort_case():
    person_arr = [Person("John", 25, 88), Person("Alice", 30, 77), Person("Bob", 20, 66)]
    selection_sort(person_arr)
    print("Sorted Person array:")
    for person in person_arr:
        print(person.get_name(), person.get_age(), person.get_score())

    person_arr_new = [Person("Tom", 35, 77), Person("Panda", 22, 88), Person("Alex", 50, 99)]
    selection_sort(person_arr_new)
    print("Sorted Person array:")
    for person in person_arr_new:
        print(person.get_name(), person.get_age(), person.get_score())


if __name__ == "__main__":
    test_selection_sort()
    basic_type_selection_sort_case()
    person_selection_sort_case()
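
The sort itself only ever evaluates `arr[j] < arr[min_index]`, so a class needs nothing more than `__lt__` to be sortable by it. The sketch below illustrates this with a hypothetical `ScoreOnly` class (not part of the code above) that defines just that one method.

```python
class ScoreOnly:
    """Hypothetical minimal record: only __lt__ is defined."""

    def __init__(self, name, score):
        self.name = name
        self.score = score

    def __lt__(self, other):
        return self.score < other.score


def selection_sort(arr):
    n = len(arr)
    for i in range(n - 1):
        min_index = i
        for j in range(i + 1, n):
            if arr[j] < arr[min_index]:  # the only comparison the sort performs
                min_index = j
        if min_index != i:
            arr[i], arr[min_index] = arr[min_index], arr[i]


people = [ScoreOnly("John", 88), ScoreOnly("Alice", 77), ScoreOnly("Bob", 66)]
selection_sort(people)
print([p.name for p in people])  # ['Bob', 'Alice', 'John']
```

The remaining operators on Person are still useful to other callers, for example `==` when comparing a sorted list against an expected one.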

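Summary

In this document we learned how to sort arrays with the selection sort algorithm. We first defined a selection sort function, then used it to sort arrays of different types, and finally printed the sorted arrays. I hope this document is helpful!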

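Further Reading

Ideas for Optimizing the Time Complexity

Selection sort is a simple, intuitive sorting algorithm. It finds the smallest (or largest) element in the unsorted sequence and places it at the start of the sorted sequence, then keeps finding the smallest (or largest) of the remaining unsorted elements and appends it to the end of the sorted sequence, until all elements are sorted.
The time complexity of selection sort is $O(n^2)$ because it needs $n - 1$ passes, and each pass compares against all remaining unsorted elements, whose number decreases from $n - 1$ to 1. Selection sort is therefore not an efficient algorithm, especially on large datasets.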
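
Adding up the comparisons made in each pass gives the total explicitly. As a short worked sum, writing $C(n)$ (a symbol introduced here) for the number of comparisons on $n$ elements:

$$
C(n) = \sum_{k=1}^{n-1} k = \frac{n(n-1)}{2} = O(n^2)
$$

For example, $n = 5$ gives $4 + 3 + 2 + 1 = 10$ comparisons.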
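Even so, attempts to optimize selection sort mostly aim at reducing the number of comparisons or passes, and its basic time complexity is hard to improve by much. Some of these attempts are listed below:

Jump Selection Sort:
Each pass looks at only some of the unsorted elements (for example, a window whose size changes from round to round) instead of scanning all of them. This can reduce the number of comparisons slightly, but a restricted pass may miss the true minimum, so extra passes are needed and the time complexity remains $O(n^2)$.
Bidirectional Selection Sort:
Also called double-ended selection sort. Each pass finds both the largest and the smallest element and moves them to the two ends of the sequence. This halves the number of passes, but each pass still needs $O(n)$ comparisons, so the overall time complexity is still $O(n^2)$.
A more efficient swap:
In some cases the way two elements are swapped can be tuned, for example with an XOR swap. This does not change the number of comparisons, so its effect on the running time is limited at best.
Parallelization:
One potential optimization is to parallelize the work. The search for the smallest (or largest) element in each pass can be done in parallel, particularly on multi-core processors. The main limits are the complexity of parallel processing itself and the overhead it may introduce.
Overall, these optimizations only reduce constant factors such as the number of comparisons or passes; none of them changes the $O(n^2)$ time complexity. For most practical applications, more efficient sorting algorithms (such as quick sort, merge sort, or heap sort) are usually the better choice.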

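Jump Selection Sort

Jump selection sort is a variant of selection sort that tries to improve efficiency by reducing the number of comparisons. Instead of scanning the entire unsorted part in every pass, it looks for the smallest (or largest) element only within a window of a given jump length after the current position, and moves that element to the current position.
The basic steps of jump selection sort are as follows:

Compute the jump length:
Decide the jump length for each round. This is usually chosen from the characteristics of the dataset. The C++ implementation below starts at half the array size and halves the jump length after each round.
Jump selection:
Starting from the first element, find the smallest (or largest) element among the next jump-length elements and move it to the current position.
Repeat:
Continue with the next position, and then with the next jump length, until all elements have been processed.
The pseudocode for jump selection sort is as follows:

function jump_selection_sort(arr):
    n = length(arr)
    # start with a jump length of half the dataset size (integer division)
    jump_length = n / 2
    while jump_length > 0:
        swapped = false
        for step from 0 to n - jump_length - 1:
            # assume the smallest element is the current element
            min_index = step
            # look only at the next jump_length elements
            for i from step + 1 to step + jump_length:
                # check whether a smaller element has been found
                if arr[i] < arr[min_index]:
                    min_index = i
            # swap the elements
            if min_index != step:
                swap(arr[step], arr[min_index])
                swapped = true
        # a window of 1 only compares neighbours, so that round is
        # repeated until it makes no swap; otherwise halve the jump length
        if jump_length > 1 or not swapped:
            jump_length = jump_length / 2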

For a C++ template implementation, we can use the following code:

// The jump selection sort function.
// Assumes the includes and the Person class from the complete code above.
template <typename T>
void jumpSelectionSort(vector<T> &arr)
{
    // n is the size of the array
    int n = arr.size();
    // step is the jump size
    int step = n / 2;

    while (step > 0)
    {
        bool swapped = false;

        // for each position whose window [i, i + step] fits in the array
        for (int i = 0; i + step < n; i++)
        {
            // min_index is the index of the minimum element in [i, i + step]
            int min_index = i;
            for (int j = i + 1; j <= i + step && j < n; j++)
            {
                if (arr[j] < arr[min_index])
                {
                    min_index = j;
                }
            }

            // swap only when the minimum is not already at index i
            if (min_index != i)
            {
                swap(arr[i], arr[min_index]);
                swapped = true;
            }
        }

        // Windowed passes alone do not guarantee a sorted result
        // (for example {5, 4, 3, 2, 1} would end as {2, 1, 3, 4, 5}),
        // so the final step == 1 pass is repeated until it makes no swap.
        if (step > 1 || !swapped)
        {
            // set step to half of the previous step
            step = step / 2;
        }
    }
}

template <typename T>
void testJumpSelectionSort(vector<T> &testVec)
{
    // std::sort provides the expected result
    auto sortedArr = testVec;
    sort(sortedArr.begin(), sortedArr.end());

    jumpSelectionSort<T>(testVec);

    if (testVec == sortedArr)
    {
        cout << "Test passed for the given test case!" << endl;
    }
    else
    {
        cout << "Test FAILED for the given test case!" << endl;
    }
}

void jumpSelectionSortCase()
{
    vector<int> arr{3, 7, 9, 1, 2, 6, 8, 4, 5};
    testJumpSelectionSort<int>(arr); // Test case 1
    // ... other test cases ...

    vector<double> dArr{3.0, 7.0, 9.0, 1.0, 2.0, 6.0, 8.0, 4.0, 5.0};
    testJumpSelectionSort<double>(dArr); // Test case 2

    vector<float> fArr{3.0f, 7.0f, 9.0f, 1.0f, 2.0f, 6.0f, 8.0f, 4.0f, 5.0f};
    testJumpSelectionSort<float>(fArr); // Test case 3

    vector<char> cArr{'c', 'a', 'b'};
    testJumpSelectionSort<char>(cArr); // Test case 4

    vector<Person> personArr{Person("Alice", 25, 80), Person("Bob", 20, 65), Person("Charlie", 22, 77)};
    testJumpSelectionSort<Person>(personArr); // Test case 5
}

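Note that any performance gain from jump selection sort depends on how the jump length is chosen. In some cases it may be faster than basic $O(n^2)$ selection sort, but in general it is still slower than $O(n \log n)$ sorting algorithms such as quick sort, merge sort, or heap sort.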
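
The jump length matters for correctness as well as speed. The sketch below assumes a hypothetical Python helper, `windowed_passes`, that is not part of the code above. It runs the halving-window passes either once per jump length, or with the last window-of-one pass repeated until it makes no swap. Only the second mode is guaranteed to finish with a sorted list, because a pass that compares every adjacent pair and swaps nothing proves the list is in order.

```python
def windowed_passes(items, repeat_last_pass):
    """Windowed selection passes on a copy of items; the window halves each round."""
    arr = list(items)
    n = len(arr)
    step = n // 2
    while step > 0:
        swapped = False
        for i in range(n - step):
            # index of the smallest element in arr[i .. i + step]
            min_index = i
            for j in range(i + 1, i + step + 1):
                if arr[j] < arr[min_index]:
                    min_index = j
            if min_index != i:
                arr[i], arr[min_index] = arr[min_index], arr[i]
                swapped = True
        # optionally stay on step == 1 until a pass makes no swap
        if step > 1 or not (repeat_last_pass and swapped):
            step //= 2
    return arr


print(windowed_passes([5, 4, 3, 2, 1], repeat_last_pass=False))  # [2, 1, 3, 4, 5]
print(windowed_passes([5, 4, 3, 2, 1], repeat_last_pass=True))   # [1, 2, 3, 4, 5]
```

With the repeated final pass the worst case is still $O(n^2)$, since that phase behaves like repeated adjacent-swap passes.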

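Bidirectional Selection Sort

Bidirectional selection sort is a variant of selection sort that finds both the maximum and the minimum in each pass and moves them to their final positions at the two ends of the unsorted range. Each pass therefore places two elements, which halves the number of passes and can make it slightly faster than plain selection sort in some cases.

The basic idea of bidirectional selection sort is:

Find the smallest element in the unsorted part and move it to the front of the unsorted part, just after the sorted prefix.
Find the largest element in the unsorted part and move it to the back of the unsorted part, just before the sorted suffix.
Repeat these steps, shrinking the unsorted range from both ends each time, until the whole array is sorted.

The pseudocode is as follows:

function bidirectionalSelectionSort(arr):
    n = length(arr)
    for i from 0 to n/2 - 1:                   # integer division, bounds inclusive
        # find the index of the smallest element in [i, n-i-1]
        min_index = i
        for j from i+1 to n-i-1:
            if arr[j] < arr[min_index]:
                min_index = j
        # swap the smallest element into position i
        swap(arr[i], arr[min_index])
        # the minimum now sits at i, so the maximum is searched after
        # the swap and no index adjustment is needed
        max_index = i
        # find the index of the largest element in [i, n-i-1]
        for j from i+1 to n-i-1:
            if arr[j] > arr[max_index]:
                max_index = j
        # swap the largest element into position n-i-1
        swap(arr[n-i-1], arr[max_index])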

Below is the C++ template implementation:

template <typename T>
void bidirectionalSelectionSort(vector<T> &arr)
{
    // n is the size of the array
    int n = arr.size();

    // loop from 0 to n/2
    for (int i = 0; i < n / 2; ++i)
    {
        // set the min and max index to i
        int min_index = i, max_index = i;

        // loop from i+1 to n-i-1
        for (int j = i + 1; j <= n - i - 1; ++j)
        {
            // if arr[j] is less than arr[min_index]
            if (arr[j] < arr[min_index])
            {
                // set min_index to j
                min_index = j;
            }
            // if arr[j] is greater than arr[max_index]
            if (arr[j] > arr[max_index])
            {
                // set max_index to j
                max_index = j;
            }
        }

        // if min_index is not equal to i
        if (min_index != i)
        {
            // swap arr[i] and arr[min_index]
            swap(arr[i], arr[min_index]);
        }

        // if max_index is equal to i, the swap above moved the maximum to min_index
        if (max_index == i)
        {
            // set max_index to min_index
            max_index = min_index;
        }

        // if max_index is not equal to n-i-1
        if (max_index != n - i - 1)
        {
            // swap arr[n-i-1] and arr[max_index]
            swap(arr[n - i - 1], arr[max_index]);
        }
    }
}

template <typename T>
void printResult(vector<T> &data)
{
    bidirectionalSelectionSort(data);
    cout << "Sorted array: \n";
    for (T value : data)
    {
        cout << value << " ";
    }
    cout << endl;
}

void bidirectionalSelectionSortCase()
{
    // Sort a double array
    vector<double> doubleData = {64.0, 34.0, 25.0, 12.0, 22.0, 11.0, 90.0};
    printResult(doubleData);

    // Sort an int array
    vector<int> intData = {64, 34, 25, 12, 22, 11, 90};
    printResult(intData);

    // Sort a char array
    vector<char> charData = {'d', 'b', 'a', 'c'};
    printResult(charData);
}
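
Following the pattern used earlier in this document, a Python counterpart of the C++ function might look like the sketch below. The function name `bidirectional_selection_sort` and the local variable `last` are introduced here for illustration. The detail to watch is the `max_index == i` adjustment: if the maximum started at position i, the first swap has just moved it to `min_index`.

```python
def bidirectional_selection_sort(arr):
    """Sort arr in place, placing one minimum and one maximum per pass."""
    n = len(arr)
    for i in range(n // 2):
        last = n - i - 1  # last index of the unsorted range
        min_index = max_index = i
        for j in range(i + 1, last + 1):
            if arr[j] < arr[min_index]:
                min_index = j
            if arr[j] > arr[max_index]:
                max_index = j
        if min_index != i:
            arr[i], arr[min_index] = arr[min_index], arr[i]
        # the maximum was at i and has just been moved to min_index
        if max_index == i:
            max_index = min_index
        if max_index != last:
            arr[last], arr[max_index] = arr[max_index], arr[last]


data = [64, 34, 25, 12, 22, 11, 90]
bidirectional_selection_sort(data)
print(data)  # [11, 12, 22, 25, 34, 64, 90]
```

When the length is odd, the middle element needs no pass of its own, because everything smaller and larger has already been placed around it.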