Algorithm Exploration: Sorting Without Comparison (3)
- Bucket sorting
In the first two issues of algorithm exploration, Zhao codesmith introduced two linear sorting algorithms - counting sorting and cardinal sorting (click to jump to view the corresponding article), which both used a concept - "bucket".
Counting sorted "buckets" are used to hold the same element, indicating the number of times it appears; Cardinal sorted "buckets" are used to hold elements with the same value on a certain digit. Both of these "buckets" are special buckets, specialized versions of the theory based on bucket ordering. Therefore, in this issue, Zhao Coder would like to introduce the ordinary barrel sorting.
The basic idea of bucket sorting
Divide the data to be sorted into several ordered buckets, and the data in each bucket is sorted separately. After the sorting is completed, the data in each bucket is taken out in order of the bucket, and the resulting sequence is orderly.
Basic steps for bucket sorting
in arrays
a[10]={35,22,26,8,1,36,12,19,47,11}
1. Determine the bucket: first find out the size of the array (10), the maximum and minimum values of the array (47 and 1), determine the number of buckets and the bucket strategy (divided into 5 buckets, respectively I 0~9, 10~19, 20~29, 30~39, 40~49)
2. Initialize the bucket: Generally speaking, create a two-dimensional array of vector as a bucket, where vector[i] represents the ordinal number of the bucket, and vector[i][j] represents the elements allocated in the bucket;
3. Assign elements: Iterate through the array to be sorted and allocate data to the corresponding bucket according to the policy. Among them, the strategy can be the number of buckets and the range of values for values.
Before sorting | Barrel 1 | Barrel 2 | Barrel 3 | Barrel 4 | Barrel 5 |
range | 0~9 | 10~19 | 20~30 | 30~40 | 40~50 |
value | 8,1 | 12,19,11 | 22,26 | 35,36 | 47 |
4. Sorting in the bucket: Generally speaking, there are not many elements in the bucket, a variety of sorting algorithms can be used, generally speaking, sort() is used for quick sorting, if there are more elements in the bucket, other efficient sorting algorithms such as counting sorting can be used;
Before sorting | Barrel 1 | Barrel 2 | Barrel 3 | Barrel 4 | Barrel 5 |
range | 0~9 | 10~19 | 20~30 | 30~40 | 40~50 |
value | 1,8 | 11,12,19 | 22,26 | 35,36 | 47 |
5. Merge buckets: Merge all elements in the order of the buckets, and the merged sequence is the sorted sequence.
Post-merger:
a[10]={1,8,11,12,19,22,26,35,36,47};
Spatiotemporal complexity analysis of bucket sorting
Let the length of the original array be n, the number of buckets be k, and each bucket has an average of m elements, where n=m*k.
Firstly, the array elements are traversed to find the maximum and minimum values, and the strategy is designed, and the time complexity is O(n).
Secondly, the array is traversed again according to the strategy, and each value is assigned to the corresponding bucket, and the time complexity is also O(n).
Then, the elements in the bucket are quickly sorted, and the average time complexity is O(mlogm).
Finally, all the elements are merged, and the time complexity is still O(n);
Therefore, the total time complexity of bucket sorting is O(n+mlogm), and when the number of elements in the bucket is much less than the length of the array, the time complexity of bucket sorting is gradually O(n).
The space complexity of bucket sorting is O(n+m*k), and when m is small enough, the space complexity is O(n+k).
summary
As the parent algorithm of count sorting and cardinality sorting, bucket sorting overcomes the drawbacks of sorting only integers, and can also have good sorting performance for floating-point numbers and simple strings (mainly because the sorting algorithm in the bucket can be selected arbitrarily), which is suitable for the situation that the data volume is large but the distribution is relatively uniform, the data range is known, and it is convenient to allocate to buckets.
However, since bucket sorting is essentially distributed computing of data, giving full play to the advantages of parallel processing, if the data cannot be evenly distributed, the efficiency of bucket sorting cannot be reflected.
【Code Display】
#include<iostream>
#include<ctime>
#include<vector>
#include<algorithm>
using namespace std;
#define N 20 //以20个元素的数组为例
int a[N];
vector<int> tong[N];//最多N个桶
int main(){
srand(time(0));
cout<<"排序前:";
for(int i=0;i<N;i++){
a[i]=rand()%100;//随机生成20个100以内的数字
cout<<a[i]<<" ";
}
cout<<endl;
//找最大值,进行数据准备
int maxx=0;
for(int i=0;i<N;i++){
maxx=max(maxx,a[i]);
}
//下面开始桶排序
//第一步:分配
for(int i=0;i<N;i++){
tong[a[i]/10].push_back(a[i]);//规则:按十位数分桶
}
//展示每个桶的元素
for(int i=0;i<=maxx/10;i++){
cout<<"第"<<i<<"个桶里的元素有:" ;
for(auto j: tong[i]){
cout<<j<<" ";
}
cout<<endl;
}
//第二步:桶内排序
for(int i=0;i<=maxx/10;i++){
sort(tong[i].begin(),tong[i].end());
}
//第三步,还原
int num=0;
for(int i=0;i<=maxx/10;i++){
for(auto j: tong[i]){
a[num++]=j;
}
}
cout<<"排序结果:";
for(int i=0;i<N;i++){
cout<<a[i]<<" ";
}
return 0;
}
Run the results