# Bucket Sort

## Summary in One Sentence

The core idea of the bucket sort algorithm consists of three steps:

1. Distribute the elements of the array to be sorted into several "buckets" using a mapping function.
2. Sort the elements within each bucket.
3. Merge the sorted buckets to obtain the final sorted result.
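The three steps above can be sketched in code. This is a minimal sketch, not a definitive implementation: the bucket count `k`, the linear mapping function, and the helper names `bucketSort` and `insertSort` are illustrative choices of mine, and I assume the input is a plain array of numbers.

```javascript
// Minimal bucket sort sketch (assumes an array of numbers).
function bucketSort(nums, k = 5) {
  if (nums.length === 0) return nums;
  const min = Math.min(...nums);
  const max = Math.max(...nums);

  // Step 1: distribute elements into k buckets. The mapping function
  // sends smaller values to lower-index buckets (the +1 keeps the
  // maximum value inside the last bucket).
  const buckets = Array.from({ length: k }, () => []);
  for (const num of nums) {
    const index = Math.floor(((num - min) / (max - min + 1)) * k);
    buckets[index].push(num);
  }

  // Step 2: sort each bucket individually (insertion sort here).
  for (const curBucket of buckets) {
    insertSort(curBucket);
  }

  // Step 3: merge. Because the mapping preserves order between buckets,
  // copying the buckets back in index order yields the sorted array.
  let index = 0;
  for (const bucket of buckets) {
    for (const num of bucket) {
      nums[index++] = num;
    }
  }
  return nums;
}

// Standard insertion sort on a single bucket.
function insertSort(arr) {
  for (let i = 1; i < arr.length; i++) {
    const cur = arr[i];
    let j = i - 1;
    while (j >= 0 && arr[j] > cur) {
      arr[j + 1] = arr[j];
      j--;
    }
    arr[j + 1] = cur;
  }
}
```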

Open the visualization panel below and repeatedly click the line of code `buckets[index].push(num)` to see elements being distributed into different buckets. Repeatedly click the line `insertSort(curBucket)` to see each bucket being sorted, and repeatedly click the line `nums[index++] = num` to see the sorted buckets being merged.

The bucket sort algorithm might not be very common, but I personally find its concept very interesting because you can see traces of both Merge Sort and Counting Sort in its idea.

If you have studied all the previous algorithms in sequence, you will appreciate the intricate connections between them. Consider how generations of computer scientists have demonstrated their ingenuity for the single need of "sorting." Their brilliant ideas are endless, and as their successors, why not savor these concepts?

## Key Points of Bucket Sort

To get straight to the point, the idea behind bucket sort is quite simple. You first distribute the elements of the array to be sorted into several buckets, sort the elements within each bucket individually, and finally merge the elements from these buckets back together in order.

Does this approach sound a bit like Merge Sort? Both divide a large array into smaller arrays for sorting, and then merge them back together. However, bucket sort is more flexible, with each of its three core steps open to variation:

1. How do you allocate the elements into buckets? You need to decide the number of buckets and provide a mapping function.

2. How do you sort the elements within each bucket? In theory, any sorting algorithm can be used; you can even mimic the Merge Sort approach and run bucket sort recursively on each bucket.

3. How do you merge the sorted buckets? Later sections will discuss the general algorithm for merging multiple sorted lists/arrays, but that algorithm relies on a Binary Heap and runs in $O(n \log k)$ time, which is clearly not suitable here: if we are going to use a binary heap anyway, we might as well use Heap Sort directly. Therefore, the merge step should take no more than $O(n)$ time, and achieving that requires designing the element distribution mapping function appropriately.
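One common way to get an $O(n)$ merge is to make the mapping function monotonic: if $a \leq b$, then $a$'s bucket index is no greater than $b$'s, so every element in bucket $i$ is no larger than every element in bucket $i+1$, and merging reduces to concatenation. The linear scaling below is one illustrative way to build such a function (the name `bucketIndex` is mine, not from this article):

```javascript
// Monotonic mapping of a value onto one of k bucket indices:
// smaller values never land in a higher-index bucket.
function bucketIndex(num, min, max, k) {
  // Linearly scale [min, max] onto indices 0..k-1.
  // The +1 keeps the maximum value inside the last bucket.
  return Math.floor(((num - min) / (max - min + 1)) * k);
}
```

Because this function is non-decreasing in `num`, walking the buckets in index order and copying out their (already sorted) contents produces the fully sorted array in a single $O(n)$ pass.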

Regarding these three questions, I would like to explore the second one first. Have you ever wondered **why we divide the array into several buckets and sort each bucket separately? Is that really any different from sorting the entire array directly?**

The answer is, if we temporarily disregard the complexity of merging sorted buckets, sorting them separately is indeed more efficient than sorting the entire array at once.

### Separate Sorting vs. Overall Sorting

Taking the simplest selection sort as an example, if I directly apply selection sort to an array of size $n$, the time complexity is $O(n^2)$.

Suppose we divide the array to be sorted into $k$ buckets and apply selection sort to each bucket. Will the total time complexity be greater or less than $O(n^2)$?

This is essentially a simple math problem. Assume there is a positive integer $n$, which can be decomposed into $n = n_1 + n_2 + \cdots + n_k$. Which is greater, $n^2$ or $n_1^2 + n_2^2 + \cdots + n_k^2$?

There are multiple ways to derive the answer. Let's consider a geometric approach:

Think of $n^2$ as the area of a square with side length $n$, and $n_1^2 + n_2^2 + \cdots + n_k^2$ as the total area of several smaller squares lined up along one side of this large square. Intuitively, the combined area of the smaller squares cannot exceed that of the large square, so clearly $n^2 \geq n_1^2 + n_2^2 + \cdots + n_k^2$.
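The same inequality can also be checked algebraically: expanding the squared sum yields every $n_i^2$ plus a collection of cross terms,

$$
n^2 = (n_1 + n_2 + \cdots + n_k)^2 = \sum_{i=1}^{k} n_i^2 + 2\sum_{i < j} n_i n_j \geq \sum_{i=1}^{k} n_i^2
$$

and since every $n_i \geq 0$, the cross terms $2 n_i n_j$ are non-negative, so dropping them can only decrease the value.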

**From this, we understand that the total time complexity of separate sorting is less than that of overall sorting**, which is the mathematical foundation ensuring the efficiency of the bucket sort algorithm.

Building on this square area abstraction, we can go further. What happens when $k$ becomes infinitely large and $n_1, n_2, \cdots, n_k$ become infinitely small?

The smaller squares lined up along the side shrink until they effectively collapse onto the side itself, so their total area approaches the side's length. In other words, the value of $n_1^2 + n_2^2 + \cdots + n_k^2$ approaches $n$.

**Thus, if bucket sort allocates elements into as many buckets as possible (maximizing $k$), with at most one element per bucket, it effectively becomes counting sort, reducing its complexity to $O(n)$.**

Even if each bucket contains more than one element, as long as the elements actually spread across more than one bucket, the time complexity of bucket sort will be less than $O(n^2)$; the larger $k$ is, the closer the complexity gets to $O(n)$.

Conversely, if $k = 1$, bucket sort degenerates entirely into selection sort, with a time complexity of $O(n^2)$.

Of course, the above analysis does not consider the time complexity of merging sorted buckets, but as long as merging can be done in $O(n)$ time, the overall time complexity of bucket sort remains less than $O(n^2)$.

Next, I will discuss how to allocate elements into buckets and how to merge sorted buckets, followed by several code implementations of bucket sort.