Heap Sort and Binary Heap
Prerequisite Knowledge
Before reading this article, you should first learn:
Summary in One Sentence
Heap sort is a sorting algorithm derived from the binary heap structure with a complexity of . It mainly involves two steps: first, creating a binary heap in-place on the array to be sorted (Heapify), and then sorting in-place (Sort).
You can open the visualization panel below and click to jump to the let tree = ...
part of the code to see the array abstracted as a complete binary tree. Continuously clicking the Heap.swim
part of the code will show the heap construction process, and clicking the Heap.sink
part will show the sorting process.
To learn the heap sort algorithm, you must understand the principles of the binary heap structure; otherwise, you might not comprehend the sorting process.
In the previous article Binary Heap Basics, we introduced the binary heap structure, and Binary Heap in Implementing Priority Queue used the binary heap structure to implement a SimpleMinPQ
priority queue, where elements inserted into the queue are removed in ascending order.
This article will introduce the heap sort algorithm, a new sorting algorithm derived from the properties of binary heaps, which is both elegant and efficient.
First, let me reiterate some key principles of implementing a priority queue with a binary heap. If you do not understand any part, be sure to review the previous articles; otherwise, you will not grasp heap sort.
A binary heap (priority queue) is implemented using an array but logically represents a complete binary tree, primarily maintained by the
swim
andsink
methods.Priority queues can be divided into min-heaps and max-heaps. A min-heap maintains the smallest element at the top, while a max-heap maintains the largest element at the top.
When inserting an element into a priority queue, the element is appended to the bottom of the binary heap, and the
swim
method is called to float the element to the correct position, with a time complexity of .When deleting the top element of the priority queue, the last element at the bottom of the heap is swapped to the top as the new top element, and the
sink
method is called to sink this new top element to the correct position, with a time complexity of .
The simplest idea for a heap sort algorithm is to directly use a priority queue: insert all elements into the priority queue and then extract them, completing the sort.
// sort the array from smallest to largest using a priority queue
void sort(int[] nums) {
// create a min heap that sorts elements from smallest to largest
SimpleMinPQ pq = new SimpleMinPQ(nums.length);
// first insert all elements into the priority queue
for (int num : nums) {
// push operation automatically builds a binary heap, with time complexity O(logN)
pq.push(num);
}
// then extract all elements, resulting in a sorted order from smallest to largest
for (int i = 0; i < nums.length; i++) {
// pop operation removes the smallest element from the top of the binary heap, with time complexity O(logN)
nums[i] = pq.pop();
}
}
Since the push
and pop
methods of a priority queue have a complexity of , the overall time complexity of sorting is , where N
is the length of the input array.
This approach can yield the correct sorting result, but the space complexity is , as the priority queue we create is an additional data structure that uses an array to store elements.
Therefore, the problem heap sort aims to solve is to avoid using extra auxiliary space and perform sink
and swim
operations directly on the original array, completing the sort in time.
Two Key Steps of Heap Sort
In-place Heap Construction (Heapify): Directly transform the array to be sorted into a binary heap in place.
Sorting: Continuously extract elements from the heap to obtain a sorted result.
Take a few minutes to think about it yourself. Comparing the process of adding and removing elements in a priority queue, it is indeed not difficult to implement these two steps in-place using swim
and sink
methods, and you should be able to figure it out independently.
Before explaining the heap sort code implementation in detail, I'll first present the swim
and sink
methods of a binary heap along with the supporting utility functions, because I will guide you through optimizing the heap sort code step by step later, and I won’t repeat the implementation of these functions.
These functions are extracted from the priority queue implementation in Binary Heap Implementation of Priority Queue, with the array passed in as a function parameter, and the other logic remains unchanged: