Heapsort - illustration | Algorithm | Pseudocode
Whenever we need to arrange data in a specific order how you sort the data provided is of the utmost importance. However, in C, C++ we use many different types of Sorting Techniques to do this job as we have already discussed some of these techniques like Bucket Sort, Shell Sort, Quick Sort, Merge Sort, Selection Sort, Insertion Sort, Bubble Sort, and Tree Sort, So today we are going to understand one of the popular sorting techniques i.e " Heapsort ".
So let's get started...
illustration
Heapsort is one of the best sorting methods being in-place and with no quadratic worst-case running time. It is a comparison-based sorting technique based on a Binary Heap data structure.
At this time you might be wondering well ok Heapsort is a comparison based sorting technique but " What is Heap ", " How much better/ effective it is actually than other sorting techniques " "what about Time Complexity " "What about Space Complexity " " What about is's algorithms " but after reading this post you will get all the answers for your questions and will be able to answer them with full confidence and required knowledge.
So Let's Get started with the first question you might get in your mind i.e
What Do you mean by a Heap?
Well, the right answer for this question is that It is a data structure that is a complete binary tree, and all the levels of which are completely filled except the last level. However, the heap has some order of values to be maintained between parents and their children.
There are only 2 variations of the heap that are possible i.e
1. MIN HEAP : Here the value of the parent is always less than the value of its children hence the root will be the minimum in the entire heap.
2. MAX HEAP : Here the value of the parent is always more than the value of its children hence the root will be the maximum in the entire heap.
So now when we know what a Heap means let's quickly go to the next question i.e :
What is Heapsort ?
Being invented by John Williams.
Heapsort is one of the efficient sorting algorithms based on the heap data structure. It is a sorting technique of data structure that uses the approach quite opposite to that of the Selection Sort.
As we saw earlier as the selection sort finds the smallest element among n elements then the smallest element among n-1 elements and so on until the end of the array, heapsort finds the largest element and puts it at end of the array, the second largest element is found and this process is repeated for all other elements present in the given array. So for the ith index, the left child will become the element present at the 2*i+1 index in the array while the right child will become the element present at the 2*i+2 index in the array while the Parent of the ith index will be element present at (i-1)/2 index in the array.
However, there are 2 major operations that are responsible for maintaining the heap:
Heapify
- This operation sets the order of values between the parent and its child.
- If we are dealing with the max heap, it will find the index having max value among the node and its children
- If the index holding max value is not the parent, it will wap the parent with the child having the max value
Algorithm
heapify(arr , i)
leftChild = arr [2*0 + 1];
rightChild = arr [2*0 + 2];
maxIndex = max( arr[i], leftChild, rightChild)
if(i != maxIndex)
swap(arr[i], arr[maxIndex])
Building MAXHEAP or MINHEAP
- This is responsible for setting parent-child order in the entire heap
- This is the first function which is called to build the heap.
- It starts from the last parent in the tree because that is the first instance where the order may get disturbed.
- So it iterates from i=n/2-1 to 0 and call heapify on every parent.
Algorithm
buildMaxHeap(arr)
for(int i = n / 2 - 1; i >= 0; i--)
heapify(arr, i);
In Heapsort, we deal with Maxheap but as we know root will have the max value possible in the heap, we will swap it with the last element in the heap and reduce the heap size by 1 while the heap size is variable that maintains the total number of elements to be considered in the heap after which we call downward heapify on the root while it starts from setting the relationship between the root n d its children. If either of the children was maximum then heapify is called on it.
Which is better Mergesort or Heapsort ?
To better answer this question we should start by understanding it based on different parameters like In terms of algorithm, time and space complexity, In terms of speed, in-place, and stability.
In terms of the algorithm :
In merge sort, in every iteration, we divide the problem into 2 almost equal subproblems. Once no more dividing is possible, we start merging upwards
In heapsort, there are 2 major operations that basically aids heapsort that is heapify and build a heap.
In terms of time and space complexity :
Heapsort makes all the changes in the input array itself hence space requirement is constant here.
In terms of speed :
In terms of in-place :
In terms of stability :
Heapsort Pseudo-code
- First, call build max heap to set the heap initially
- Once the heap is created, take the root and wap it will the last element of the heap
- Reduce the size of the heap
- Call max heapify of index 0 i.e, the new root of the heap
Heapsort(A as array)
BuildHeap(A)
for i = n to 1
swap(A[1], A[i])
n = n - 1
Heapify(A, 1)
BuildHeap(A as array)
n = elements_in(A)
for i = floor(n/2) to 1
Heapify(A,i,n)
Heapify(A as array, i as int, n as int)
left = 2i
right = 2i+1
if (left <= n) and (A[left] > A[i])
max = left
else
max = i
if (right<=n) and (A[right] > A[max])
max = right
if (max != i)
swap(A[i], A[max])
Heapify(A, max)
Heapsort Algorithm
Heapsort(arr) buildMaxHeap(arr) for (int i = n - 1; i >= 0; i--) { swap(&arr[0], &arr[i]); heapsize--; maxHeapify(arr,0); }
Heapsort Algorithm Dry Run
0 | 1 | 2 | 3 | 4 |
23 | 10 | 16 | 11 | 20 |
0 | 1 | 2 | 3 | 4 |
23 | 20 | 16 | 11 | 10 |
For i=4
0 | 1 | 2 | 3 | 4 |
20 | 11 | 16 | 10 | 23 |
After i=3
0 | 1 | 2 | 3 | 4 |
16 | 11 | 10 | 20 | 23 |
After i=2
0 | 1 | 2 | 3 | 4 |
11 | 10 | 16 | 20 | 23 |
After i=1
0 | 1 | 2 | 3 | 4 |
10 | 11 | 16 | 20 | 23 |
After i=0
0 | 1 | 2 | 3 | 4 |
10 | 11 | 16 | 20 | 23 |
Heapsort Time complexity
- Build max heap takes O(n/2) time
- We are calling for heapify inside the for loop, which may take the height of the heap in the worst case for all comparisons. Therefore time complexity will become O(nlogn).
- Best Time Complexity: O(nlogn)
- Average Time Complexity: O(nlogn)
- Worst Time Complexity: O(nlogn)
Heapsort Space Complexity
- No auxiliary space is required in Heapsort implementation that is we are not using any arrays, linked list, stack, queue, etc to store our elements
- Hence space complexity is: O(1)
Heapsort Example
Input : 11 34 9 5 16 10 K = 3 Output: 11
Heapsort in C
#include <stdio.h>
void swap(int *a, int *b) {
int tmp = *a;
*a = *b;
*b = tmp;
}
void heapify(int arr[], int n, int i) {
int max = i;
int leftChild = 2 * i + 1;
int rightChild = 2 * i + 2;
if (leftChild < n && arr[leftChild] > arr[max])
max = leftChild;
if (rightChild < n && arr[rightChild] > arr[max])
max = rightChild;
if (max != i) {
swap(&arr[i], &arr[max]);
heapify(arr, n, max);
}
}
void heapSort(int arr[], int n) {
for (int i = n / 2 - 1; i >= 0; i--)
heapify(arr, n, i);
for (int i = n - 1; i >= 0; i--) {
swap(&arr[0], &arr[i]);
heapify(arr, i, 0);
}
}
void display(int arr[], int n) {
for (int i = 0; i < n; ++i)
printf("%d ", arr[i]);
printf("\n");
}
int main() {
int arr[] = {11, 34, 9, 5, 16, 10};
int n = sizeof(arr) / sizeof(arr[0]);
printf("Original array:\n");
display(arr, n);
heapSort(arr, n);
printf("Sorted array:\n");
display(arr, n);
}
Output:
Output of the program:
Original array:
11 34 9 5 16 10
Sorted array:
5 9 10 11 16 34
Heapsort in C++
#include <iostream> using namespace std; void heapify(int arr[], int n, int i) { int max = i; int leftChild = 2 * i + 1; int rightChild = 2 * i + 2; if (leftChild < n && arr[leftChild] > arr[max]) max = leftChild; if (rightChild < n && arr[rightChild] > arr[max]) max = rightChild; if (max != i) { swap(arr[i], arr[max]); heapify(arr, n, max); } } void heapSort(int arr[], int n) { for (int i = n / 2 - 1; i >= 0; i--) heapify(arr, n, i); for (int i = n - 1; i >= 0; i--) { swap(arr[0], arr[i]); heapify(arr, i, 0); } } void display(int arr[], int n) { for (int i = 0; i < n; ++i) cout << arr[i] << " "; cout << "\n"; } int main() { int arr[] = {11, 34, 9, 5, 16, 10}; int n = sizeof(arr) / sizeof(arr[0]); cout << "Original array:\n"; display(arr, n); heapSort(arr, n); cout << "Sorted array:\n"; display(arr, n); }
Output:
Output of the program:
Original array:
11 34 9 5 16 10
Sorted array:
5 9 10 11 16 34
Heapsort in JAVA
public class HeapSort { public void sort(int arr[]) { int n = arr.length; for (int i = n / 2 - 1; i >= 0; i--) { heapify(arr, n, i); } for (int i = n - 1; i >= 0; i--) { int tmp = arr[0]; arr[0] = arr[i]; arr[i] = tmp; heapify(arr, i, 0); } } void heapify(int arr[], int n, int i) { int max = i; int leftChild = 2 * i + 1; int rightChild = 2 * i + 2; if (leftChild < n && arr[leftChild] > arr[max]) max = leftChild; if (rightChild < n && arr[rightChild] > arr[max]) max = rightChild; if (max != i) { int swap = arr[i]; arr[i] = arr[max]; arr[max] = swap; heapify(arr, n, max); } } static void display(int arr[]) { int n = arr.length; for (int i = 0; i < n; ++i) System.out.print(arr[i] + " "); System.out.println(); } public static void main(String args[]) { int arr[] = {11, 34, 9, 5, 16, 10}; HeapSort hs = new HeapSort(); System.out.println("Original array:"); display(arr); hs.sort(arr); System.out.println("Sorted array:"); display(arr); } }
Output:
Output of the program:
Original array:
11 34 9 5 16 10
Sorted array:
5 9 10 11 16 34
Heapsort in Python
def heapify(arr, n, i):
largest = i
l = 2 * i + 1
r = 2 * i + 2
if l < n and arr[i] < arr[l]:
largest = l
if r < n and arr[largest] < arr[r]:
largest = r
if largest != i:
arr[i], arr[largest] = arr[largest], arr[i]
heapify(arr, n, largest)
def heapSort(arr):
n = len(arr)
for i in range(n //2, -1, -1):
heapify(arr, n, i)
for i in range(n - 1, 0, -1):
arr[i], arr[0] = arr[0], arr[i]
heapify(arr, i, 0)
arr = [11, 34, 9, 5, 16, 10]
n = len(arr)
print("Original array:")
for i in range(n):
print("%d " % arr[i], end = '')
heapSort(arr)
print("Sorted array:")
for i in range(n):
print("%d " % arr[i], end = '')
Output:
Output of the program:
Original array:
11 34 9 5 16 10
Sorted array:
5 9 10 11 16 34
Conclusion
- Heapsort is a comparison-based sorting technique using a binary heap.
- It can be termed as an improvement over selection sort since both these sorting techniques work with similar logic of finding the largest or smallest element in the array repeatedly and then placing it into the sorted array.
- Heapsort makes use of max-heap or min-heap to sort the array. The first step in heapsort is to build a min or max heap from the array data and then delete the root element recursively and heapify the heap until there is only one node present in the heap.
- Heapsort is an efficient algorithm and it performs faster than selection sort. It may be used to sort an almost sorted array or find k largest or smallest elements in the array.
No comments
If you have any doubts, please let us Know.