Heap Sort Algorithm Explained: Implementation & Examples


7 min read 07-11-2024

Introduction

Heap sort is a comparison-based sorting algorithm that is efficient and relatively simple to understand. It uses a binary heap data structure to sort an array of elements in place. Note that heap sort is not a stable sort: elements with equal values are not guaranteed to keep their relative positions in the sorted output. Let's dive deeper into heap sort, exploring its core principles, implementation, and illustrative examples.

What is Heap Sort?

Heap sort leverages the properties of a binary heap to arrange elements in ascending or descending order. A binary heap is a complete binary tree (every level is full except possibly the last, which is filled from left to right) that satisfies the heap property:

  • Max-Heap: In a max-heap, the value of each node is greater than or equal to the values of its children.
  • Min-Heap: In a min-heap, the value of each node is less than or equal to the values of its children.
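
The max-heap property can be checked directly on an array. The sketch below assumes the usual level-order layout, where the children of index i sit at indices 2i + 1 and 2i + 2; the helper name is_max_heap is made up for illustration.

```python
def is_max_heap(arr):
    """Return True if arr, read as a level-order binary tree, is a max-heap."""
    n = len(arr)
    for i in range(n):
        left, right = 2 * i + 1, 2 * i + 2  # child indices of node i
        if left < n and arr[i] < arr[left]:
            return False
        if right < n and arr[i] < arr[right]:
            return False
    return True


print(is_max_heap([8, 5, 4, 2, 1]))  # True
print(is_max_heap([5, 1, 4, 2, 8]))  # False: 1 is smaller than its children 2 and 8
```

A min-heap check would be the same loop with the comparisons reversed.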

How Heap Sort Works

The heap sort algorithm involves two key phases:

1. Heapify: The initial array is transformed into a binary heap. This step ensures that the heap property is satisfied throughout the tree.

2. Sort: The algorithm repeatedly extracts the root of the heap (the maximum in a max-heap, which yields ascending order, or the minimum in a min-heap, which yields descending order) and places it in the last position of the shrinking heap region. This process continues until only one element remains in the heap.
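
As a rough illustration of the two phases, Python's standard heapq module can be used. This is only a sketch: heapq provides a min-heap, and this version returns a new list instead of sorting in place, so it uses O(n) extra memory. The function name heap_sort_with_heapq is made up for this example.

```python
import heapq


def heap_sort_with_heapq(values):
    """Return a new ascending list using the two heap sort phases."""
    heap = list(values)   # copy, so the caller's list is left untouched
    heapq.heapify(heap)   # phase 1: build a min-heap
    # Phase 2: repeatedly pop the smallest remaining element.
    return [heapq.heappop(heap) for _ in range(len(heap))]


print(heap_sort_with_heapq([12, 11, 13, 5, 6, 7]))  # [5, 6, 7, 11, 12, 13]
```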

Heapify Phase

The heapify phase is crucial for establishing the foundation of the heap sort algorithm. This step involves building a binary heap from the given input array. The process typically involves:

  1. Building the Heap: The input array is interpreted as a complete binary tree in level order: the children of the element at index i sit at indices 2i + 1 and 2i + 2, so no separate tree is allocated. This tree is then converted into a heap by applying the "heapify" operation to each non-leaf node from the bottom up, starting at index n/2 - 1 (rounded down) and ending at the root. Built this way, the whole heap takes O(n) time.

  2. Heapify Operation: The heapify operation ensures that a subtree rooted at a particular node follows the heap property, assuming that both of the node's child subtrees are already heaps. It works by comparing the node's value with its children's values. If the node's value violates the heap property, it is swapped with its larger (or smaller, in the case of a min-heap) child. The same check is then repeated at the child's position, recursively or in a loop, until the heap property is restored.
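
The heapify operation does not have to be recursive. A minimal iterative sketch for a max-heap is shown below; the name sift_down is an assumption for this example, and the function expects both child subtrees of index i to already be heaps.

```python
def sift_down(arr, n, i):
    """Restore the max-heap property for the subtree rooted at index i.

    Only the first n elements of arr are treated as part of the heap.
    """
    while True:
        largest = i
        left, right = 2 * i + 1, 2 * i + 2
        if left < n and arr[left] > arr[largest]:
            largest = left
        if right < n and arr[right] > arr[largest]:
            largest = right
        if largest == i:
            return  # heap property holds here, nothing left to fix
        arr[i], arr[largest] = arr[largest], arr[i]
        i = largest  # continue from the child that received the old value


data = [1, 8, 4, 2, 5]
sift_down(data, len(data), 0)
print(data)  # [8, 5, 4, 2, 1]
```

Using a loop avoids the call-stack growth of a recursive version, which can reach O(log n) frames.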

Sort Phase

Once the heap is built, the sort phase begins. This phase iteratively removes the root element (the maximum or minimum) from the heap and places it at the end of the sorted array. The process continues until the heap becomes empty, resulting in a sorted array.

Steps:

  1. Extract the Root: Swap the root node (the maximum in a max-heap) with the last element of the heap, then shrink the heap by one. The extracted element now sits in its final sorted position.

  2. Heapify: The element that was moved to the root may violate the heap property. Therefore, a heapify operation is performed on the root node to restore the heap property.

  3. Repeat: Steps 1 and 2 are repeated until only one element remains in the heap.

Implementation of Heap Sort

Here's a Python implementation of the heap sort algorithm:

def heapify(arr, n, i):
  largest = i  # Assume the root of this subtree is the largest
  l = 2 * i + 1     # Index of the left child
  r = 2 * i + 2     # Index of the right child

  # If the left child exists and is greater than the root, remember it
  if l < n and arr[largest] < arr[l]:
    largest = l

  # If the right child exists and is greater than the largest so far, remember it
  if r < n and arr[largest] < arr[r]:
    largest = r

  # If a child is larger than the root, swap them
  if largest != i:
    arr[i], arr[largest] = arr[largest], arr[i]  # Swap

    # Recursively heapify the affected subtree
    heapify(arr, n, largest)

def heap_sort(arr):
  n = len(arr)

  # Build a max heap
  for i in range(n // 2 - 1, -1, -1):
    heapify(arr, n, i)

  # One by one extract elements from the heap
  for i in range(n - 1, 0, -1):
    arr[i], arr[0] = arr[0], arr[i]  # Swap
    heapify(arr, i, 0)

# Example usage:
array = [12, 11, 13, 5, 6, 7]
heap_sort(array)
print("Sorted array is:", array)  # Sorted array is: [5, 6, 7, 11, 12, 13]

Example

Let's illustrate how the heap sort algorithm works with a simple example:

Input Array: [5, 1, 4, 2, 8]

1. Heapify Phase:

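  • Step 1: View the array as a complete binary tree (level order):
            5
          /   \
        1      4
       / \
      2   8

  • Step 2: Heapify the non-leaf nodes from the bottom up. The leaves (4, 2, and 8) are already valid one-element heaps.
    • Heapify node 1 (index 1): its larger child is 8, so swap 1 and 8:
            5
          /   \
        8      4
       / \
      2   1
    • Heapify node 5 (index 0): its larger child is 8, so swap 5 and 8. The 5 is now larger than its new children 2 and 1, so no further swap is needed:
            8
          /   \
        5      4
       / \
      2   1

  • Step 3: The heap is now built. As an array: [8, 5, 4, 2, 1]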

2. Sort Phase:

  • Step 1: Swap the root (8) with the last element of the heap (1), shrink the heap by one, and heapify the root:
    • Heap: [5, 2, 4, 1]
    • Sorted Array: [8]
  • Step 2: Swap the root (5) with the last element of the heap (1), shrink the heap, and heapify:
    • Heap: [4, 2, 1]
    • Sorted Array: [5, 8]
  • Step 3: Swap the root (4) with the last element of the heap (1), shrink the heap, and heapify:
    • Heap: [2, 1]
    • Sorted Array: [4, 5, 8]
  • Step 4: Swap the root (2) with the last element of the heap (1) and shrink the heap:
    • Heap: [1]
    • Sorted Array: [2, 4, 5, 8]
  • Step 5: The single remaining element (1) is already in its final position:
    • Heap: []
    • Sorted Array: [1, 2, 4, 5, 8]

Final Sorted Array: [1, 2, 4, 5, 8]
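
The walkthrough for [5, 1, 4, 2, 8] can be reproduced with a small tracing sketch. The names sift_down and heap_sort_trace are made up for this example; the logic is a standard in-place max-heap sort with an iterative heapify, and the print calls exist only to show the intermediate states.

```python
def sift_down(arr, n, i):
    """Iteratively restore the max-heap property below index i (heap size n)."""
    while True:
        largest = i
        left, right = 2 * i + 1, 2 * i + 2
        if left < n and arr[left] > arr[largest]:
            largest = left
        if right < n and arr[right] > arr[largest]:
            largest = right
        if largest == i:
            return
        arr[i], arr[largest] = arr[largest], arr[i]
        i = largest


def heap_sort_trace(arr):
    """Sort arr in place, printing the heap and sorted regions at each step."""
    n = len(arr)
    for i in range(n // 2 - 1, -1, -1):  # build the max-heap bottom-up
        sift_down(arr, n, i)
    print("heap built:", arr)
    for end in range(n - 1, 0, -1):
        arr[0], arr[end] = arr[end], arr[0]  # move the current maximum to its final slot
        sift_down(arr, end, 0)               # repair the smaller heap
        print("heap:", arr[:end], "sorted:", arr[end:])
    return arr


heap_sort_trace([5, 1, 4, 2, 8])
# heap built: [8, 5, 4, 2, 1]
# heap: [5, 2, 4, 1] sorted: [8]
# heap: [4, 2, 1] sorted: [5, 8]
# heap: [2, 1] sorted: [4, 5, 8]
# heap: [1] sorted: [2, 4, 5, 8]
```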

Advantages of Heap Sort

  • In-Place Sorting: Heap sort is an in-place sorting algorithm, meaning it rearranges elements within the input array and does not need a second array to store sorted elements.
  • Time Complexity: It has an average and worst-case time complexity of O(n log n), which makes it a very efficient algorithm for large datasets.
  • Reliable: Heap sort has no pathological inputs. Its worst case matches its average case, unlike quick sort, which can degrade to O(n^2).

Disadvantages of Heap Sort

  • Not Stable: Heap sort is not a stable sorting algorithm. Elements with equal values may not keep their relative positions in the sorted output, because extracting the root swaps elements that are far apart in the array.
  • Space Complexity: Heap sort needs only O(1) auxiliary space when heapify is written as a loop, but a recursive heapify uses O(log n) call-stack space, which can be a concern in memory-constrained scenarios.
  • Not Ideal for Small Datasets: Heap sort's overhead in the heapify phase can make it less efficient than simpler algorithms like insertion sort for smaller datasets.
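
Heap sort does not, in general, preserve the order of elements with equal keys. The sketch below makes this visible by sorting (key, label) records by key only; heap_sort_by_key is a hypothetical helper written for this illustration, not a library function.

```python
def heap_sort_by_key(items, key):
    """Heap sort a list in place, comparing elements only by key(item)."""
    def sift_down(n, i):
        while True:
            largest = i
            left, right = 2 * i + 1, 2 * i + 2
            if left < n and key(items[left]) > key(items[largest]):
                largest = left
            if right < n and key(items[right]) > key(items[largest]):
                largest = right
            if largest == i:
                return
            items[i], items[largest] = items[largest], items[i]
            i = largest

    n = len(items)
    for i in range(n // 2 - 1, -1, -1):  # build the max-heap
        sift_down(n, i)
    for end in range(n - 1, 0, -1):      # extract the maximum repeatedly
        items[0], items[end] = items[end], items[0]
        sift_down(end, 0)
    return items


records = [(2, "a"), (1, "x"), (2, "b")]
print(heap_sort_by_key(list(records), key=lambda r: r[0]))
# [(1, 'x'), (2, 'b'), (2, 'a')]  <- the two records with key 2 changed order
print(sorted(records, key=lambda r: r[0]))
# [(1, 'x'), (2, 'a'), (2, 'b')]  <- Python's built-in sort is stable
```

When the original order of equal elements matters, a stable algorithm such as merge sort is the safer choice.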

Applications of Heap Sort

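Heap sort and the binary heap behind it find applications in various domains, including:

  • Data Structures: The binary heap that powers heap sort is the standard way to implement priority queues, which require efficient retrieval of the highest or lowest priority element.
  • Operating Systems: Heap-based priority queues appear in process scheduling and timer management. Heap sort itself can suit kernel and embedded code because it needs O(1) extra memory and has a guaranteed O(n log n) running time.
  • Database Systems: Heaps are used for top-k queries, where only the k best rows are kept, and for merging sorted runs during external sorting.
  • Algorithm Design: Heap sort serves as the fallback in introsort, a hybrid algorithm that starts with quick sort and switches to heap sort when the recursion depth grows too large, which guarantees O(n log n) worst-case time.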
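
A priority queue built on a binary heap can be sketched with Python's standard heapq module. heapq is a min-heap, so the smallest priority number is served first; the task names and priorities below are invented for illustration.

```python
import heapq

tasks = []  # heapq operates on a plain list
heapq.heappush(tasks, (2, "write report"))
heapq.heappush(tasks, (1, "fix outage"))
heapq.heappush(tasks, (3, "refill coffee"))

order = []
while tasks:
    priority, name = heapq.heappop(tasks)  # always removes the smallest priority
    order.append(name)

print(order)  # ['fix outage', 'write report', 'refill coffee']
```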

Heap Sort vs. Other Sorting Algorithms

It's helpful to compare heap sort with other commonly used sorting algorithms to understand its strengths and weaknesses:

| Algorithm      | Time Complexity (Average) | Space Complexity | Stable |
|----------------|---------------------------|------------------|--------|
| Bubble Sort    | O(n^2)                    | O(1)             | Yes    |
| Insertion Sort | O(n^2)                    | O(1)             | Yes    |
| Selection Sort | O(n^2)                    | O(1)             | No     |
| Merge Sort     | O(n log n)                | O(n)             | Yes    |
| Quick Sort     | O(n log n)                | O(log n)         | No     |
| Heap Sort      | O(n log n)                | O(1)             | No     |

  • Bubble Sort, Insertion Sort, and Selection Sort: These algorithms have an average time complexity of O(n^2), making them less efficient for large datasets compared to algorithms like heap sort.

  • Merge Sort: Merge sort is also an efficient algorithm with O(n log n) time complexity, but it requires O(n) additional space, making it less space-efficient than heap sort.

  • Quick Sort: Quick sort is generally faster than heap sort in practice, but it has a worst-case time complexity of O(n^2). Its average time complexity is O(n log n).

Conclusion

Heap sort is an efficient and reliable sorting algorithm that offers a balance between time and space complexity. Its in-place nature and guaranteed O(n log n) worst case make it a suitable choice for various applications, especially when dealing with larger datasets and when stability is not required. While it may not always be the fastest algorithm, its predictable performance makes it a valuable tool in the sorting algorithm arsenal.

FAQs

1. What is the difference between a heap and a binary tree?

A binary tree is a tree data structure where each node can have at most two children. A heap is a special type of binary tree that satisfies the heap property, which means that the value of each node is greater than or equal to (or less than or equal to) the values of its children.

2. How does the heap sort algorithm handle duplicates?

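Heap sort sorts arrays containing duplicates correctly, but it does not preserve their relative order. Because the algorithm is not stable, elements with equal values may appear in a different order than in the input. If that order matters, use a stable algorithm such as merge sort.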

3. Is heap sort suitable for sorting linked lists?

Heap sort is generally not the most suitable algorithm for sorting linked lists. Since heap sort requires random access to elements, which is not efficient for linked lists, other algorithms like merge sort or insertion sort are often preferred.

4. What is the space complexity of heap sort?

The auxiliary space complexity of heap sort is O(1) when heapify is implemented as a loop, which makes it an in-place sorting algorithm. A recursive heapify adds O(log n) call-stack space.

5. What is the use of heap sort in real-world applications?

The binary heap behind heap sort is used for priority queues in operating systems and database systems, for example in process scheduling and top-k queries. Heap sort itself is used where a guaranteed O(n log n) running time and O(1) extra memory matter, and it serves as the worst-case fallback in introsort, a hybrid of quick sort and heap sort.