Heapsort

Algorithms Heapsort

Heapsort is among the smarter algorithms, which are much faster than most of the ones we've covered. It's built on the Selection sort algorithm, so it's based on the extremes (in this case the greatest values), which are always moved to the end of the array. After integrating all of the greatest values to the end, we get a sorted array. The problem with Selection sort algorithm is that you have to manually search for the greatest/smallest values. In each nested loop, it has to go through the entire unsorted array part and check each item to see if it's the greatest value. There is no quicker way to find an array's greatest values.

However, what if there was a data structure in which we'd be able to represent items the same way as arrays, and also be able to find the greatest value in a constant amount of time without having to gothrough it all? With me being extremely specific and all, you probably guessed that this structure does indeed exist. The structure is called Heap (that's why the approach is called Heap Sort).

A heap is a binary tree with the following properties:

  • All "levels" of the heap except for the last one are fully occupied by items (each inner node has exactly two children, so the tree is very balanced)
  • The last level of the heap is filled from what's left (it can even be entirely filled)
  • All items in a heap have a special property: both children are always smaller or equal to their parent.

Knowing this special property, we can deduce that the root will always contain the greatest value which is very handy for us.

A heap looks something like this:

Heap – Heapsort

Of course, we can define special properties of the heap the other way and have the smallest value appear in the root (this approach is used in some projects as well).

Since we usually sort an array and not a heap, we'll have to build the heap from the given array. As some of you may have guessed, we'll create the tree structure using pointers or references and add the items into it. We can represent the heap using an ordinary array (thanks to knowing it's balanced) using a simple trick. This way, we won't need an auxiliary structure/memory. We'll work directly in our array, which we'll see as a heap so that we can sort the items directly in place.

The heap in the picture above would look something like this (the item indexes are above, whereas, the items are below):

0 1 2 3 4 5 6 7 8 9
9 7 8 5 4 0 3 2 1 2

Notice that there're almost no similarities between the heapified array and the sorted one. Now let's try to work with this array as if it was a heap. You probably noticed that the items in the array are placed just like they would be in a heap - sorted from top to bottom and from left to right. If we wanted to access the root, we'd simply go to the 0th index. The last item of the heap is also on the index of the last item in the array. However, we'll need the access from any parent to it's children and vice-versa.

If a child has an index of i, we can calculate it's parent's index using (i - 1) / 2 (it's a whole-number division). If a parent has an index of i, its left child's index is the result of (2 * i + 1) and the right child's index is the result of (2 * i) + 2.

We can quickly check whether it worked as well. Let's try to find the child of item 7 which has an index of 1 (the array is indexed from 0 so it's the second item). (1 - 1) / 2 is 0. Its parent should be item 9, which proves to be correct. Its children would be (2 * 1 + 1) = 3 and the third index holds a five - which is indeed its left child. The right child has the same index incremented by one, which also proves to be correct.

Now, we have all the needed tools and we can go try the first step of the algorithm - heapifying the array.

Process

Heapifying the array (building the heap)

Consider an array of the following items:

3 2 5 1 9 4

The array represents the following "heap":

Heapifying the array – building the heap – Heapsort

I used the quotes because this heap is broken - the special property of the heap doesn't apply here. Therefore, we've to fix it first and then heapify the array. To do so, we'll use the up function, which we'll continuously call on the items starting from the root. This function will find out whether the item breaks the special property of the heap (that the children aren't larger than their parent). If it finds one that is, it'll swap the item with its parent. However, doing so might just bring the problem up a level, which is why this function repeats until the special property of the heap applies for the item or until we reach the root. This function is called once for each item until it reaches the end. Then, we can be sure that the heap has been corrected.

Since I like to avoid the article to be too long, I'll only describe the heapifying progress in words - keep looking at the picture above.

We won't use the up function on the first item (the root), because it doesn't have a parent. Therefore, we'll start with item 2, which is smaller than its parent so we'll leave it in place. The next item is a 5 which is greater than its parent (3) so we'll swap them and end up at the root which means we're done. Item 1 is smaller than 2 which is correct. However, 9 is greater than 2 so we'd end up swapping them. However, 9 is still greater than 3 so we'd swap them again and 9 (the greatest value) would then be the root. 4 is greater than 3 so we'd swap them. 4 would then be smaller than 9, which is correct. The heapified array and the heap which it represents look like this:

9 5 4 1 2 3
Heaped array – building the heap – Heapsort

Now, let's move on to actually getting it sorted.

Sorting

As I mentioned before, we'll literally perform a Selection sort to the heap. Meaning that we're going to swap the greatest value with the last item, and continue repeating this until we get to the end. The greatest value always lies at the 0th index, so we'd be able to reach it in a constant amount of time. Swapping also does not take any time at all.

However, we still have a bit of a problem - swapping the items will break the heap for sure. We don't care about the original greatest value (just like with Selection sort). However, the item which would then be the new root probably wouldn't be the greatest value. Therefore, we'll repair the heap using a similar function as was the up function - the down function. We'll run it starting from the root and if we find children that are greater than their parent, we'll swap them out. In case they're both greater, we'll swap the parent with its greater child. We could also run into the same problem while using the up function. Therefore, this way, we have to look again at the child (which is now the parent) and find out whether it's greater than its children. We repeat the process until the heap's special property is met or we reach the very end. We have to call this function each time we swap the root with the last item.

If we think about it, we'll figure out that neither tearing the greatest value nor swapping out values costs us any time. The only problem with this approach is the function down which has to be executed n times (for every maximum). On top of that, in each call, it has to be repeated at least log n times because the heap is a balanced binary tree and has a height of log n where the base of the logarithm is 2 because each parent has 2 children. The function checks each item in each level. In the worst scenario it would go through the entire level, which is log n. The up function has the same time complexity as the down function - O(n log n). Overall, we'd call the function up once and the function down n times, so the resulting complexity of Heapsort is O(n log n).

All in all, this algorithm showed a huge improvement in performance, especially, on bigger arrays. However, the algorithm isn't stable, so the more demanding amongst you might not be satisfied with it. In that case, you'll have to read the next article about Merge sort or Quicksort.

Algorithm properties

Time complexity O (n log n)
Stability No
Speed Very good

Note. the time complexity is shown according to average cases and the speed is like this when compared to all of the other sorting algorithms.

Source code

// repair the heap up
public static void up(int[] list, int i) {
  int child = i; // save the child
  int parent, temp;
  while (child != 0) {
  parent = (child - 1) / 2; // parent
  if (list[parent] < list[child]) { // error detection
    temp = list[parent]; // swap the child with the parent
    list[parent] = list[child];
    list[child] = temp;
    child = parent; // new child
  }
  else
    return; // ok
  }
}

// repair the heap down
public static void down(int[] list, int last) {
  int parent = 0;
  int child, temp;
  while (parent * 2 + 1 <= last) {
    child = parent * 2 + 1;
    // if the smaller child is chosen
    if ((child < last) && (list[child] < list[child + 1]))
    child++; // we pick the greater
    if (list[parent] < list[child]) { // error detection
      temp = list[parent]; // swap the child with the parent
      list[parent] = list[child];
      list[child] = temp;
      parent = child; // new parent
    }
    else
      return;
    }
}

// build the heap from an array
public static void heapify(int[] list) {
  for (int i = 1; i < list.length; i++)
  up(list, i);
}

// sorting
public static void heapsort(int[] list) {
  heapify(list);
  int index = list.length - 1; // the last item
  int temp;
  while (index > 0) {
    temp = list[0]; // swap the last item with the maximum
    list[0] = list[index];
    list[index] = temp;
    index -= 1; // set new last item
    down(list, index);
  }
}
// repair the heap up
public static void up(int[] list, int i) {
  int child = i; // save the child
  int parent, temp;
  while (child != 0) {
    parent = (child - 1) / 2; // the parent
    if (list[parent] < list[child]) { // error detection
      temp = list[parent]; // swap the child with the parent
      list[parent] = list[child];
      list[child] = temp;
      child = parent; // new child
    }
    else
      return; // ok
    }
}

// repair heap down
public static void down(int[] list, int last) {
  int parent = 0;
  int child, temp;
  while (parent * 2 + 1 <= last) {
    child = parent * 2 + 1;
    // if the smaller son is chosen
    if ((child < last) && (list[child] < list[child + 1]))
    child++; // pick the greater one
    if (list[parent] < list[child]) { // error detection
      temp = list[parent]; // swap the child with the parent
      list[parent] = list[child];
      list[child] = temp;
      parent = child; // new parent
    }
    else
      return;
    }
}

// build the heap from the array
public static void heapify(int[] list) {
  for (int i = 1; i < list.Length; i++)
  up(list, i);
}

// sorting
public static void heapsort(int[] list) {
  heapify(list);
  int index = list.Length - 1; // the last item
  int temp;
  while (index > 0) {
    temp = list[0]; // swap the last item with the maximum
    list[0] = list[index];
    list[index] = temp;
    index -= 1; // set new last item
    down(list, index);
  }
}
// repair the heap up
procedure up(var list: array of integer; i: integer);
var parent, child, temp: integer;
begin
  child:=i; // save the child
  while (child <> 0) do begin
    parent:=(child - 1) div 2; // the parent
    if (list[parent] < list[child]) then begin // error detection
      temp:=list[parent]; // swap the child with the parent
      list[parent]:=list[child];
      list[child]:=temp;
      child:=parent; // new child
    end
    else exit; // OK
  end;
end;

// repair the heap down
procedure down(var list: array of integer; last: integer);
var parent, child, temp: integer;
begin
  parent:=0;
  while (parent * 2 + 1 <= last) do begin
    child:=parent * 2 + 1;
    // if the smaller child was chosen
    if ((child < last) and (list[child] < list[child + 1])) then
    inc(child); // choose the greater
    if (list[parent] < list[child]) then begin // error detection
      temp:=list[parent]; // swap the child with the parent
      list[parent]:=list[child];
      list[child]:=temp;
      parent:=child; // new parent
    end else exit;
  end;
end;

// build the heap from the array
procedure heapify(var list: array of integer);
var i: integer;
begin
  for i:=1 to (length(list) - 1) do
  up(list, i);
end;

// sorting
procedure heapsort(var list: array of integer);
var temp, index: integer;
begin
  heapify(list);
  index:=length(list) - 1; // the last item
  while (index > 0) do begin
    temp:=list[0]; // swap the last item with the maximum
    list[0]:=list[index];
    list[index]:=temp;
    dec(index); // set new last item
    down(list, index);
  end;
end;
# repair the heap up
def up(list, i)
  child = i # save the child
  while (child != 0)
    parent = (child - 1) / 2 # the parent
    if (list[parent] < list[child]) # error detection
      temp = list[parent] # swap the child with the parent
      list[parent] = list[child]
      list[child] = temp
      child = parent # new child
    else
      return # OK
    end
  end
end

# repair the heap down
def down(list, last)
  parent = 0
  while (parent * 2 + 1 <= last)
    child = parent * 2 + 1
    # if the smaller child was chosen
    if ((child < last) && (list[child] < list[child + 1]))
      child += 1 # choose the greater
    end
    if (list[parent] < list[child]) # error detection
      temp = list[parent] # swap the child with the parent
      list[parent] = list[child]
      list[child] = temp
      parent = child # new parent
    else
      return
    end
  end
end

# build the heap from the array
def heapify(list)
  1.upto(list.length - 1) { |i| up(list, i) }
end

# sorting
def heapsort(list)
  heapify(list)
  index = list.length - 1 # the last item
  while (index > 0)
    temp = list[0] # swap the last item with the maximum
    list[0] = list[index]
    list[index] = temp
    index -= 1 # set new last item
    down(list, index)
  end
end

Implementation of this algorithm is based on materials of Unicorn College from professor Ondřej Kučera.


 

 

Article has been written for you by David Jancik
Avatar
Do you like this article?
1 votes
.
Thumbnail
Previous article
Insertion sort
Thumbnail
All articles in this section
Algorithms
Thumbnail
Next article
Quicksort
Activities (8)

 

 

Comments

To maintain the quality of discussion, we only allow registered members to comment. Sign in. If you're new, Sign up, it's free.

No one has commented yet - be the first!