Heapsort

Algorithms Heapsort

Heapsort is among the smarter algorithms, which are much faster than most of the ones we've covered. It is built on the Selection sort algorithm, so it's based on the extremities (in this case the greatest values), which are always moved to the end of array. After integrating all of the greatest values to the end, we get a sorted array. The problem with Selection sort is that you have to manually search for the greatest/smallest values. In each nested loop, it has to go through the entire unsorted array part and check each item to see if it's the greatest value. There is no quicker way to find an array's greatest values.

However, what if there was a data structure in which we'd be able to represent items the same way as arrays, and also be able to find the greatest value in a constant amount of time without having to gothrough it all? With me being extremely specific and all, you probably guessed that this structure does indeed exist. The structure is called Heap (that's why the approach is called Heap Sort).

A heap is a binary tree with the following properties:

  • All "roots" of the heap except for the last one are fully occupied by items (each internal peak has two children, so the tree is very balanced)
  • The last root of the heap is filled from what's left (it can even be entirely filled)
  • All items in a heap have a special property: both children are always smaller or equal to their parent.

Knowing this special property, we can infer that the root will always contain the greatest value which is very handy for us.

A heap looks something like this:

Heap – Heapsort

Of course, we can define special properties of the heap the other way and have the smallest value appear in the root (this approach is used in some projects as well).

Since we usually sort the array and not the heap, we'll have to build the heap from said array. As some of you may have guessed, we'll create the tree structure using pointers or references and add the items into it. We can represent the Heap using an ordinary array (so that it's balanced) using a simple trick. This way, we won't need an auxiliary strucutre/memory. We'll work directly in our array, which we'll see as a heap so that we can sort the items directly in place.

The heap in the picture above would look something like this (the item indexes are above, whereas, the items are below):

0 1 2 3 4 5 6 7 8 9
9 7 8 5 4 0 3 2 1 2

Notice that there are no similiarities between the "heap array" and the sorted one. Now let's try to work with this array as if it were a heap. You probably noticed that the items in the array are placed just like they would be in a heap - sorted from top to bottom and from left to right. If we wanted to access the root, we'd simply go to the 0th index. The last item of the heap is also on the index of the last item in the array. However, we'll need the access from any parent to it's children and vice-versa.

If the child had an index of i, we'd calculate it's parent's index using (i - 1) / 2 (the division is of an integer) If the parent had an index of i its left-most child's index would be the result of (2 * i + 1) and the right-most child's index would be the result of (2 * i) + 2.

We can quickly check whether it worked as well. Lets try to find the child of item 7 which has an index of 1 (the array is indexed from 0 so it's the second). (1 - 1) / 2 is 0. Its parent should be item 9, which proves to be correct. Its children would be (2 * 1 + 1) = 3 and the third index holds a five - which is indeed its left-most child. The right-most child has the same index incremented by one, which also proves to be correct.

Now, we have all the needed tools and we can go try the first step of the algorithm - heaping the array.

Process

Heaping the array (building heap)

Consider an array of the following items:

3 2 5 1 9 4

The array represents the following "heap":

Heaping the array – building the heap – Heapsort

I used the quotes because this heap is broken - the special property of the heap is not correct here. Therefore, we have to fix it first and then heap the array. To do so, we'll use the up function, which we'll continuously call on the items starting from the root. This function will find out whether the item breaks the special property of the heap (that the children aren't larger than their parent). If it finds one that is, it'll swap the item with its parent. However, doing so might just bring the problem up a level, which is why this function repeats until the special property of heap is found to be incorrect or it finds the root once more. This function is called once for each item until it reaches the end. Then, we can be sure that the heap has been corrected.

Since I like to keep things simple, I'll only go over the progress heaps go through in this article - look at the shown picture above. You may also watch the video below, where this part is shown in detail.

We won't use the up function on the first item (the root), because it doesn't have a parent. Therefore, we'll start with item 2, which is smaller than its parent so we'll leave it in place. The next item is a 5 which is bigger than its parent (3) so we'll swap them and end up at the root which means we're done. Item 1 is smaller than 2 which is correct. However, 9 is bigger than 2 so we'd end up swapping them. However, 9 is still bigger than 3 so we'd swap them again and 9 (the greatest value) would then be the root. 4 is bigger than 3 so we'dd swap them. 4 would then be smaller than 9, which is correct. Heaped arrays represent a heap, and look something like this:

9 5 4 1 2 3
Heaped array – building the heap – Heapsort

Now, let's move on to actually getting it sorted.

Sorting

As I mentioned before, we literally add a Selection sort to the heap. Meaning that we're going to swap the greatest value with the last item, and continue repeating this until we get to the end. The greatest value always lies at the 0th index, so we'd be able to reach it in a constant amount of time. Swapping also does not take any time at all.

However, we still have a bit of a problem - swapping the items will break the heap. We don't care about the original greatest value (just like with Selection sort). However, the item which would then be the new root probably wouldn't be the greatest value. Therefore, we'll repeat the heap using - function down. We'll run it starting from the root and if we find children that are bigger than their parent, we'll swap them out. In case they're both bigger, we'll swap the parent with bigger child. We could also run into the same problem while using the up function. However, this way, we have to look at the child (which is now the parent) and find out whether it's bigger than its chuldren. We repeat the process until the heap's special property is met or we reach the very end. We have to call this function after each row swap as well.

If we thought about it even further, we'd figure out that neither tearing the greatest value nor swapping out values costs us any time. The only problem with this approach is the function down which has to be executed n times (to reach the greatest value). On top of that, in each call, it has to be repeated at least log n times because the heap is a balanced binary tree and has a height of log n where the base of the logarithm is 2 because each parent has 2 children. The function checks each item in each floor. In the worst scenario it would go through the entire floor, which is log n. The up function has the same time complexity as the down function - 0(n log n). Overall, we'd call function up once and function down n times, so the resulting complexity of heapsort is 0(n log n)

All in all, this algorithm showed a huge improvement in performance, especially, on bigger arrays. However, the algorithm isn't stable, so the more demanding amongst you might not be satisfied with it. In that case, you'll have to read the next article about Merge sort or Quicksort.

Properties of an Algorithm

Time complexity O (n log n)
Stability No
Speed Very good

Note. the time complexity is shown according to average cases and the speed is like this when compared to all of the other sorting algorithms.

Source code

// repair heap up
public static void up(int[] list, int i) {
  int child = i; // save son
  int parrent, temp;
  while (child != 0) {
  parrent = (child - 1) / 2; // father
  if (list[parrent] < list[child]) { // error detection
    temp = list[parrent]; // swap son with father
    list[parrent] = list[child];
    list[child] = temp;
    child = parrent; // new son
  }
  else
    return; // ok
  }
}

// repeair heap down
public static void down(int[] list, int last) {
  int parrent = 0;
  int child, temp;
  while (parrent * 2 + 1 <= last) {
    child = parrent * 2 + 1;
    // if the smaller son is choosen
    if ((child < last) && (list[child] < list[child + 1]))
    child++; // we pick the bigger
    if (list[parrent] < list[child]) { // error detection
      temp = list[parrent]; // swap son with father
      list[parrent] = list[child];
      list[child] = temp;
      parrent = child; // new father
    }
    else
      return;
    }
}

// build the heap from array
public static void heapify(int[] list) {
  for (int i = 1; i < list.length; i++)
  up(list, i);
}

// sorting
public static void heapsort(int[] list) {
  heapify(list);
  int index = list.length - 1; // last item
  int temp;
  while (index > 0) {
    temp = list[0]; // swap the last item with maximum
    list[0] = list[index];
    list[index] = temp;
    index -= 1; // set new last item
    down(list, index);
  }
}
// repair heap up
public static void up(int[] list, int i) {
  int child = i; // save son
  int parrent, temp;
  while (child != 0) {
    parrent = (child - 1) / 2; // father
    if (list[parrent] < list[child]) { // error detection
      temp = list[parrent]; // swap son withfather
      list[parrent] = list[child];
      list[child] = temp;
      child = parrent; // new son
    }
    else
      return; // ok
    }
}

// repair heap down
public static void down(int[] list, int last) {
  int parrent = 0;
  int child, temp;
  while (parrent * 2 + 1 <= last) {
    child = parrent * 2 + 1;
    // if the smaller son is chosen
    if ((child < last) && (list[child] < list[child + 1]))
    child++; // pick the bigger one
    if (list[parrent] < list[child]) { // error detection
      temp = list[parrent]; // swap son with father
      list[parrent] = list[child];
      list[child] = temp;
      parrent = child; // new father
    }
    else
      return;
    }
}

// build heap from array
public static void heapify(int[] list) {
  for (int i = 1; i < list.Length; i++)
  up(list, i);
}

// sorting
public static void heapsort(int[] list) {
  heapify(list);
  int index = list.Length - 1; // last item
  int temp;
  while (index > 0) {
    temp = list[0]; // swap last item with maximum
    list[0] = list[index];
    list[index] = temp;
    index -= 1; // set new last item
    down(list, index);
  }
}
// repair heap up
procedure up(var list: array of integer; i: integer);
var parrent, child, temp: integer;
begin
  child:=i; // save son
  while (child <> 0) do begin
    parrent:=(child - 1) div 2; // father
    if (list[parrent] < list[child]) then begin // error detection
      temp:=list[parrent]; // swap son with father
      list[parrent]:=list[child];
      list[child]:=temp;
      child:=parrent; // new son
    end
    else exit; // OK
  end;
end;

// repair the heap down
procedure down(var list: array of integer; last: integer);
var parrent, child, temp: integer;
begin
  parrent:=0;
  while (parrent * 2 + 1 <= last) do begin
    child:=parrent * 2 + 1;
    // if the smaller son was chosen
    if ((child < last) and (list[child] < list[child + 1])) then
    inc(child); // choose the bigger
    if (list[parrent] < list[child]) then begin // error detection
      temp:=list[parrent]; // swap son with father
      list[parrent]:=list[child];
      list[child]:=temp;
      parrent:=child; // new father
    end else exit;
  end;
end;

// build heap from array
procedure heapify(var list: array of integer);
var i: integer;
begin
  for i:=1 to (length(list) - 1) do
  up(list, i);
end;

// sorting
procedure heapsort(var list: array of integer);
var temp, index: integer;
begin
  heapify(list);
  index:=length(list) - 1; // last item
  while (index > 0) do begin
    temp:=list[0]; // swap last item with maximum
    list[0]:=list[index];
    list[index]:=temp;
    dec(index); // set new last item
    down(list, index);
  end;
end;
# repair heap up
def up(list, i)
  child = i # save son
  while (child != 0)
    parrent = (child - 1) / 2 # father
    if (list[parrent] < list[child]) # error detection
      temp = list[parrent] # swap son with father
      list[parrent] = list[child]
      list[child] = temp
      child = parrent # new son
    else
      return # OK
    end
  end
end

# repair heap down
def down(list, last)
  parrent = 0
  while (parrent * 2 + 1 <= last)
    child = parrent * 2 + 1
    # if the smaller son was chosen
    if ((child < last) && (list[child] < list[child + 1]))
      child += 1 # choose the bigger
    end
    if (list[parrent] < list[child]) # error detection
      temp = list[parrent] # swap son with father
      list[parrent] = list[child]
      list[child] = temp
      parrent = child # new father
    else
      return
    end
  end
end

# build the heap from array
def heapify(list)
  1.upto(list.length - 1) { |i| up(list, i) }
end

# sorting
def heapsort(list)
  heapify(list)
  index = list.length - 1 # last item
  while (index > 0)
    temp = list[0] # swap last item with maximum
    list[0] = list[index]
    list[index] = temp
    index -= 1 # set new last item
    down(list, index)
  end
end

Implementation of this algorithm is based on materials of Unicorn College from professor Ondřej Kučera.


 

  Activities (7)

Article has been written for you by David Jančík
Avatar

Do you like this article?
Total (1 votes) :
333 33


 


Thumbnail
Previous article
Insertion sort
Thumbnail
All articles in this section
Algorithms

 

 

Comments

To maintain quality of discussion, we only allow registered members to comment. Sign in. If you're new, Sign up, it is free.

Nobody has commented yet - be the first!