Quicksort

Algorithms Quicksort

As the name suggests, Quicksort is quick. In fact, it's the fastest algorithm that is actually used in real applications to sort items. This is why this article will be a bit longer than the others. It has a good behavior on arrays of any size and is not rough on memory usage. The algorithm is based on the principle of divide and conquer, which has already been explained in the Merge sort article.

Quicksort marks one item in the array as a so-called pivot. Don't worry about choosing the pivot, for now, we'll just take the first item in the array. Then, we'd call the divide function, which will re-organize the array so that it begins from the left with all of the items that are less than the pivot. Next, it would place the pivot and the rest of items would be all of the items that are greater than the pivot.

Notice that I've been using the word reorganize, not sort. The items are still shuffled and the only sorting there is the division by the pivot. We're able to make this operation in linear time and very quickly as well. Now, we'll simply call the algorithm recursively on the left half of the array before the pivot and on the right half after the pivot (we'll keep the pivot in the same position). We'll do the same with the halves as we did with the original array and continue doing so until we get to a single item in the array (an array with a size of 1). Once we exit the recursion, the array will be completely sorted. Now, let's try the algorithm out and go further into the division function.

Process

Since Quicksort is very fast, we'll try it out on a larger array this time, so we can see the benefits of its performance :)

Consider an array of the following items:

5 2 9 3 5 1 8

We'll make the first item the pivot (5).

5 2 9 3 5 1 8

Next, we'll call the divide function on the array. The function will save the pivot at the end of the array at first so it won't stand in the way (also, so we can refer to it easily). This is done by simply swapping the pivot with the last item.

8 2 9 3 5 1 5

Then, we'll hold the position from which the items greater than the pivot begin, i.e. the position of the first item that is greater than the pivot (hereinafter, position). Since we're at the very beginning, the position will have a value of 0. We'll iterate through the array from the beginning to the last but one item (because the last one is the pivot). If there's an item in the array that's less than the pivot, it'll swap itself out with the item at the current position. At that moment, the number of items less than the pivot will increase. Therefore, we'll have to increment the position by 1. Once we get to the very end, we swap the pivot with the position (one of the items greater than the pivot will be at the end and the pivot will define both parts of the array). Since the pivot position will move (we placed the lesser items before it, so it would hardly ever be at the beginning). Keep in mind that the division function must return this new position, so Quicksort could work with it. In the following algorithm visualization, I'll symbolize the position as a dot.

.8 2 9 3 5 1 5

We'll start with the first item (8) in our array (the position is also at the beginning). Since the item is greater than the pivot (5), we'll leave it as it is. In other words, we won't change the position and we'll simply go to the next item (2).

.8 2 9 3 5 1 5

2 is less than 5, so we'll swap it with the item at our current position (so the 2 will be swapped with the 8) and we'll move the position by 1 to the right. Now, let's move on to the next item (9).

2 .8 9 3 5 1 5

9 stays in the same place (9 > 5). Now, let's move on to the next item (3).

2 .8 9 3 5 1 5

3 < 5, we'll swap them, increment the position, and approach the next item (5).

2 3 .9 8 5 1 5

5 is not less than 5, so we let it be and move on to the last item (1).

2 3 .9 8 5 1 5

1 < 5, swap them, increment the position.

2 3 1 .8 5 9 5

Last of all, we'll swap the pivot with the item on the current position.

2 3 1 5 5 9 8

So this is what it looks like after calling the divide function. We must not forget to return the position, so that Quicksort knows where the pivot is. Notice, that the result is not a sorted array at all. The array is simply split up into two parts using the pivot. Essentially, we get another 2 arrays [2, 3, 1] and [5, 9, 8]. Then, we "split" and run Quicksort on both of these arrays (more accurately on the 2 parts of the original array). In fact, we haven't created any new arrays, we're just able to look at them as such. First and foremost, we'll work with the first one, the second branch will wait for now. The original array won't disappear, we'll simply work with a part of it (the rest will be highlighted in gray).

We choose the pivot (2)

2 3 1 5 5 9 8

Call the divide function on part of the array. The result will look like this:

1 2 3 5 5 9 8

Now, if we split the array again, we'll get 2 one item arrays [1] and [3]. This is exactly what we need because one item arrays are considered trivially sorted. Quicksort will not call the divide function on the one item arrays and the left half of the array will have been completed. We'll exit the recursion and switch to the right half. We'll choose 5 as the pivot.

1 2 3 5 5 9 8

After calling the divide function, there will be no change. The pivot will remain at the beginning and both items will be greater than it. This wouldn't do much and the array would only be reduced by the pivot. In the new part of the array, we'll make the pivot (9).

1 2 3 5 5 9 8

The divide function will return the following result:

1 2 3 5 5 8 9

The single item array ([8]) is seen as sorted. We exit the recursion and the array is sorted.

1 2 3 5 5 8 9

Time complexity

In regards to stability, I must admit, that our divide function is not stable to keep it simple. However, it can be re-written to be stable. But how is it with the time complexity? I already mentioned that the divide function is done in linear time, so its time complexity is O(n). Then, that makes you ask, how many times is it being done? There is no clear answer to that and if you expect some kind of a pitfall, then you're right. In the ideal case, the divide function will divide the array into two equal halves. If we divide something into halves the complexity will be the logarithm with the base of two so O(log n). And since the division itself takes O(n), the complexity of the whole algorithm in this particular case would be our favorite O(n log n). It has been proven that Quicksort has this complexity even in the average case (even if the "halves" are not exactly of the same size).

However, how it usually is, the extreme speed of Quicksort is balanced by the fact it doesn't act well on presorted arrays. While Bubblesort did a great job on presorted arrays, Quicksort will take the first item as a pivot (so the smallest or greatest number) and the divide function won't logically move any item before the pivot. So the new array will always be smaller by one item and the second array won't be created at all. The complexity would then be lowered to O(n2). Another issue arises with this problem, hacker attacks on information systems. Imagine, that you know that some company sorts data in their database using Quicksort which always chooses the first item as the pivot. If you send them thousands of presorted arrays, their algorithm's time complexity will be lowered to O(n2), and the server will crash since it isn't ready for this load. This problem can be easily fixed, however, it is very important to recognize this vulnerability.

Quicksort variations

Choosing the first item as the pivot doesn't seem as the best idea. If we choose the last, the problem will still be there. Maybe another simple solution has crossed your mind: pick the item in the middle or the item located at 2/3 of the array. This might confuse the attacker, but if they know our source codes, they'll be able to create arrays that would get the complexity down to O(n2) again.

Another solution might be to choose the median as the pivot. We'd then split the array up into exactly two halves. There's even an algorithm which is capable of finding the median in linear time. Quicksort would then have an asymptotic time complexity of O(n log n). However, finding the median will slow down the algorithm, so this approach is usually not worth it.

Now, what if we just pick the pivot randomly? Yeah, that's it! This version is called Randomized Quicksort. Picking a random number will not slow down the algorithm by much. However, there practically are no "random" numbers involved. Numbers generated by computers are referred to as pseudorandom numbers. Random number generators mostly work with system time and number series. For example, random generators used by Unix systems are known to use noise from the sound card or the temperature of the CPU to calculate results. Generators used for army cryptography might use isotope fission and similar phenomena. With that said, let's get back to the randomized Quicksort. In 99% of the cases, Randomized Quicksort is a very safe and fast algorithm. Practically unattackable, even though theoretically no numbers are really random. However, if any of you are unsatisfied with this solution, feel free to look into Introsort (a variation of Quicksort).

Introsort is Quicksort, which doesn't have to be secured against the aforementioned attacks, so it can just choose the first item as the pivot. In addition to that, it checks the current depth of the recursion. If it goes over log n, Quicksort will switch to heapsort and the rest of the array is sorted using Heapsort, which has a guaranteed time complexity of O(n log n) on any array. Introsort is used very often, i.e. it's the algorithm that most of today's programs use to sort their data.

Algorithm properties

Time complexity O(n log n) for average cases, O(n2)
Stability Can be implemented stable
Speed Very fast

Note: the time complexity is meant for average cases, and the speed is like this when compared to all other sorting algorithms.

Source codes

// rearranges the array into three parts: items that are less than the pivot, the pivot, and items that are greater than the pivot
public static int divide(int[] list, int left, int right, int pivot) {
  int temp = list[pivot]; // swap the pivot out with the last item
  list[pivot] = list[right];
  list[right] = temp;
  int i = left;
  for (int j = left; j < right; j++) {
    if (list[j] < list[right]) { // the item is less than the pivot
      temp = list[i]; // swap the pivot with the item at the current position
      list[i] = list[j];
      list[j] = temp;
      i++; // increment the position
    }
  }
  temp = list[i]; // swap the pivot back
  list[i] = list[right];
  list[right] = temp;
  return i; // returns the pivot's new index
}

public static void limited_quicksort(int[] list, int left, int right) {
  if (right >= left) { // recursion condition
    int pivot = left; // chooses the pivot
    int new_pivot = divide(list, left, right, pivot);
    // recursive calling to both parts of the array
    limited_quicksort(list, left, new_pivot - 1);
    limited_quicksort(list, new_pivot + 1, right);
  }
}

// calls limited Quicksort on the whole array
public static void quicksort(int[] list) {
  limited_quicksort(list, 0, list.length - 1);
}
// rearranges the array into three parts: items that are less than the pivot, the pivot, and items that are greater than the pivot
public static int divide(int[] list, int left, int right, int pivot) {
  int temp = list[pivot]; // swap the pivot with the last item
  list[pivot] = list[right];
  list[right] = temp;
  int i = left;
  for (int j = left; j < right; j++) {
    if (list[j] < list[right]) { // the item is less than the pivot
      temp = list[i]; // swap the pivot with the item at the current position
      list[i] = list[j];
      list[j] = temp;
      i++; // increment the position
    }
  }
  temp = list[i]; // swap the pivot back
  list[i] = list[right];
  list[right] = temp;
  return i; // returns the new index of the pivot
}

public static void limited_quicksort(int[] list, int left, int right) {
  if (right >= left) { // recursion condition
    int pivot = left; // chooses the pivot
    int new_pivot = divide(list, left, right, pivot);
    // recursive calling to both parts of the array
    limited_quicksort(list, left, new_pivot - 1);
    limited_quicksort(list, new_pivot + 1, right);
  }
}

// calls limited Quicksort on the whole array
public static void quicksort(int[] list) {
  limited_quicksort(list, 0, list.Length - 1);
}
// rearranges the array into three parts: items that are less than the pivot, the pivot, and items that are greater than the pivot
function divide(var list: array of integer; left, right, pivot: integer): integer;
var temp, i, j: integer;
begin
  temp:=list[pivot]; // swap the pivot with the last item
  list[pivot]:=list[right];
  list[right]:=temp;
  i:=left;
  for j:=left to right do
  if (list[j] < list[right]) then begin // the item is less than the pivot
    temp:=list[i]; // swap the pivot with the item at the current position
    list[i]:=list[j];
    list[j]:=temp;
    inc(i); // increment the position
  end;
  temp:=list[i]; // swap the pivot back
  list[i]:=list[right];
  list[right]:=temp;
  result:=i; // returns the pivot's index
end;

// calls Quicksort limited on a particular part of the array
procedure limited_quicksort(var list: array of integer; left, right: integer);
var pivot, new_pivot: integer;
begin
  if (right >= left) then begin // recursion condition
    pivot:=left; // chooses the pivot
    new_pivot:=divide(list, left, right, pivot);
    // recursive calls for both parts of the array
    limited_quicksort(list, left, new_pivot - 1);
    limited_quicksort(list, new_pivot + 1, right);
  end;
end;

// calls limited Quicksort on the whole array
procedure quicksort(var list: array of integer);
begin
  limited_quicksort(list, 0, length(list) - 1);
end;
# rearranges the array into three parts: items that are less than the pivot, the pivot, and items that are greater than the pivot
def divide(list, left, right, pivot)
  temp = list[pivot] # swap the pivot with the last item
  list[pivot] = list[right]
  list[right] = temp
  i = left
  left.upto(right) do |j|
    if (list[j] < list[right]) # the item is less than the pivot
      temp = list[i] # swap the pivot with the item at the current position
      list[i] = list[j]
      list[j] = temp
      i += 1 # increment the position
    end
  end
  temp = list[i] # swap the pivot back
  list[i] = list[right]
  list[right] = temp
  return i # returns the pivot's index
end

# calls Quicksort limited on a particular part of the array
def limited_quicksort(list, left, right)
  if (right >= left) # recursion condition
    pivot = left # chooses the pivot
    new_pivot = divide(list, left, right, pivot)
    # recursive calling to both parts of the array
    limited_quicksort(list, left, new_pivot - 1)
    limited_quicksort(list, new_pivot + 1, right)
  end
end

# calls limited Quicksort on the whole array
def quicksort(list)
  limited_quicksort(list, 0, list.length - 1)
end

 

 

Article has been written for you by David Jancik
Avatar
Do you like this article?
No one has rated this quite yet, be the first one!
.
Thumbnail
Previous article
Heapsort
Thumbnail
All articles in this section
Algorithms
Thumbnail
Next article
Counting sort
Activities (2)

 

 

Comments

To maintain the quality of discussion, we only allow registered members to comment. Sign in. If you're new, Sign up, it's free.

No one has commented yet - be the first!