The Double Baggins

(This post originally appeared on my Digication site.)

As a (non-gradable) challenge for BHCC’s Data Structures course, Prof. Richmond challenged us to come up with a new sorting algorithm. This is the one I came up with.

The Holy Grail of sorting algorithms is to sort an array in linear time, that is, with O(n) efficiency. It turns out this is mathematically impossible, but hey, a man can dream, can’t he?

Because I knew I would attempt a linear-time sort, I put the condition on the sort that it would step through each array element (and not, for example, apply a divide-and-conquer approach). I also wanted to minimize memory usage, so I placed the requirement on myself that it would only do in-place swapping (and not use intermediary arrays, like the Heap Sort or Merge Sort). I also wanted to avoid recursion.

So, this algorithm would have a lot in common with the Bubble Sort (or the Selection Sort, its kissing cousin). This is the easiest algorithm to understand; unfortunately, it’s also one of the least efficient algorithms out there. It works by starting at index 0, then looping through each cell after it, and if it finds a cell that has a lower value, it swaps values so that the lower value is now in index 0. After it finishes, it moves on to index 1, and does the same thing. When the index is at the logical size of the array, your array is sorted.

The reason the bubble sort is so inefficient is because it must loop through every larger element of the array, at every index. This means the sorting happens in quadratic time, with O(n2) efficiency. Not bad when you’re sorting 100 elements, but if you want to sort 100,000 elements, things can get nasty.

What, I thought to myself, if the second loop didn’t happen? That is, with each swap, you’re guaranteed that the lowest value is in the array index, so you can immediately move forward, without having to loop through the rest of the array. This would happen if, at each point in a loop, you are guaranteed that all the values before it already have the lowest values in sorted order. You would only have to loop through the array once, making the sort a linear-time sort.

Well, that’s all well and good, but it’s a very tall order. How could I actually make that work?

My solution was this. You would have an integer variable that holds the index of the current element you’re considering (that is, the current position in the array). This would start at the beginning, and move forward. You would compare the current element, and the one in front of it. If the one in front of it is less than the current element, then there is a swap.

Once you have made a swap, you must then move backwards through the array, comparing the current element with the one ahead of it, and swapping as you go. When you no longer have to swap, you can go back to where you first swapped; you know that all the elements before that point are sorted.

Well, unbeknownst to me, a similar sorting algorithm already exists. It is called the Gnome Sort, and is one of the simplest sorting algorithms that exist in the world. It is named because it is supposedly the way garden gnomes sort flower pots. The person who came up with the name is Dick Grune, so I’ll let him describe it:

Here is how a garden gnome sorts a line of flower pots. Basically, he looks at the flower pot next to him and the previous one; if they are in the right order he steps one pot forward, otherwise he swaps them and steps one pot backwards. Boundary conditions: if there is no previous pot, he steps forwards; if there is no pot next to him, he is done.

Specifically, this algorithm is a variant called Gnome Sort with Teleportation. This variation acts a lot like the Insertion Sort. In this case, instead of the gnome stepping forward one pot, he is “teleported” to his previous position after moving a flower pot. That way, he does not need to step all the way back up to where he first made a swap, and his legs won’t get tired.

But there is another optimization technique that I use. Since we will have to move backwards through the array anyway, I figured I would optimize the sort even more: I would simultaneously work from the first and last positions. That is, I would also start at the end of the array, and move backwards; the algorithm would be the same as above, except I would be making sure the array ahead of that position is sorted in ascending order. When the high position and the low position meet, the array is sorted.

And, while we’re at it, we should compare the high and low positions against themselves. This way, we are guaranteed that when the high and low positions meet, the array is sorted completely. Otherwise, it could be the case that even though each “half” of the array is sorted, the first half could contain the larger values, and the second half could contain the smaller values. An example:

5 6 7 8 9 | 1 2 3 4

(the “|”symbol is where the high and low positions meet.)

This means that we now have to keep track of different types of swaps: first, when the low position makes a swap; second, when the high position makes a swap; and third, when the high and low positions swap values with each other. The easiest way to implement this is to have two boolean variables that keep track of whether a swap occurred, at each position, for each iteration. If it has, then we need to set the respective “teleport” positions.

One thing that is problematic is determining how efficient this algorithm actually is. It is certainly not O(n) efficiency. With the “teleport” optimization, it behaves very much like the Insertion Sort. However, because it works from both ends, it does not fall victim to the Insertion Sort’s worst-case scenario. My sort’s worst-case scenario would be a kind of “inverted pyriamid,” where the largest values are at either end of the array, and the smallest are in the center; but even in this case, it wouldn’t have to go through the entire array on each iteration (like the Bubble Sort). Its worst-case efficency is certainly a quadradic form, but I’m not sure how to determine which type of quadratic form it is.

I have run many tests, and its average-case efficency seems to be about O(n1.5). So, obviously, it doesn’t hold a candle to something like the Quick Sort, but it’s not the worst on the planet either, probably matching the efficiency of the Shell Sort.

Also, we must keep in mind that until now we have only been talking about arrays. Now let us consider linked lists. Obviously, this algorithm requires both forwards and backwards traversal, so it requires at least a doubly-linked list. On the other hand, it does not suffer from one defect that linked lists have: lack of random access. That is, simply accessing the Nth link in the list has O(N) efficiency. This algorithm does not have to actually access any specific link, other than the next/previous and teleport links (all of which are simple references, so do not require list traversal). This cuts out that entire O(N) operation. This would not be the case if you tried to sort the list using one of the other sorting algorithms. So, this algorithm seems especially suited for sorting doubly-linked lists.

So, what to call it? Nobody knows the names of any famous gnomes, but everyone knows who Bilbo Baggins is. So even though he’s a hobbit and not a gnome, I decided to name it after him. Since the sorting algorithm works from both ends, I decided to call it the Double Baggins. If you don’t get it, ask your parents. (I briefly considered calling it the Double Bilbo, but that’s too much even for me.)

Below is a Java class that demonstrates the sort. Everything should be pretty self-explanatory.

public class DoubleBaggins {
    // "low position" and "high position"
    int lp, hp;
    // "low teleport" and "high teleport" - to teleport "gnome"
    int lt, ht;
    // whether a swap occurred in left or right positions
    boolean lpSwap, hpSwap;
    // Array used to store doubles
    double[] dArray;
    // The size of the array
    int size;

    public DoubleBaggins(int size) {
        this.size = size;
        dArray = new double[size];
        // Load the array with random numbers
        for (int i = 0; i < size; i++) {
            dArray[i] = Math.random() * 100;
        }
    }

    public void sort() {
        // Initialize all variables
        lt = ht = -1;
        lp = 0;
        hp = size - 1;
        while (lp  dArray[hp]) {
                // last swap position = here, both high and low
                lt = (lt < 0 ? lp : lt);
                ht = (ht  dArray[lp + 1]) {
                // last swap position = one more than here
                lt = (lt < 0 ? lp + 1 : lt);
                lpSwap = true;
                swap(lp, lp + 1);
            }
            if (dArray[hp] < dArray[hp - 1]) {
                // last swap position = one less than here
                ht = (ht  0 ? lp - 1 : 0);
            } else {
                lp = (lt > 0 ? lt : lp + 1);
                lt = -1;
            }
            if (hpSwap) {
                hp = (hp  0 ? ht : hp - 1);
                ht = -1;
            }
        }
    }

    private void swap(int index1, int index2) {
        double temp = dArray[index1];
        dArray[index1] = dArray[index2];
        dArray[index2] = temp;
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        // Append indexes
        for (int i = 0; i < size; i++) {
            sb.append(String.format("%-5d\t", i));
        }
        sb.append('\n');
        // Append values
        for (int i = 0; i < size; i++) {
            sb.append(String.format("%-2.2f\t", dArray[i]));
        }
        sb.append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        DoubleBaggins db = new DoubleBaggins(10);
        System.out.print("\nUnsorted:\n" + db);
        db.sort();
        System.out.print("\nSorted:\n" + db);
    }
}
Advertisements

About Karl

I live in the Boston area, and am currently studying for a BS in Computer Science at UMass Boston. I graduated with honors with an AS in Computer Science (Transfer Option) from BHCC, acquiring a certificate in OOP along the way. I also perform experimental electronic music as Karlheinz.
This entry was posted in Programming and tagged , , . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s