There are "sorting algorithms" faster than O(n log n); that limitation only applies to comparison sorts. These typically have complexity depending on the keyspace (for example, counting-sort has linear complexity in whichever is greater, the size of the input list or the keyspace).
They're also typically kind of useless, as in this example, because we rarely care to sort an arbitrary list of integers with no metadata. Radix-sort is a rare exception, in that it can easily be modified to carry metadata, hence people actually using it in some applications.
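To make that concrete, here is a minimal counting-sort sketch (Python; the function name and example data are mine, not from the thread). It buckets (key, payload) pairs in O(n + k) time, and the payload, i.e. the metadata, simply rides along with its key:

    def counting_sort(pairs, key_space):
        # Stable counting sort of (key, payload) pairs.
        # Runs in O(n + k) time, where n = len(pairs) and k = key_space:
        # one pass to drop each item into its key's bucket, one pass over
        # the key space to read them back out in order.
        buckets = [[] for _ in range(key_space)]
        for key, payload in pairs:              # O(n)
            buckets[key].append((key, payload))
        out = []
        for bucket in buckets:                  # O(k) buckets, O(n) items in total
            out.extend(bucket)
        return out

    # Sort people by age without losing their names (the "metadata").
    people = [(31, "Ada"), (25, "Bob"), (31, "Cid"), (19, "Dee")]
    print(counting_sort(people, key_space=128))
    # [(19, 'Dee'), (25, 'Bob'), (31, 'Ada'), (31, 'Cid')]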
"They're also typically kind of useless, as in this example, because we rarely care to sort an arbitrary list of integers with no metadata."
O(n) sorting algorithms are only impractical if the size of the keyspace dwarfs the number of elements to be sorted. In general a key is a variable-length string of bits, which makes for an exponentially-sized keyspace, so you want to stick with comparisons of element pairs.
But "metadata" has nothing to do with it. The last bead in each BeadSort row can contain a sticker with some metadata. That kind of trick generalizes to all sorts.
Radix sort is actually not linear. The algorithm only works if the word length w >= log(n) (because otherwise the keys can't even take n distinct values), so O(nw) is practically the same thing as O(n log(n)).
That's certainly true in general, but by choosing a fixed word size (e.g. 16-, 32- or 64-bit integers) you can preselect w so that the runtime is predictably O(n). I can't see a way of doing that with most other O(n log(n)) sorting algorithms. Additionally, the constant factor is typically low compared to other algorithms.
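For illustration, roughly what I mean, as an LSD radix sort over hard-coded 32-bit keys (Python; 8-bit digits, function name is mine). Once w is pinned at 32, the whole sort is four stable bucket passes, so the item count n only enters linearly:

    def radix_sort_u32(keys):
        # LSD radix sort for unsigned 32-bit integers: w is fixed at 32,
        # processed as four 8-bit digits, so the work is 4 stable passes
        # over n items plus 4 * 256 bucket slots -- O(n) once w is fixed.
        for shift in (0, 8, 16, 24):
            buckets = [[] for _ in range(256)]
            for k in keys:
                buckets[(k >> shift) & 0xFF].append(k)
            keys = [k for bucket in buckets for k in bucket]
        return keys

    print(radix_sort_u32([0xDEADBEEF, 42, 7, 0xFFFFFFFF, 1000]))
    # [7, 42, 1000, 3735928559, 4294967295]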
Restricting the size of w (and therefore also n) does not make it linear.
Let's say you have restricted w to 32. Then you also know that n <= 2^32. If you now, for instance, use mergesort, then you also know that the recursion depth is at most 32 (since you split the input in half each time), i.e. it executes in at most 32n operations. So by that logic mergesort would also be linear.
No, in the merge sort example it's not 32n operations but a maximum of 32 levels of splits; the number of comparisons, which dominates the runtime, is still k * n * log(n). Not so for radix sort, because there are no direct comparisons. By choosing the word length in advance (it's often hard-coded), the number of items to sort affects the runtime only linearly.
I think you are mixing runtime scalability with the actual complexity. Big-O definition from Wikipedia: Let f(x) and g(x) be two functions defined on some subset of the real numbers. One writes
f(x) = O(g(x)) as x -> infinity
if and only if, for sufficiently large values of x, f(x) is at most a constant times g(x) in absolute value. That is, f(x) = O(g(x)) if and only if there exists a positive real number M and a real number x0 such that
|f(x)| <= M|g(x)| for all x > x0.
In the above case M = 32, f = mergesort and g(x) = x, because no matter how large an input you have, mergesort will always take at most 32n operations (since log(n) <= 32).
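Spelled out, the step I'm relying on is just that definition applied under the assumed bound (in TeX notation; the n * ceil(log2 n) comparison bound for mergesort is the standard one):

    % merge sort performs at most n \lceil \log_2 n \rceil comparisons;
    % under the artificial restriction n \le 2^{32} this gives
    f(n) \le n \lceil \log_2 n \rceil \le 32\,n,
    \quad\text{so } |f(n)| \le M\,|g(n)| \text{ with } M = 32,\ g(n) = n,
    \text{ i.e. } f(n) = O(n).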
Is there any theory on analyzing the sorting of real-world objects? Like playing cards.
In a computer, picking the n'th item in an array is O(1); in the physical world it's O(n). In a computer, inserting an item is O(n), but in the physical world it's O(1).
Are there any good algorithms for sorting playing cards?
I think insertion sort tends to be the most natural sort for things like sorting playing cards. Anything else seems like it would be more of a hassle than it would be worth in real life.
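Roughly what that habit looks like written down (Python; the numbers stand in for card ranks):

    def insertion_sort(cards):
        # Sort a hand the way people do at the table: take each new card
        # and slide it left past anything bigger until it fits.
        hand = list(cards)
        for i in range(1, len(hand)):
            card = hand[i]
            j = i - 1
            while j >= 0 and hand[j] > card:
                hand[j + 1] = hand[j]   # shift the bigger card one slot right
                j -= 1
            hand[j + 1] = card
        return hand

    print(insertion_sort([7, 2, 11, 4, 9]))  # [2, 4, 7, 9, 11]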
If you have to represent each value in unary, how can you even set up the input and read the results in less than O(N^2), much less carry out the "fall"?
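For reference, the usual way to simulate bead sort makes that cost explicit (Python; this is the standard count-beads-per-column formulation, nothing specific to the article): just laying the beads out is O(sum of the values), which for n values on the order of n is already O(n^2).

    def bead_sort(values):
        # Simulated bead sort ("gravity sort") for non-negative integers.
        if not values:
            return []
        # beads[j] = number of rows with a bead in column j, i.e. values > j
        beads = [0] * max(values)
        for v in values:                # placing the beads: O(sum(values))
            for j in range(v):
                beads[j] += 1
        # After "gravity", row i from the bottom holds one bead for every
        # column whose stack is taller than i; the bottom rows are the
        # largest values, so reverse to get ascending order.
        rows = [sum(1 for c in beads if c > i) for i in range(len(values))]
        return rows[::-1]

    print(bead_sort([5, 3, 1, 7, 4]))  # [1, 3, 4, 5, 7]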