There are "sorting algorithms" faster than O(n log n); that limitation only applies to comparison sorts. These typically have complexity depending on the keyspace (for example, counting-sort has linear complexity in whichever is greater, the size of the input list or the keyspace).
They're also typically kind of useless, as in this example, because we rarely care to sort an arbitrary list of integers with no metadata. Radix-sort is a rare exception, in that it can easily be modified to carry metadata, hence people actually using it in some applications.
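To make that concrete, here is a minimal counting-sort sketch (Python; the function name and example data are mine, not from the thread). It buckets (key, payload) pairs in O(n + k) time, and the payload, i.e. the metadata, simply rides along with its key:

    def counting_sort(pairs, key_space):
        # Stable counting sort of (key, payload) pairs.
        # Runs in O(n + k) time, where n = len(pairs) and k = key_space:
        # one pass to drop each item into its key's bucket, one pass over
        # the key space to read them back out in order.
        buckets = [[] for _ in range(key_space)]
        for key, payload in pairs:              # O(n)
            buckets[key].append((key, payload))
        out = []
        for bucket in buckets:                  # O(k) buckets, O(n) items in total
            out.extend(bucket)
        return out

    # Sort people by age without losing their names (the "metadata").
    people = [(31, "Ada"), (25, "Bob"), (31, "Cid"), (19, "Dee")]
    print(counting_sort(people, key_space=128))
    # [(19, 'Dee'), (25, 'Bob'), (31, 'Ada'), (31, 'Cid')]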
"They're also typically kind of useless, as in this example, because we rarely care to sort an arbitrary list of integers with no metadata."
O(n) sorting algorithms are only impractical if the size of the keyspace dwarfs the number of elements to be sorted. In general a key is a variable-length string of bits, which makes for an exponentially-sized keyspace, so you want to stick with comparisons of element pairs.
But "metadata" has nothing to do with it. The last bead in each BeadSort row can contain a sticker with some metadata. That kind of trick generalizes to all sorts.
Radix sort is actually not linear. The algorithm only works if the word length w >= log(n) (because otherwise the keys can't even take n distinct values), so O(nw) is practically the same thing as O(n log(n)).
That's certainly true in general, but by choosing a fixed word size (e.g. 16-, 32- or 64-bit integers) you can preselect w so that the runtime is predictably O(n). I can't see a way of doing that with most other O(n log(n)) sorting algorithms. Additionally, the constant factor is typically low compared to other algorithms.
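For illustration, roughly what I mean, as an LSD radix sort over hard-coded 32-bit keys (Python; 8-bit digits, function name is mine). Once w is pinned at 32, the whole sort is four stable bucket passes, so the item count n only enters linearly:

    def radix_sort_u32(keys):
        # LSD radix sort for unsigned 32-bit integers: w is fixed at 32,
        # processed as four 8-bit digits, so the work is 4 stable passes
        # over n items plus 4 * 256 bucket slots -- O(n) once w is fixed.
        for shift in (0, 8, 16, 24):
            buckets = [[] for _ in range(256)]
            for k in keys:
                buckets[(k >> shift) & 0xFF].append(k)
            keys = [k for bucket in buckets for k in bucket]
        return keys

    print(radix_sort_u32([0xDEADBEEF, 42, 7, 0xFFFFFFFF, 1000]))
    # [7, 42, 1000, 3735928559, 4294967295]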
Restricting the size of w (and therefore also n) does not make it linear.
Let's say you have restricted w to 32. Then you also know that n <= 2^32. If you now, for instance, use mergesort, then you also know that the recursion depth is at most 32 (since you split the input in half each time), i.e. it executes in at most 32n operations. So by that logic mergesort would also be linear.
No, in the merge sort example it's not 32n operations but a maximum of 32 levels of splits; the number of comparisons, which dominates the runtime, is still k * n * log(n). Not so for radix sort, because there are no direct comparisons. By choosing the word length in advance (it's often hard-coded), the number of items to sort affects the runtime only linearly.
I think you are mixing runtime scalability with the actual complexity. Big-O definition from Wikipedia: Let f(x) and g(x) be two functions defined on some subset of the real numbers. One writes
f(x) = O(g(x)) as x -> infinity
if and only if, for sufficiently large values of x, f(x) is at most a constant times g(x) in absolute value. That is, f(x) = O(g(x)) if and only if there exists a positive real number M and a real number x0 such that
|f(x)| <= M|g(x)| for all x > x0.
In the above case M = 32, f = mergesort and g(x) = x, because no matter how large an input you have, mergesort will always take at most 32n operations (since log(n) <= 32).
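Spelled out, the step I'm relying on is just that definition applied under the assumed bound (in TeX notation; the n * ceil(log2 n) comparison bound for mergesort is the standard one):

    % merge sort performs at most n \lceil \log_2 n \rceil comparisons;
    % under the artificial restriction n \le 2^{32} this gives
    f(n) \le n \lceil \log_2 n \rceil \le 32\,n,
    \quad\text{so } |f(n)| \le M\,|g(n)| \text{ with } M = 32,\ g(n) = n,
    \text{ i.e. } f(n) = O(n).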
Is there any theory on analyzing the sorting of real-world objects? Like playing cards.
In a computer, picking the n'th item in an array is O(1); in the physical world it's O(n). In a computer, inserting an item is O(n), but in the physical world it's O(1).
Are there any good algorithms for sorting playing cards?
I think insertion sort tends to be the most natural sort for things like sorting playing cards. Anything else seems like it would be more of a hassle than it would be worth in real life.
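Roughly what that habit looks like written down (Python; the numbers stand in for card ranks):

    def insertion_sort(cards):
        # Sort a hand the way people do at the table: take each new card
        # and slide it left past anything bigger until it fits.
        hand = list(cards)
        for i in range(1, len(hand)):
            card = hand[i]
            j = i - 1
            while j >= 0 and hand[j] > card:
                hand[j + 1] = hand[j]   # shift the bigger card one slot right
                j -= 1
            hand[j + 1] = card
        return hand

    print(insertion_sort([7, 2, 11, 4, 9]))  # [2, 4, 7, 9, 11]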
If you have to represent each value in unary, how can you even set up the input and read the results in less than O(N^2), much less carry out the "fall"?
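For reference, the usual way to simulate bead sort makes that cost explicit (Python; this is the standard count-beads-per-column formulation, nothing specific to the article): just laying the beads out is O(sum of the values), which for n values on the order of n is already O(n^2).

    def bead_sort(values):
        # Simulated bead sort ("gravity sort") for non-negative integers.
        if not values:
            return []
        # beads[j] = number of rows with a bead in column j, i.e. values > j
        beads = [0] * max(values)
        for v in values:                # placing the beads: O(sum(values))
            for j in range(v):
                beads[j] += 1
        # After "gravity", row i from the bottom holds one bead for every
        # column whose stack is taller than i; the bottom rows are the
        # largest values, so reverse to get ascending order.
        rows = [sum(1 for c in beads if c > i) for i in range(len(values))]
        return rows[::-1]

    print(bead_sort([5, 3, 1, 7, 4]))  # [1, 3, 4, 5, 7]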