Did you read the post? No threads were spinning while waiting for the lock to be released. The ~100 threads waiting for the lock were all waiting in an idle state, quite efficiently.
The problem was that the thread that owned the lock was spinning in a seven-instruction loop.