I dunno, the first example seems unsatisfying. The original has that ugly condition, the "good" version seems overly clever. And for all the talk of taste and aesthetics, both versions ignore an elephant in the room: defensively dealing with `entry` being absent from the list.
Not that I've never abused addresses like this. But having written this multiple times, I currently prefer something like this:
remove_list_entry(entry)
{
    if (head == entry) {
        head = head->next;
        return;
    }
    for (prev = head; prev->next; prev = prev->next) {
        if (prev->next == entry) {
            prev->next = prev->next->next;
            return;
        }
    }
}
There's still an `if`, but it isn't so bad since it's an early exit. What is "tasteful" about hiding the fact that one of the cases is far simpler than the other? There's conditionals and then there's conditionals. There's cases and then there's cases.
The other benefit of this approach: you eliminate segfaults by construction, because the `while` has been replaced with a much more normal `for` loop iterating through the list. Keeping it normal is 80% of the secret to avoiding security holes.
Rather than appealing to something nebulous like "taste", I prefer to focus on concrete functional benefits. What is a situation where someone using or modifying this function might be subtly led astray? That seems a better guide.
I don't think anyone's mentioned that there are two functional benefits to Linus's version: testability and performance.
The author touched on testability. Dropping the "if" case eliminates one code path that may contain bugs.
But also, the revised code may perform a lot better if you're searching a lot of short lists. In that case, the target item has a substantial chance of being first in the list, though well short of a certainty. That kills the predictability of the "if" branch. Eliminating it can save ten cycles or so, possibly shaving 25% or so off the runtime of this function for short lists. (For large lists, the branch becomes predictable, and you start having cache issues due to the list's linked nature.)
My current side-project is an NES emulator. Nearly all my (non-algorithmic) performance enhancements have been of this nature: find some unpredictable branch nested inside a tight loop → eliminate the branch with "clever" logic → 25% performance gain on that loop.
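For instance, in the spirit of those loops (a hypothetical illustration, not code from my emulator): updating a 6502-style zero flag with and without a branch.

#include <stdint.h>

#define FLAG_ZERO 0x02  /* hypothetical zero-flag bit */

/* Branchy version: an unpredictable if inside the hot loop. */
static void set_zero_flag_branchy(uint8_t *status, uint8_t value)
{
    if (value == 0)
        *status |= FLAG_ZERO;
    else
        *status &= (uint8_t)~FLAG_ZERO;
}

/* Branchless version: (value == 0) evaluates to 0 or 1, which
   scales the flag bit, so there's no conditional jump at all. */
static void set_zero_flag_branchless(uint8_t *status, uint8_t value)
{
    *status = (uint8_t)((*status & ~FLAG_ZERO) | (FLAG_ZERO * (value == 0)));
}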
(That there are no checks for "error" cases is a distraction. There aren't any in the original example either, because it's an example and the error cases aren't relevant to Linus's point.)
I agree partially w.r.t. testability of branchless code, but there's a flipside too: often the root cause of an edge case is in data-structure shape, so full coverage isn't necessarily sufficient. In other words, if an edge case is "folded into" the normal path via something like Linus' trick, you still need to actually test the case where (in this example) `head` is being removed, or else you aren't certain that the folding of cases is valid. And if you're testing that case, then you'll have full coverage of the original (branchy) code too.
Totally agree w.r.t. performance, though. Even if the branch is never taken, control flow can reduce the scope of other compiler optimizations.
Yes, this is a good point. I had thought of this too and somehow convinced myself that the "head" case disappeared (since the non-"head" case would fail if "indirect" weren't initialized properly), but I'll admit that's a shaky argument.
This comment thread was refreshing to read: it was technically enlightening, devoid of malice, with each party conceding points while expanding on and clarifying the places where their argument broke down or wasn't clear.
Something we can all strive for professionally and while commenting on HN.
I'd say branch mispredictions and cache misses are the two biggest factors in performance. Almost everything else is noise in comparison. I keep saying: if your architecture doesn't account for these, you've wasted 90% of the CPU, guaranteed.
I use the exact same techniques in my game engine (sadly proprietary) - even in JavaScript it's dead easy to run circles around both Three.js and Babylon.js in terms of performance, to the point that I'm simply amazed at how fast JavaScript VMs can be if you really push them.
Eliminating branches, cache misses and allocations in the main loop yielded by far the biggest performance boost. (For the curious, we abused ArrayBuffers to guarantee contiguous memory allocations in JavaScript)
I'm not a low-level programmer, so this is an honest question.
In the better version, why doesn't the while condition (which must be tested at least once) present predictability problems, whereas the if check in the parent does?
However, if I'm not mistaken, branches are costly only when they're mispredicted. Hence the question (as I see it) is whether one can better predict the behavior of Linus' better version (which has only one conditional) or of the one presented above (where there's an early-return condition followed by the while condition).
It seems to me that the version above (which has two conditionals but is not like Linus' worse version because it has an early return) should be similarly costly to predict. Consider the situations where either most of the calls are to remove the first element (A) or most of the calls aren't (B). Let's see what happens in these cases:
- On Linus' code, the while is evaluated the first time. In case A, the while is correctly predicted to exit, the swap is made, and the function ends. In case B, where the while yields true many times before hitting our element, you have to pay the cost of a misprediction when you finally find it.
- In the early-return code above, the situation is... the same. In case A the early-return if is correctly predicted to yield true, and the function ends. In case B, the early-return if is correctly predicted to yield false, and then the while predicts mostly trues until you finally hit your element, at which point you pay the same misprediction penalty as above.
So... aren't both functions similarly expensive in terms of branch misprediction costs? Am I fundamentally wrong on my understanding on this issue?
Where is that early return you mention? I don't see the return keyword used in either example. They both do the exact same while loop, but Linus' version adds a level of indirection to save a conditional later on. You're still touching the same memory in both cases, so they should be identical in terms of cache misses.
From my understanding, the first version has two possible branch-misprediction points while Linus' version has only one. This will probably only have a visible impact if the function is called in a loop, however.
But to me the biggest advantage of the 2nd variant is its simplicity. This is the only way to stay sane with a growing codebase and keep shipping robust code.
It's no use having a fast function if you can't integrate it optimally with the rest of the codebase. Raw performance isn't won at the micro level but at the macro one. This is where simplicity becomes critically important.
When every branch removed on a mainline path in the kernel saves years of compute time globally, and every byte removed in the kernel saves terabytes of memory globally, being clever is absolutely the right thing to do. Removing that one branch takes away an expensive pipeline stall[0]. The version akkartik has above adds an extra branch (and expensive pipeline stall) to _each iteration_ of the loop; Linus would probably not be pleased, to put it mildly, if anyone submitted a patch like that.
And the method Linus shows isn't even particularly clever; it's been a common example for handling linked list insertion in C for decades. All these "Linus is being too clever" postings show is that most programmers aren't cut out to be kernel or embedded programmers.
Taste is malleable and relative. One can even entertain different notions of taste. I personally have at least two – "understandable" and "performant" – and they aren't always (but sometimes are) exclusive of each other.
I suspect Linus's "taste", given Linux's problem domain, is tuned to some combination of performant + testable + reviewable + portable. In the example given, Linus's rewrite meets those requirements; hence his assertion that it is in better taste.
I feel that at this point the notion of 'good taste' has become even more ill-defined, something very personal. That makes the notion not very useful as a coding guideline.
There isn't a domain where taste isn't personal. It's always a blend of art and science. Just because you can't define it exactly, though, doesn't mean you can't get closer to it by working at it.
Are you seriously arguing that Linus would think cleanly performant code is not in good taste? You're welcome to ask him that directly and see what he says.
EDIT: Okay, I'm being excessively snarky there; my apologies.
I will say instead that "good taste" depends on the context. What's "good taste" for kernel development isn't necessarily the same as "good taste" for enterprise development, or the same as "good taste" for app development.
As a rookie programmer (but 10 years in IT, loving maths), this was my thought exactly. Basically a matter of field/domain definition, insofar as humans still have to interact with the code.
In short: most enterprise code should be elegant and simple, even if verbose (many conditionals, as little nesting as possible, etc.), so that a 12-year-old can understand it, because in real life you'll get a lot of turnover and a lack of means/skill/motivation. Hence costly maintenance, etc.
Writing an open-source kernel consistently for 20+ years is an entirely different problem, and the fact that both examples involve code is only a distant commonality between the problem sets. It is my experience, and thus my contention, that the human factor and real-world conditions are paramount to ascribing qualitative assertions such as 'good' or 'tasty'. That is, until machines program themselves or others without human oversight. ;)
Yes, of course, I'm certainly not cut out to be a kernel programmer. Were the audience in that TED chat kernel programmers? I thought we were discussing "taste", not what it takes to get a patch into the kernel. Linus does work on other projects, you know.
I prefer the shorter version for being shorter, and for generalizing the update point at the end (and giving it a name), which stresses that there will be an update (delete).
Alas, most of my coworkers would prefer the "explicit" version with all of the extra lines.
Aside: I sure miss identifiers with underscores in them, rather than MashedTogetherEnterpriseSelfCrockumentingMego :-(
I agree with the condition about "If you're at BigCo". It really depends on who's going to read/edit the code, how often, and how many times it's going to be executed.
For a line of code that's going to be executed on a global scale billions of times every second, and only changed (very rarely) by extremely good developers, then I'd happily trade a bit of clarity for a bit more performance.
But if it were an oft-edited method in an in-house application that's going to be supported by junior devs? I would prefer clarity.
As always, there are no universal rules, only tradeoffs.
If you understand C pointers, this is obvious: it keeps track of where the pointer to the current node came from, and replaces it with a pointer to the next node once it reaches the entry to be removed.
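For readers who haven't seen it, here's a minimal sketch of the shape being discussed (my reconstruction; it assumes a global `head`, a minimal `node` type, and the precondition that `entry` is present):

typedef struct node { struct node *next; } node;
static node *head;

void remove_list_entry(node *entry)
{
    /* indirect holds the address of the pointer that points at the
       current node: initially &head, afterwards some &prev->next */
    node **indirect = &head;

    /* walk until *indirect is the entry to remove */
    while (*indirect != entry)
        indirect = &(*indirect)->next;

    /* splice out the entry by rewriting whatever pointed at it */
    *indirect = entry->next;
}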
I agree, in fact taste might rightfully be defined as that which is obvious to an experienced developer in the environment. Code which simply relies on conventions and avoids complexity is often immediately elegant.
This is a great point, but I'm trying not to think disparagingly about the skill level of future maintainers, either: deadlines and distraction are a powerful negative equalizer.
The next person to touch that code could be you, two years from now, after you've mostly moved on to another project / major upgrade and are back in a hurry to make a “quick” change, investigate a security report, deal with a scaling issue, etc.
For heaven's sake, everyone, please stop getting hung up on the fact that there's no error checking. Example code usually doesn't show error checking because it obscures the main point of the example.
(which, for kernel code, would be turned off later, anyway)
In languages other than C, with exceptions and stack traces, most precondition checks just become clutter, anyway. Although it would be nice to have "Eiffel" style pre/post-condition (invariant assertion) blocks out-of-line that can be enabled or disabled...
SPARK does that, with efficient systems code, and then proves the code free of the errors common in C. Other tools can generate unit tests from the specs, just in case. Frama-C does something similar for C; I'm not sure what the usability or thoroughness is like vs SPARK, though.
Totally agree, never play code golf with your production code. Your cleverness will bite you in the arse when you can't be promoted or allowed to move to a new project, because your code is so clever that you're the only one who can properly maintain it.
But I think that's part of the argument. Reasoning about edge cases in code full of nested loops and conditionals is hard. Eliminating branches and opportunities to hide fencepost errors (for instance) goes some way towards reducing cognitive noise so you can focus on the meat.
Of course that's not the full argument. And I'll throw in one more: it isn't anywhere near the top of the list in terms of importance, but aesthetically pleasing, elegant code has a certain amount of value, if only to other programmers. (Naturally, it has to be correct, too.)
More likely, because there are fewer code paths to follow. Given Linus's explanation, I think that's part of the basis of his "taste", not conciseness (which in this example I suspect is a byproduct).
Define obvious. To me, at least, the pointer-pointer version is obvious for this task.
That said, pointer-pointers can't be applied everywhere. If you try to reverse a linked list in O(n) time with O(1) space using a pointer to a pointer, the code would be less obvious than using a prev pointer.
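For example, a sketch of that reversal with a prev pointer (assuming a minimal `node` type):

#include <stddef.h>

typedef struct node { struct node *next; } node;

/* Reverse a singly linked list in O(n) time and O(1) space. */
node *reverse(node *head)
{
    node *prev = NULL;
    while (head) {
        node *next = head->next;  /* save the rest of the list   */
        head->next = prev;        /* point the current node back */
        prev = head;              /* advance prev...             */
        head = next;              /* ...and head                 */
    }
    return prev;                  /* prev is the new head */
}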
> You can always go back and make it look pretty safely with good tests.
Yeah, but you won't. I think that's part of the reason why people care about "taste". I want to maximize the chances that essentially the first draft is good enough to be maintained for as long as possible, because years of hard-won experience has taught us that it will.
> Yeah, but you won't. I think that's part of the reason why people care about "taste".
Ahh, but you eventually will (at least in my experience). If you never need to, the code either doesn't matter (dead code) or works just fine.
I think I have looked at my own company's entire code base a couple of times. Do I see nasty crap? All the time. And I often fix it for good taste when it is really bad, but I have to say the value proposition isn't very good compared to adding tests or other automated validation (including automated profiling and dead-code analysis).
Kernel code and in particular C are sort of special cases because they are hard to test and performance is generally a high concern. But for many other higher level languages and problem domains this is not the case.
I have been meaning to look at how much code bases change over time. In the case of Linux I wonder how much of the kernel code has stayed the same (let's say over a 5-year period). Probably not a ton, but I bet in other domains, particularly business software, the code either just dies or changes dramatically over time.
Then there are, I suppose, other domains where the code really can't physically change often (i.e. space probes and embedded devices)... for those systems I really, really hope they have lots of tests.
Counterpoint: test code is also code, and when unit-test tyrants rule a team it can easily break down into a mess of maintenance (which suddenly more than doubles).
Speaking of good "taste", you could apply your "taste" to deciding which tests are worthwhile and which just add maintenance. The fact of the matter is that you need some sort of testing, or some sort of proof that your code works, on a continuous basis; if you don't have that, I'm not going to call the code good, particularly when "good" rests solely on one person's opinion of what good-looking code is. Aesthetics rank pretty low on my list compared to performance, readability, and automated testing. You may say aesthetics brings those characteristics, and it might, but let's have some tests to prove it.
As for test code being a maintenance burden or invalid: it doesn't have to be as buggy, and in fact it can even be written in a data-driven style, or declaratively. Besides, if it is a true unit test, you should have written the test yourself. If it's your code and you have such good "taste", shouldn't it not "break down into a mess"?
There are plenty of projects, like SQLite and jQuery, that have excellent "taste" not just because of the quality of the code but because of the test coverage. Oh, and I won't go downvote people who disagree (as my parent seems to have done), because I hold that to be poor "taste", IMO.
Another issue both versions have is that they will not tolerate head being NULL.
Don't be afraid of double pointers. Although they may seem overly clever at first glance, they can be really helpful in situations like this. Linus's code is considerably shorter, and not particularly difficult to debug or understand, assuming you understand double pointers.
Another case where double pointers can be really useful is in the interface for linked lists. I.e.
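Something along these lines (a sketch; the names are illustrative):

typedef struct node { struct node *next; } node;

/* Each operation takes a pointer to the head pointer, so it can
   update the caller's head in place - including setting it to NULL
   when the last remaining entry is removed. */
void list_insert(node **head, node *entry);
void list_remove(node **head, node *entry);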
Defining the interface that way allows properly removing the last entry from the list. Workarounds like making a list struct and passing a pointer to that are just double pointers in disguise. Returning the new head is error prone, since you can easily forget to reassign it.
An early exit is not the worst thing in the world, but an edge case is still an edge case. And if you add full "end of list" checks, you make it even worse, because now you're doing that in two places too. And if you do something with the entry, now that's in two places.
Sure, the case where it's the head of the list is simpler by itself. But treating it specially means more code, and more duplication, and it's not even faster.
To criticize the cleverness of Linus' code is fine. But the algorithm is the argument here, not the specific way he wrote it.
Edit: Also, I don't really buy your argument about for loops. You trade one problem for another. Your version won't segfault (except in the case where it does), but it will silently fail to remove anything.
1. "The algorithm" is the same in all cases here: a linear scan with a look-behind of 1. We're just talking about how the algorithm gets translated into code.
3. Yes, I don't do any error handling and neither does Linus. I wasn't trying to bring up the importance of error handling. I was trying to bring up the importance of avoiding undefined behavior. "Segfault" was shorthand for that. Dereferencing NULL is undefined behavior. I mostly don't care about "taste", but showing undefined behavior while talking about taste seems egregious.
4. Finally, I don't get nearly as hung up about "duplication" as most programmers: http://www.sandimetz.com/blog/2016/1/20/the-wrong-abstractio.... I realize this is still a minority opinion, but we own the future. (This comment has a bunch of duplicated 'e's, but we both agree that it's unimportant to "DRY" those out. Your concerns strike me as similar. When you try to minimize the number of occurrences of "x != NULL" you're doing Huffman compression, not programming.)
This version is more clear for a human, but it does contain duplicate logic.
This line:
head = head->next;
is another form of this line:
prev->next = prev->next->next;
Both delete by setting "->next". Code duplication is a common source of bugs, and that's the rationale behind Linus's version, which eliminates duped code.
Not only is it duplicated, it also invites nasty behavior: if prev->next equals NULL, then dereferencing it would result in a segfault, so in that case another if statement is required.
Like Linus' example, this is just an example too, and does not cover or comment on all the issues, so there are still elephants lurking in the shadows. Just to name a few:
- Can entry be NULL? (segfault)
- Can head be NULL (list is empty)? (segfault)
Since the Linus example has the apparent precondition that the entry to be removed must be present in the list, there are no segfaults with proper API use. Since your code tries to eliminate that precondition, I'd argue "remove NULL from the list" and "remove something from an empty list" are cases to expect, but you didn't account for them.
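For completeness, a sketch of what actually accounting for those cases might look like (illustrative only; assumes the same global `head`):

#include <stddef.h>

typedef struct node { struct node *next; } node;
static node *head;

void remove_list_entry(node *entry)
{
    if (entry == NULL || head == NULL)
        return;                    /* NULL entry, or empty list */
    if (head == entry) {
        head = head->next;
        return;
    }
    for (node *prev = head; prev->next; prev = prev->next) {
        if (prev->next == entry) {
            prev->next = entry->next;
            return;
        }
    }
    /* entry wasn't in the list: ignore, or assert, as policy dictates */
}

Nobody would put all of that on a slide, which is rather the point.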
Point being: Don't be overly critical of examples; they are just examples.
I guess my fundamental criticism is this: how is a list traversal that doesn't check for the end of list being held up as an example of "good taste"?
You're right that they're all just examples. I'm just a random guy on the internet, and it's easy to push back against me. But we all have the tendency to nod along when someone we respect says something sagely on TED. It seemed useful to point out that the sage has no clothes on in this case. So go ahead, point at my skimpy cladding as well. I won't mind.
> If entry is null and head is valid segfault on line 10.
To get to line 10, we need prev->next == entry. Since we are talking about the case where entry == NULL, that requires prev->next == NULL.
The for loop condition is prev->next, which will fail if prev->next == NULL, hence line 10 would never be reached in that case.
The difficulty of analysing all of this shows why mutable structures and variables make code hard to follow - perhaps the real good taste solution is to use something like finger trees instead of linked lists, together with a language which eliminates null.
I agree with you that moving the conditional to the beginning as an early exit keeps things simple, and while I don't see anything unusual about a 'while' loop and feel they have their place, this 'for' definitely works well.
I would make one small tweak, replacing this line
prev->next = prev->next->next;
with
prev->next = entry->next;
Just makes it a bit more straightforward to understand (as well as infinitesimally more efficient).
Totally, it's not about the one right way to do something. I do think there's some value in putting thought into how we write... maybe "clean" code would be a less charged word than "tasteful".
> both versions ignore an elephant in the room: defensively dealing with `entry` being absent from the list.
Indeed, both versions really are in poor taste.
Both are code snippets which can only work under very restrictive assumptions.
The main `head` argument is implied, and the `entry` to be removed must be present.
Such a piece of code has to be duplicated and tuned for every list and use case.
I even wonder why Linus bothered to show this example, while he leads one of the most famous software masterpieces.
This isn't an API, though. 'head' is fine being implied, and if 'entry' doesn't exist, why is it being passed in? That's a further code 'smell' to follow up on.
Sure, you need to be massively defensive if you're writing an API or boundary, but within an owned codebase, if you're eliminating edge-cases, you should follow them all the way up.
tl;dr if you're calling remove_list_entry, only do so if you have an entry.
Of course, I'm not Linus, so I have no idea of his thought process.
Oh, absolutely. I have a long history of preferring code to be safe rather than fast[1]. In this case, for any project I had a say in, you'd have to convince me with concrete measurements that this particular loop was on the critical path, and then you'd have to write a dozen tests to convince me that it was being used correctly without the termination check, and that any future users would get unambiguous errors in debug mode. And after all that, you wouldn't be allowed to claim that taking away one of the conditions is "tasteful".
I actually tend to be pretty lax about coding style, believe it or not. But dereferencing NULL is undefined behavior.
The "bad taste" program of Linus performs two comparisons if it want to remove the first entry of the list. The "good taste" program of Linus, like your program perform only the needed number of comparisons.
For me, your program is similar to the one of Linus where you have unrolled the first iteration of the loop in order to avoid pointer syntax difficulties. Your program makes it also trivial to fix what you called the elephant. I think Linus has omit it for simplicity of example.
I like your program slightly better than the one of Linus because the syntax is less tricky, but I think the point of Linus was only the superfluous comparison in the "bad" one.
I didn't know that whiles were less normal than fors.
And though I don't object to for loops in this situation, I really had to puzzle yours out, because of the way your loop pointer is one behind the place of interest.
I believe you when you say it is segfault-free, but again I would have to puzzle at the code to be sure.
I think I'd be ok with a `while` that looked like this:
while (prev->next) {
    if (prev->next == entry) break;
    prev = prev->next;
}
or even this:
while (prev && prev->next != entry)
    prev = prev->next;
Both are reasonable enough that they wouldn't trigger my (pretty laissez faire) sense of "taste". My core point: checking for the end of the list should be non-negotiable.
The `for` is just extra icing: it creates a scope for `prev`; you can tell at a glance that it's traversing a list. The update is closer to the termination check. But yes, this bit is negotiable.
There's a compromise enterprise developers should recognize from CS classes that's almost elegant enough for the kernel crowd: use a sentinel node for the element before the head. The edge case is eliminated, and no more &(*)->; ASCII Cthulhu.
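A sketch of how that might look (illustrative; it assumes every traversal starts from the sentinel):

typedef struct node { struct node *next; } node;

/* A dummy node sits permanently before the first real element, so
   the real head is always sentinel.next and removing the first
   entry is no longer a special case. */
static node sentinel;

void remove_list_entry(node *entry)
{
    for (node *prev = &sentinel; prev->next; prev = prev->next) {
        if (prev->next == entry) {
            prev->next = entry->next;
            return;
        }
    }
}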
Really, if you think the good version is overly clever, you probably have a very fundamental problem with your abstract reasoning. A very human one, as history is full of people having that problem--and ultimately, the good version taking over anyway.
People for a long time had strong objections to a number "0", because it doesn't make sense to have a number for nothing! Why not just have no number at all! 0 is not a number, it's a special case! Wrapping it up in arithmetic like that is just overly clever! After all, 0 is not a number!
The whole problem with that is the preconceived and completely unjustified notion that 0 is not a number. Just accept that 0 is a number, and suddenly, things become much easier to reason about, much more elegant to work with.
Similarly, the whole problem here is that you have the preconceived and completely unjustified idea that removing the first element is a special case. You can define it to be, just as you can define 0 to be a special case. But you could just as well simply let go of that preconceived idea and accept that it's not, and suddenly all the problems that you imagine are there disappear, and you have simpler code, just as having integer types with a zero makes for simpler code.
".. if you think the good version is overly clever, you probably have a very fundamental problem with your abstract reasoning.."
Just to show that you're talking out your ass, here's a code snippet where I did the address of address thing in precisely the same "remove from linked list" situation. In a VM and programming language and operating system that I wrote from scratch[1].
Moreover, I lived with it for months before I decided it wasn't worth the cleverness[3]. In this case. So I have some plausible claim to having found the "simplicity on the other side of complexity" in this case[4].
As a general rule I'd recommend not reaching conclusions about people from things they write in a comment on the internet. You're looking at them through a tiny peephole into their belief system. When in doubt, ask questions. "Excuse me sir, do you have much experience with C?", "Are you saying pointers to pointers are always bad?", and so on.
(Clever analogy there about 0. Totally agree with it. Utterly non-applicable in this instance.)
> Just to show that you're talking out your ass, here's a code snippet where I did the address of address thing in precisely the same "remove from linked list" situation. In a VM and programming language and operating system that I wrote from scratch[1].
I'm not sure what exactly that code/that change is supposed to tell me, so I can't tell whether it's a good argument. Mind condensing it down / pointing out what specifically you mean?
> As a general rule I'd recommend not reaching conclusions about people from things they write in a comment on the internet. You're looking at them through a tiny peephole into their belief system. When in doubt, ask questions. "Excuse me sir, do you have much experience with C?", "Are you saying pointers to pointers are always bad?", and so on.
Well, that depends on whether you want to make a point for other people in the discussion or discuss some issue with a specific person. As such, that was not really a judgement of you as a person, but rather a general statement about a group of people (which obviously might be wrong in some cases).
> (Clever analogy there about 0. Totally agree with it. Utterly non-applicable in this instance.)
You mean not applicable in the case of your code, or in the code in the article, or what else?
(edit: and yeah, I realize how the wording might give the impression that I was talking specifically to/about you, but I really wasn't, I was just using what you wrote as the opportunity, so I ended up addressing you :-)
Yeah my link is in a strange syntax, but mostly I just wanted to point at the specific line containing address:address, which is the same as a pointer to a pointer. That line is in a function called delete-before-cursor which uses a (doubly) linked list to maintain the state of a text editor.
I love your project! The idea of testing pervading the whole system is a very good one. I also like parameters being immutable unless they're a product; that's a great way of making passing references hurt less.
I still prefer the version Linus recommended. It's not for speed - I think it's more understandable. It's simpler (in the sense Rich Hickey means: one strand rather than many) and there is less of it. Proving it correct would be easier, not that I actually would.
When people say "clever" to me, I think of code that is being tricky about something. (My canonical example of clever code is Duff's Device.) Linus' example doesn't read as tricky to me, it reads as very straightforward.
Thanks! I'm very happy that you noticed that little feature about immutability so quickly. I'd love to hear from you off-thread if you have any more questions or comments about Mu (email address in my profile).
I tried in my original comment not to bring Linus into it. I didn't want to be just another person on the internet poking holes at a famous person. If the examples had both included a guard for the end of the list I'd have been like, "eh, no accounting for taste, but ok," and moved on in silence. But we shouldn't be teaching/encouraging people to focus on minor considerations like the number of `if`s when there's a larger problem with undefined behavior. Since you already took a look at Mu, here's a place where I try to elaborate on this belief system: https://github.com/akkartik/mu/blob/07b54625f7/001help.cc#L7...
(Earlier version posted on reddit: https://www.reddit.com/r/programming/comments/59cq8r/applyin... )