That’s the danger of Occam’s razor and of simple answers. “Simplicity” is often a hiding place for our biases. We reach the conclusion we want to reach and then call it simple. --Plaza Garabaldi
I'm not saying you're wrong, I'm saying that I don't actually agree with you that brains being hardwired with information is that hard to explain.
I don't want to solve a riddle here, but I am genuinely interested in the evidence supporting the nativist hypothesis in this context, especially on how you'd pass abstract information through genes. Do you have any reference, a good scientific paper that could be an entry point?
There's a flipside to this: middle-aged-to-old people aren't choosing to participate in the silicon valley startup culture. As superuser2 noted, a 22-year-old can subsist on ramen while living in a car, while a 45-year-old can't ask his wife and children to do that. But it runs deeper than that: just because a 22-year-old can subsist on ramen while living in a car doesn't mean it's a rational choice.
Most startups these days are producing glorified ad servers that don't actually solve any problems or provide any value. The tech bubble is such that anyone with a young face, a Macbook Pro, and a dumb idea can get funded, but that won't last. As such, working 60 hour weeks for stock that will be worthless in a decade is a bad idea for anyone, regardless of age. Software devs in their 40s have the self-respect and experience that they won't take that kind of deal.
There are some startups which have reasonable business models, but I suspect that time will show that part of those reasonable business models is recognizing the value of experience. In the long run, a 45-year-old programmer with 25 years of experience working 40 hours a week is easily worth two 22-year-old programmers with 2 years of experience working 60 hours a week. There are certainly young prodigies and ideas which hit the zeitgeist so well that the execution of the idea barely matters, but these are outliers, not the norm.
> On two occasions I have been asked, — "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" In one case a member of the Upper, and in the other a member of the Lower, House put this question. I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. --Charles Babbage
The modern version of this seems to be:
"Mr. Babbage, I put the wrong figures into the machine and the wrong answers came out! Please fix it this, this has security implications!"
You can't reasonably expect the compiler to make your insecure code secure.
Calling them "language lawyers" is some entitled crap. GCC commits to implement the specification of the language. Expecting them to maintain some huge number of undefined behaviors is literally expecting them to do something they never said they would do and couldn't do even if they said they would.
The problem is that it's often perfectly clear, reasonable code on all the systems it was intended to run on. For example, on all Unix-like systems, pointer arithmetic is simply arithmetic and behaves like it. (C's predecessor didn't even have separate pointer and integer types.) So prior to compiler optimisations, this series of operations is safe and well-behaved on all architectures Linux supports even if a is NULL:
int *b = &a->something; // pointer arithmetic, doesn't dereference a.
if(a == NULL) return 0;
else something_critical = a->somethingelse;
However, some non-Unix address models that Linux doesn't support don't permit pointer arithmetic on NULL pointers. So the ANSI C standards committee declared it undefined. Which means that gcc can - and eventually did - eliminate the NULL pointer check. This has resulted in privilege escalation vulnerabilities in Linux that didn't exist until gcc decided to optimise the code, some of them quite well-hidden.
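For reference, a minimal sketch of the same snippet with the check hoisted above the address computation, so there is no undefined behaviour left for the optimiser to key on:
if (a == NULL)
    return 0;
int *b = &a->something;                 /* a is known non-NULL here */
something_critical = a->somethingelse;  /* the check can no longer be elided */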
I understand the problem; I'm saying that it's not GCC's problem. If you don't want undefined behavior, don't put undefined behavior in your code. The code you wrote isn't clear or reasonable, because it relies on undefined behavior. It's a valid criticism that this code does appear to be straightforward when it isn't, but that's not a criticism of GCC, it's a criticism of ANSI C. If you don't like it, use a better language. C was designed 4 decades ago, and they can't possibly have foreseen every problem that we've discovered in that time.
ANSI C didn't really do anything wrong here, though - they created a least-common-denominator spec of what you could reasonably expect from C across all platforms. Pointer arithmetic on NULL pointers had to be considered undefined (not just unspecified) in ANSI C, because on certain commercially-important proprietary systems it generated a hardware trap that caused the OS to kill your process. The problem is that the gcc developers insisted on actually making that code behave as undefined even though it didn't make sense to.
Also, I should note that a lot of code - particularly the Linux kernel - isn't actually using ANSI C anyway. They're using a superset of it with gcc extensions and they have a whole bunch of architecture-specific code too.
> If you don't want undefined behavior, don't put undefined behavior in your code.
I'd quip that this is statistically impossible for a sufficiently large codebase.
> it's a criticism of ANSI C. If you don't like it, use a better language.
This is my basic stance. However, if I'm e.g. in a situation where I have a C or C++ codebase I can't afford to rewrite from scratch, I'd like to use a "Better C" compiler, where "Better C" is a slightly less bad version of "ANSI C" - some undefined behavior removed, for example.
As shorthand, I'll generally refer to compilers for "Better C" as "Good C Compilers".
GCC is not trying to be a Good C Compiler. They've decided these things aren't their problem. Which is... fair. That's their choice. I do not for one minute pretend to understand that choice however - and it gives me yet one more reason to switch to a Good C Compiler.
"POSIX says this is fine, so any application that expected this behaviour is already broken by definition. But this is rules lawyering. POSIX says that many things that are not useful are fine, but doesn't exist for the pleasure of sadistic OS implementors. POSIX exists to allow application writers to write useful applications. If you interpret POSIX in such a way that gains you some benefit but shafts a large number of application writers then people are going to be reluctant to use your code. You're no longer a general purpose filesystem - you're a filesystem that's only suitable for people who write code with the expectation that their OS developers are actively trying to fuck them over."
GCC doesn't, or at least shouldn't, exist to implement the ANSI standard. It exists to help people to write useful programs.
And it does that by implementing a compiler for a language with an existing standard, which is ANSI. If you don't like the ANSI C standard (and I admit it's not perfect), don't use a compiler for ANSI C.
Also, this is not just a GCC problem; all the existing C compilers have the issue to some extent. After all, STACK (the MIT tool to detect undefined behavior) is based on clang. And ICC exploits the same UB tricks AFAIK.
> And it does that by implementing a compiler for a language with an existing standard, which is ANSI. If you don't like the ANSI C standard (and I admit it's not perfect), don't use a compiler for ANSI C.
I'm not arguing that GCC should violate the ANSI standard; rather, it should provide additional guarantees above what ANSI requires (which was always the intent of the standard: the standard defines the absolute minimum that cross-platform programs can depend on, and the reason so much is undefined is to allow compilers to have their own strategies for what should happen in those cases, not to require that compilers blow up in those cases). Honestly I think the ANSI side of things is a red herring; when given the option of a change that will slightly improve performance on some benchmarks but make a lot of user code silently fail, a responsible developer should know to reject that change whether or not it violates some standard.
> Also, this is not just a GCC problem; all the existing C compilers have the issue to some extent.
The post is claiming that GCC is the worst of them. Certainly my impression is that clang is substantially less aggressive at exploiting UB; I don't know ICC well enough to comment.
> I'm not arguing that GCC should violate the ANSI standard; rather, it should provide additional guarantees above what ANSI requires ...
The problem with that approach is that it introduces a dependency on the compiler. The original code was ANSI C and thus should compile fine on all compilers compatible with ANSI C; the new code is not, as each compiler will decide to handle undefined behavior differently. Either you'll make the exact compiler a hard dependency (i.e. it always has to be compiled with gcc and fails to build with everything else), or it will produce "correct" binaries on some compilers and "incorrect" binaries on others. That's hardly an improvement.
The only way out of this is either to abandon C and use a language with stronger guarantees, or make the ANSI C more strict by adding the guarantees to the standard. Which is not going to happen, I guess.
> The post is claiming that GCC is the worst of them. Certainly my impression is that clang is substantially less aggressive at exploiting UB; I don't know ICC well enough to comment.
GCC is also the most widely used, so people tend to spot issues more often.
All this "problem" is a direct consequence of using C without really understanding what guarantees it does and does not provide, and instead driving by a simplified model of the environment. And then getting angry that the simplified model is not really correct.
> The original code was ANSI C and thus should compile fine on all compilers compatible with ANSI C, the new code is not as each compiler will decide to handle undefined behavior differently.
Except 40% of the original code already wasn't ANSI C.
> Either you'll make the exact compiler a hard dependency (i.e. it always has to be compiled with gcc and fails to build with everything else), or it will produce "correct" binaries on some compilers and "incorrect" binaries on others. That's hardly an improvement.
Having code that was broken under GCC not be broken under GCC absolutely is an improvement, particularly since in fact this kind of code often works on every other extant compiler.
> make the ANSI C more strict by adding the guarantees to the standard. Which is not going to happen, I guess.
Standards tend to codify existing practice. There's no reason the standard couldn't be made stricter - but the way we get to there from here is if the major compilers implement stricter restrictions and can show that they can be implemented consistently and users find them useful. GCC has been willing to do that kind of innovation for other parts of the standard.
> Except 40% of the original code already wasn't ANSI C.
Then why complain that an ANSI C compiler gets confused by it?
> Having code that was broken under GCC not be broken under GCC absolutely is an improvement, particularly since in fact this kind of code often works on every other extant compiler.
No, the code does not work on every other compiler. And even where it does, there's no guarantee it will stay that way.
> Standards tend to codify existing practice. There's no reason the standard couldn't be made stricter - but the way we get to there from here is if the major compilers implement stricter restrictions and can show that they can be implemented consistently and users find them useful. GCC has been willing to do that kind of innovation for other parts of the standard.
AFAIK some of the limitations are there because of non-traditional platforms - some of them may be a thing of the past so removing them would be OK, but some are not (and thus won't be removed from the standard). And one of the points of ANSI C (and POSIX) is to define global guarantees, not per-platform ones.
> Then why complain that an ANSI C compiler gets confused by it?
Because I didn't ask for an ANSI C-and-not-a-penny-more compiler. Nobody wants that. Back in the day the GNU project made a point of going against standards when the standardized behaviour was user-unfriendly (POSIX_ME_HARDER etc.)
> No, the code does not work on every other compiler.
In many of these cases it does work on all other major compilers, or all other relevant platforms for that particular codebase.
> And if it is, there's no guarantee it will stay like that.
So what? That doesn't make it better to break it now.
> one of the points of ANSI C (and POSIX) is to define global guarantees, not per-platform ones.
Which is why it's GCC's (or any other compiler's) responsibility to define the per-platform guarantees.
Compilers will do things like remove a memset clearing a chunk of memory to zero (because it detects that the variable isn't read again). That sort of thing is bad for security.
I think you misunderstood the example -- the memory is cleared after use to ensure that if it's reallocated by someone else, or someone hooks up a debugger, the content can't be examined (except when the compiler removes this clearing attempt because of an optimization). Lets say that chunk of memory held a password -- you'd definitely want to clear it after use, even if you immediately free it and never plan to read it again.
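A minimal sketch of that pattern (the helper names are invented for illustration); the final memset is a dead store as far as the optimiser can tell, so it may legally vanish, which is why people reach for a volatile write loop, C11's optional memset_s, or platform calls like explicit_bzero:
#include <string.h>

extern void read_password(char *buf, size_t len);   /* hypothetical helper */
extern int check_password(const char *pw);          /* hypothetical helper */

int authenticate(void)
{
    char pw[64];
    read_password(pw, sizeof pw);
    int ok = check_password(pw);
    /* Intended to scrub the secret, but pw is never read again, so the
     * compiler may delete this call under the as-if rule. */
    memset(pw, 0, sizeof pw);
    return ok;
}

/* One common mitigation: volatile accesses count as observable behaviour,
 * so compilers in practice do not remove these stores. */
static void secure_clear(void *p, size_t n)
{
    volatile unsigned char *vp = p;
    while (n--)
        *vp++ = 0;
}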
That's a very good example, but I'd argue that this is actually a violation of the standard: memset is defined as setting the value in memory. Most optimizations on undefined behavior don't really fall into this category.
I guess you could group this kind of thing into the category of "dead code elimination", which is useful, but it results in parts of the code as written not being reflected in the generated executable. I have to think on this example more.
It is allowed under the "as if" rule. If no visible aspects of the program are changed by an optimization, then it is allowed. The value stored in memory is not considered to be a visible aspect, and so the compiler is allowed to modify which memory is changed.
It's the same as inlining a function. The standard says that a function call is a function call. Compilers are still allowed to inline the call, even if it has not been specifically marked as "inline".
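A tiny sketch of the "as if" rule: nothing observable distinguishes the real call from the folded constant, so a compiler may emit either.
static int square(int x) { return x * x; }

int nine(void)
{
    /* May be compiled exactly as if it were written: return 9; */
    return square(3);
}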
I absolutely agree with this. GCC implements ANSI C, and that unfortunately includes undefined behavior, for various reasons. The problem with undefined behavior is that it's, well, undefined. Different compilers might choose different things, because different developers have different mental models of "what makes sense" in various situations, which is hardly an improvement. Also, the result wouldn't be ANSI C but some unknown mutation of C.
I believe the complaint is actually that the compiler makes secure code insecure by removing checks that rely on undefined behavior (which presumably can't be made any other way).
That complaint isn't a valid complaint. If the checks relied on undefined behavior, the code wasn't secure. If you want to rely on the behavior of a specific version of a specific compiler, then you need to define that in your dependencies instead of pretending that you've written general-purpose C code. This isn't even just a GCC problem; compiling the code on a different compiler breaks this too.
Yes, I'm really happy that GCC optimizes that code away.
Most of us don't care too much about security issues when using C/C++. We use it for performance, and use it mostly locally.
GCC is a very versatile tool. It's OK that it makes secure code difficult to write, because that's not what most of us are doing. Not being completely secure is OK; not being optimized is not.
What you and a lot of other people are missing, cheerful in your use of other languages, is that your runtimes and native extensions usually depend on insecure C code.
It's not GCC's job to break my program just in case I might one day run it on a broken compiler. That's like the fire marshal burning down my house to demonstrate how it violates fire codes.
There's also nothing wrong with writing a C program that rests on a base of POSIX, or the GNU system, and requires stronger guarantees than C alone provides.
What GCC is doing subverts the purpose of writing in C, which is to get close to the machine and instruct its processor to do certain things. Optimizations are useful when they allow the compiler to better express our intent, but lately, we've seen compilers more and more often ignore our plain, stated intent, then back it up with a reference to a specification that wasn't intended to allow this subversion.
> What GCC is doing subverts the purpose of writing in C, which is to get close to the machine and instruct its processor to do certain things.
That's not the purpose of writing in C. It's a goal you might be able to achieve with C, but I'm not sure why that's your goal, and it's certainly not the goal of everyone who writes C. I think more people who write C do so with the goal of producing programs that run fast or with a minimal memory footprint.
There's a parable where a man goes to the doctor and says, "Doctor, whenever I drink my coffee with the spoon in the cup, the spoon handle pokes me in the eye and it hurts." And the doctor says, "Well, stop doing that."
If you wrote `foo(bar(), baz())` and `baz()` relies on state mutated by `bar()`, your code is bad, and you should feel bad, because experiencing those bad feelings is the way you learn to not write bad code. This code was wrong before the compiler reordered the calls, it just failed silently for a while. The compiler isn't responsible for fixing your bugs, you are.
People need to stop expecting other people to fix their problems.
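To make that concrete, a minimal sketch (names invented) where the result depends on which argument is evaluated first, something C leaves unspecified:
#include <stdio.h>

static int counter = 0;

static int bar(void) { counter += 1; return counter; }
static int baz(void) { counter *= 2; return counter; }

static void foo(int a, int b) { printf("%d %d\n", a, b); }

int main(void)
{
    /* bar() and baz() may run in either order, so this prints "1 2" on some
     * conforming compilers and "1 0" on others. */
    foo(bar(), baz());
    return 0;
}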
Uhh... The question was "what's an example of undefined behaviour". I gave one, specifically an example that could realistically break by relying on undefined behaviour.
That's it. Simple education.
In the context of this post I think that's a good thing because not everyone will understand the topic.
You, however, seem to have read some sort of agenda into the question, which I find a little baffling...
To reiterate/emphasize this point: "undefined", "unspecified", and a few other related terms are Things in C. They have specific, non-interchangeable, well-defined meanings in the C specifications.
The eye-spoon defence of C undefined and unspecified behaviours is very unfortunate and misleading: it makes it sound like avoiding them is as easy as just taking a spoon out of a cup.
Theoretically "stop doing that" works... but history has shown that it really doesn't work in practice, in C, in programming more broadly, and, really, in any human endeavour ("planes don't need safety procedures, just stop making mistakes").
Who is at fault in your example becomes much less clear cut when you consider the variant where the author of foo doesn't have access to the source code of bar and baz and they only rely on shared state on some systems or in some corner cases.
I think a better design would be to separate mutators from pure functions. If a procedure mutates state, it should have a void return type, and if a procedure returns a value it should be a pure function that doesn't mutate state.
This is, of course, a rule of thumb, not a hard law. Some exceptions:
1. I think it's okay (and in fact, idiomatic in C) to mutate state and return some sort of information about what occurred (i.e. a success flag, a number of characters written, etc.).
2. Isolated mutations such as logging sometimes make sense in an otherwise pure function.
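A minimal sketch of that split in C (the counter type is invented for illustration): the mutator returns nothing, and the query computes a value without touching state.
#include <stddef.h>

struct counter { size_t n; };

/* Mutator: changes state and returns nothing (or, per exception 1, a status). */
void counter_increment(struct counter *c) { c->n++; }

/* Pure query: returns a value and mutates nothing. */
size_t counter_value(const struct counter *c) { return c->n; }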
I'd agree with you in some languages, but in C this would be prohibitively difficult. In general, good fluent design uses immutable objects, which makes it basically just a syntactic sugar for functional programming. While fluent syntax is nice, the functional semantics are the real value, and are much easier to do in C (although, as soon as you add in memory management, functional programming often becomes prohibitively difficult too).
The only reasonable thing to say about this was already said upthread, and quoted here:
> I have worked on many programs, and whenever I found such a problem (typically called a "portability problem"), where the code was assuming something that the language did not guarantee, I fixed the program rather than bitching about the compiler.
IMO, it's a little subtler than that. It's not the compiler's fault, it's the language's. Plenty of languages have no undefined behavior that can be written by accident. Go and safe Rust, for instance, have just about no undefined behavior at all, and are both performance-competitive with C. (Go has UB if you cause race conditions, and Rust has UB within `unsafe` blocks analogous to C's UB.)
A C compiler, meanwhile, has to aggressively take advantage of undefined behavior to get good performance, and the C specification has been keeping behavior undefined for the benefit of compilers.
You can hope that you find all such problems in C (which you might not) and "fix the program", but you can also "fix the program" by switching to a better language.
Yes, the C language specification is a bit shit; it leaves too much leeway to compilers so that C compilers for broken, niche architectures can be written.
However, I disagree with you. All this badness in the C specification did not stop people from writing reasonable C compilers for reasonable architectures for decades.
The real problem here is competition. Gcc is in a competition with clang to produce fast code which makes the gcc developers feel justified when they exploit undefined behaviours for marginal optimizations.
This is a case of following the letter of the law (in this case the C standard) while disregarding its spirit: all the undefined behaviour was there so that C compilers could accommodate odd architectures while remaining close to the metal, not so that compiler programmers could go out of their way to turn their compiler into a mine field.
More specifically, they're competing to produce the fastest code for software that follows the C specification to the letter. That's not necessarily the same as producing the fastest code that actually achieves the intended goal.
For example, if I recall correctly the popular Opus codec overflows signed integers when decoding invalid data, and so long as this can be guaranteed to produce some (possibly implementation-specific) result this is perfectly safe. However, this is technically undefined behaviour - a particularly malevolent optimising C compiler could decide to give the sender of the data arbitrary code execution, because it's allowed to do whatever it likes. This might even make the code run faster, but it'd make decoding Opus correctly and safely slower because the decoder would have to do a bunch of gratuitous overflow checks on operations it could otherwise just let overflow. Fortunately, gcc hasn't reached that level of advanced malevolence yet.
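For concreteness, a sketch of what those "gratuitous overflow checks" look like in standard C; every addition the decoder would otherwise just let wrap has to be guarded like this:
#include <limits.h>

/* Add two ints without ever performing a signed overflow; returns 0 if the
 * sum would not be representable. The cost is a compare and branch per operation. */
static int checked_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return 0;
    *out = a + b;
    return 1;
}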
> More specifically, they're competing to produce the fastest code for software that follows the C specification to the letter. That's not necessarily the same as producing the fastest code that actually achieves the intended goal.
That's true, but "producing the fastest code that actually achieves the intended goal" is not a problem which can be solved by a compiler--the compiler can't read your mind.
Attempting to represent a reasonable approximation of reading your mind is the responsibility of the specification. You can definitely do better than C, but I doubt you could have done better 4 decades ago when C was designed.
Yes. The worst part of it is that compilers have the option to do something sensible when handling undefined behaviour. The standard even suggests doing that, describing one possible option as behaving "in a documented manner characteristic of the environment". So why doesn't gcc do that?
People are far too keen to assume that because the standard leaves it undefined, and "undefined" sounds a bit here-be-dragons, it's therefore inevitable that undefined behaviour has to be some nasty creepy ugly thing. That it renders your entire program instantly meaningless. That it's a perfect excuse for the compiler to look at your program and turn it into something completely different. But... it doesn't have to be.
I don't know why people don't treat handling of undefined behaviour as a quality of implementation issue, rather than just rolling over and letting gcc make their lives worse.
> This is a case of following the letter of the law (in this case the C standard) while disregarding its spirit: all the undefined behaviour was there so that C compilers could accommodate odd architectures while remaining close to the metal, not so that compiler programmers could go out of their way to turn their compiler into a mine field.
Computers don't have spirits; they work as you tell them to work, to the letter, and if you're remaining close to the metal, your language will indicate that fact. Optimizing undefined behaviors doesn't make C a minefield; low-level programming for different architectures just is inherently a minefield. C was a minefield before these optimizations were added.
Rust is extremely impressive because they've found so many ways to do high-level programming while maintaining low-level performance. But they can only do that because they have the benefit of the 4 decades of programming language research that have occurred since the basics of C were designed.
>Rust is extremely impressive because they've found so many ways to do high-level programming while maintaining low-level performance. But they can only do that because they have the benefit of the 4 decades of programming language research that have occurred since the basics of C were designed.
I doubt rust could be ported to an 8-bit PIC microcontroller, or to a 6502, while keeping reasonable performance characteristics or letting the programmer take advantage of the platform quirks. It's not just "4 decades of programming language research"; it's also that it's intended to work only on "modern" processors.
Agreed. Which is why you should choose a language which was standardized by a standards committee whose goals better align with your goals.
> I doubt rust could be ported to an 8-bit PIC microcontroller, or to a 6502, while keeping reasonable performance characteristics or letting the programmer take advantage of the platform quirks.
I don't think that's true; I think that the current state of Rust tools is such that this is true now, but it's nothing inherent to the design of the language, and I think you'll be able to do quite a bit with Rust in the situations you describe when the tools around Rust are more mature. I can't really speak to this more because I'm not sure why you think this can't be done.
> I doubt rust could be ported to an 8-bit PIC microcontroller, or to a 6502, while keeping reasonable performance characteristics
I don't believe this is correct. Most of why Rust avoids UB is that it uses static types much more effectively than C does. Static types are an abstraction between the programmer and the compiler for conveying intent, that cease to exist at runtime. So the runtime processor architecture should be irrelevant.
For instance, in C, dereferencing a null pointer is UB. This allows a compiler to optimize out checks for null pointers if it "knows" that the pointer can't be null, and it "knows" that a pointer can't be null if the programmer previously dereferenced it. This is, itself, a form of communication between the programmer and the compiler, but an imperfect one. In Rust, safe pointers (references) cannot be null. A nullable pointer is represented with the Option<T> type, which has two variants, Some(T) and None. In order to extract an &Something from an Option<&Something>, a programmer has to explicitly check for these two cases. Once you have a &Something, both you and the compiler know it can't be null.
But at the output-code level, a documented compiler optimization allows Option<&Something> to be stored as just a single pointer -- since &Something cannot be null, a null-valued pointer must represent None, not Some(NULL). So the resulting code from the Rust compiler looks exactly like the resulting code from the C compiler, both in terms of memory usage and in terms of which null checks are present and which can be skipped. But the communication is much clearer, preventing miscommunications like the Linux kernel's
int flags = parameter->flags;
if (parameter == NULL)
    return -EINVAL;
Here the compiler thinks that the first line is the programmer saying "Hey, parameter cannot be null". But the programmer did not actually intend that. In Rust, the type system requires that the programmer write the null check before using the value, so that miscommunication is not possible.
There are similar stories for bounds checks and for loops, branches and indirect jumps and the match statement, etc. And none of this differs whether you're writing for a Core i7 or for a VAX.
> or letting the programmer take advantage of the platform quirks.
I'm not deeply familiar with that level of embedded systems, but at least on desktop-class processors, compilers are generally better than humans at writing stupidly-optimized code that's aware of particular opcode sequences that work better, etc.
(There are a few reasons why porting Rust to an older processor would be somewhat less than trivial, but they mostly involve assumptions made in the language definition about things like size_t and uintptr_t being the same, etc. You could write a language with a strong type system but C's portability assumptions, perhaps even a fork of Rust, if there were a use case / demand for it.)
Did rust find a way to defeat the halting problem and push all the array bound checks to compile time? How well does rust deal with memory bank switching where an instruction here makes that pointer there refer to a different area of memory?
> Did rust find a way to defeat the halting problem
I don't understand why the halting problem is relevant to this conversation. Just about all practical programs don't care about the halting problem in a final sense, anyway; see e.g. the calculus of inductive constructions for a Turing-incomplete language that lets you implement just about everything you actually care about implementing.
The halting problem merely prevents a program from evaluating a nontrivial property of another program with perfect accuracy. It does not prevent a program from evaluating a nontrivial property (bounds checks, type safety, whatever) with possible outputs "Yes" or "Either no, or you're trying to trick me, so cut that out and express what you mean more straightforwardly kthx."
This realization is at the heart of all modern language design.
> How well does rust deal with memory bank switching where an instruction here makes that pointer there refer to a different area of memory?
This problem boils down to shared mutable state, so the conceptual model of Rust deals with it very well. The current Rust language spec does not actually have a useful model of this sort of memory, but it would be a straightforward fork. As I said in my comment above, if there was interest and a use case for a safe language for these processors, it could be done easily.
You don't need 4 decades of programming language research to specify that e.g. signed integer overflow either returns an implementation-defined value or the program aborting. There are languages older than C that allowed the useful forms of bit-twiddling but offered much stronger safety guarantees.
But you pay a performance cost for either of those decisions. Consider code like this:
for (int i=0; i <= N; i++) func();
Most processors have special support for looping a fixed number of times, e.g. "decrement then branch if zero." If overflow is UB, the compiler can use this support.
But if overflow returns an implementation defined value, then it is possible that N is INT_MAX and the loop will not terminate. In this case the compiler cannot use the fixed-iteration form, and must emit a more expensive instruction sequence.
A correctly predicted branch is almost free; the compiler could check for that case.
The real problem, of course, is that C requires the programmer to obscure their intent by messing with a counter variable. Is there really no "loop exactly N times" construct in the language?
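(For what it's worth, a sketch of one way to state "exactly n iterations" today without handing the optimiser a signed-overflow assumption: count an unsigned value down to zero, which also maps naturally onto decrement-and-branch loop instructions.)
/* Run func() exactly n times; unsigned wraparound is defined behaviour, so
 * there is no UB here for the compiler to exploit or the programmer to trip over. */
static void repeat(unsigned int n, void (*func)(void))
{
    while (n-- > 0)
        func();
}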
I don't think it's the language's fault either: C is four decades old, and being bound by backward compatibility, they can't integrate much of the programming language research that has happened in the last four decades. I'm less concerned with placing fault for the problem than I am with placing the responsibility for fixing the problem, and that's clearly on the writer of the program which uses undefined behavior.
I think choosing Rust might be a reasonable way to avoid the problem in the first place, so I agree with you there, but there are also reasons to choose C over Rust. Personally, I write a lot of C code because I prefer to build on a GPL stack. This is one of the reasons I'd like to see Rust added as a GCC language. Sadly I don't have the time to do it myself.
Umm wait. Undefined behavior is where the language specification is not 100% precise, and compiler implementations can differ on produced code.
Go and Rust only have single implementations. The specifications for both are very brief. Are you claiming that a clean-room implementation of Go and Rust would always behave identically?
Your definition of undefined behaviour is actually the definition for unspecified behaviour.
Unspecified behaviour is usually intentional ambiguity, either to give wiggle room for an optimizer or to accommodate platform variance. Writing a program that invokes unspecified behaviour isn't normally a problem, as long as you're not relying on a specific result. Order of argument evaluation is a common example.
Relying on undefined behaviour is almost always bad, and almost always avoidable. That's where the nasal demons come from. Dereferencing null, alias-creating pointer typecasting, etc.
Undefined behavior has a very specific meaning for C. It doesn't mean "not 100% precise". It means 100% imprecise. The C standard only gives any guarantees on what your program will do if you never invoke UB. If you do, well then it can do literally whatever it wants including deleting all your files.
Literally deleting all your files at that. It's the part of the language that says your compiler is completely justified in allowing you to write that buffer overflow vulnerability that can trample executable memory, which is then used by a malicious attacker to do literally whatever they want, including deleting all your files. Or worse.
No, "undefined behavior" is a term with a specific meaning, because it is used for a very specific purpose. "Undefined behavior" means that a compiler can assume that a particular scenario will never happen in correct code, and therefore if it ever thinks it has to care about that scenario, it can in fact ignore it for the purpose of optimization.
For instance, if you have a bool in C++, the only defined behavior is for it to contain 0 or 1. If you somehow force the memory cell to contain 2 or 255 or anything else, you have triggered undefined behavior. This means the compiler never has to check for it. If it's faster for the compiler to implement, say, `if (b) x[1] = a; else x[0] = a;` as `x[b] = a`, it can do that. Even though the `b == 2` case might lead to a buffer overflow, the UB rules means that, as far as the compiler cares, the `b == 2` case cannot exist. If `x.operator[]` is a function that does a bounds check, the compiler can inline it and remove the bounds check. And so forth.
Hence, the rule that on encountering UB, the compiler may do anything. It is not so much that the compiler is intentionally doing anything, as that it is outputting code that assumes the UB can't happen. If the compiler chooses to implement an `if (b)` via a computed jump, the `b == 2` case may land on some completely ridiculous code to start executing. It is not that the compiler wants to land there to punish you, it's that the compiler's only responsibility is to make sure the `b == 0` and `b == 1` cases work.
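The same idea in C, assuming the compiler applies the analogous assumption to a bool parameter (a minimal sketch):
#include <stdbool.h>

int x[2];

void store(bool b, int a)
{
    /* In a correct program b is 0 or 1, so the compiler is free to lower this
     * branch to the equivalent of x[b] = a; a corrupted b (say, 2) then
     * indexes out of bounds, exactly as described above. */
    if (b)
        x[1] = a;
    else
        x[0] = a;
}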
What you're pointing to is unspecified values. This means that the implementation can return any value, but must actually return some coherent value. If a C++ function returning bool returns an "unspecified value", it is still only returning either 0 or 1. You can act with the resulting bool as if it is in fact a bool. There are no optimization gotchas to worry about. It's just that you don't know what it is.
For the particular case here, when you're inspecting a runtime error, the only useful thing to do with it is to call the Error() method from the error interface. The language guarantees you that it will work (i.e., you have defined behavior, that you have an object that indeed implements the error interface). It doesn't give you any particular guidance on exact error values or the resulting string. But that's no more "undefined behavior" than the result of reading from a file is "undefined".
The entire reason people care about undefined behavior is the fact that compilers can do arbitrarily-stupid(-seeming) things when it is present, which is solely because compilers want to optimize. In the case of unspecified values, there is nothing to optimize.
(Anyway, I am not a Go programmer at all. Maybe there is actually UB somewhere in safe Go. But what I have seen is not it.)
This makes it sound like Go and Rust have some secret sauce that enables C's performance without UB. But it is not so.
For example, consider an expression like (x*2)/2. clang and gcc will both optimize this to just x, but Go and Rust will actually perform the multiplication and division, as required by their overflow semantics. So the performance vs safety tradeoff is real.
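For reference, the C version of that expression (a sketch); the fold to plain x is only legal because an overflowing x*2 is undefined and can therefore be assumed never to happen:
int halve_double(int x)
{
    /* At -O2, gcc and clang typically compile this to just "return x;". */
    return (x * 2) / 2;
}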
I can't speak for Go, but your intuition regarding Rust code is incorrect. See https://play.rust-lang.org/?gist=09b464627f856e0ebdcd&versio... , click the "Release" button, then click the "LLVM IR" button to view the generated code for yourself. TL;DR: the Rust code `let x = 7; let y = (x*2)/2; return y;` gets compiled down to `ret i32 7` in LLVM.
In fact, there is "secret sauce" here. The secret sauce is that Rust treats integer overflow specially: in debug mode (the default compilation mode), integer overflow is checked and will result in a panic. In release mode, integer overflow is unchecked. It's not "undefined behavior" in the C sense, because of the fact that it always returns a value--there are no nasal demons possible here (Rust disallows UB in non-`unsafe` code entirely, because memory unsafety is a subset of nasal demons). The exact value that it returns is unspecified, and the language provides no guarantee of backward compatibility if you rely on it (it also provides explicit wrapping arithmetic types if you explicitly desire wrapping behavior). And though it's a potential correctness hazard if you don't do any testing in debug mode, it's not a memory safety hazard, even in the unchecked release mode, because Rust's other safety mechanisms prevent you from using an integer like this to cause memory errors.
A constant expression is not a good test: any compiler worth its salt (which includes LLVM) will be able to optimise a case like that. Change the function to:
pub fn test(x: i32) -> i32 {
    let y = (x*2)/2;
    y
}
and you'll see the difference (e.g. compare against a similar function on https://gcc.godbolt.org/ ).
> The secret sauce is that Rust treats integer overflow specially
Debug vs. release mode is irrelevant: the panicking case is more expensive than any arithmetic, and the RFC[0] was changed before it landed:
> The operations +, -, and * can underflow and overflow. When checking is enabled this will panic. When checking is disabled this will two's complement wrap.
The compiler still assumes that signed overflow may happen in release mode, and that the result needs to be computed as per two's complement, i.e. not unspecified.
Bah, darn you for changing the RFC out from under me. :P
I don't understand the point of specifying wrapped behavior in unchecked mode rather than leaving the value unspecified. Surely we don't care about properly accommodating use cases that will panic in debug mode.
I finally understood your point and where I was unclear, mostly after reading the other comment about the for loop.
You're right that C requires UB to optimize things like (x2)/2. My argument is that Go and Rust's secret sauce is you wouldn't have to write things spiritually similar to that in the first place. (x2)/2 is a bad example, since nobody would write that on purpose, but loops are a great one. C's only interface to elements of an array is a pointer. So C has to say that accessing out-of-bounds pointers are UB, as is even having a pointer that's neither in-bounds or right at the end, because it simply has no way to distinguish "pointer" from "pointer that is in-bounds for this array". The type of an array iterator in Rust (and I think also Go) carries knowledge of what array it's iterating over, and how far it can iterate; since it's part of the type system, the compiler has access to that knowledge. So you can just say `for element in array`, and the compiler knows that you're only accessing in-bounds pointers, and generate the same code C would, without needing to define a concept of UB.
Of course if you do generate pointers in unsafe code, you are subject to UB, same as in C. Rust and Go simply give you a language where most of the time, you don't need to reach for constructs that require UB to be performant.
(There is, however, an actual bit of secret sauce, at least in Rust but I suspect Go has an analogue: Rust's ownership system allows it to do far stronger alias analysis than even C's -fstrict-aliasing, without the risk of false positives -- you cannot construct overlapping mutable pointers in safe code.)
Leaving aside the fact that I'm also comparing Rust... why not? It's a language that produces fast, static executables. I bet that a good fraction of the Debian archive (not all of it, for sure) could be reimplemented in Go without causing any problems. What "different universes" are these?
(To be fair, I haven't written any Go because my personal use cases involve things like shared libraries and C-ABI compatibility, so I'm going off what I've heard about Go, not personal experience. But out of what I've heard about Go, it's a fine language for this purpose, because the requirement here is just portability to all Debian architectures and comparable performance, and whether GC is used is an implementation detail.)
At a very basic level, when I was volunteering helping the homeless, I saw a great deal of disparity. There are many homeless shelters and programs that only take in women and children, and many shelters which accept people of all genders but have limited space will make men wait to gain entrance, while letting in women and children, so that if someone doesn't get in, it's always men. Donations are frequently given under the condition that they go to help homeless women and children.
I do understand some of the legitimate reasons for this: women are at higher risk to be raped and experience more medical issues related to homelessness. But this only accounts for some of the disparity.
Part of the issue here is that there's a belief that adult males should be able to support themselves. This is evident in the programs that do serve men: while homeless programs that target women provide food, shelter, medical care, and childcare, the few homeless programs which target men almost universally focus on helping homeless men get jobs. Men do experience privilege in employment when they're actually employed, but this idea is inapplicable to the homeless. Most homeless people are mentally ill, and a mentally ill man is no more employable than a mentally ill woman.
in sf finding a shelter for children is disgustingly difficult and, as i finally concluded, impossible on weekends
i'm sorry, but conflating the issue of a perceived disparity with 'beliefs' about social gender consensus is unhelpful and only incites issue derailing ire
programs to help the homeless are flawed, i know it first hand, but complaining about someone else's idea of a solution does the least possible to help
if you want a shelter organised under your own incentives i'd suggest starting your own
it's an asshole thing to say but look at the landscape of shelters
currently we have overcrowded, underfunded understaffed solutions:
some universal, few women only, few men only
how do we remove the gender bias?
bulk the universal with the necessary staff that administer the specific needs of the gender exclusive shelters
but the universal shelters are already overcrowded, underfunded and understaffed so to ask more of them would only aggravate the issues further
so we need more shelters, and to have more people willing to open and operate a shelter, and some people get the will to do so by focusing their attention on a specific group, in this case gender based
why is there gender bias in shelters? i'd argue because the people who are willing to put in the work choose to create an environment for the gender bias
should we regulate out the bias with legislation forbidding discrimination based on gender? in the current landscape i only see that harming the issues because then those that have the personal will to run a gender biased shelter will lose their incentive and simply do something else, limiting the number of shelters available to share in the solution
instead i think the argument should be to fund a state run universal shelter program well enough that specific interest shelters get phased out proactively
> i'm sorry, but conflating the issue of a perceived disparity with 'beliefs' about social gender consensus is unhelpful and only incites issue derailing ire
It's not a conflation, it's acknowledging what I believe to be a causal relation. If I'm correct, changing the beliefs will help fix the issue.
> instead i think the argument should be to fund a state run universal shelter program well enough that specific interest shelters get phased out proactively
Yes. The solution to disparity is enough abundance that it doesn't matter.
my bad, i called it conflating because the differences between your own beliefs and the apparent beliefs of others were lost and i interpreted your statement as speaking for each
> ..acknowledging what I believe to be a causal relation. If I'm correct..
I'm not questioning what you're saying, as it seems to be correct based on what I know (I may or may not have watched the Helvetica documentary) but what's with the graininess in the color picker? It doesn't fit with anything else I'm looking at on this topic.
My guess is that it's to simulate how the colors will look on paper; they always look like they "pop" (god I hate that word lol) on a screen more than when printed out. You can turn it off by clicking "noise" at the top right.
The Swiss style seems to have influenced the covers of a lot of old sci-fi books, so I mentally associate these kinds of color schemes with faded old paper and sitting indoors on a rainy afternoon.
> But let's remember context, you support capitalism where COMPETITION exists, in this situation it is an artificially created monopoly of one company.
That's not what Republicans, or almost any politicians, support. Politicians are mostly addicted to power and support keeping power, even if that means screwing the ideals they supposedly espouse. And keeping power in a democratic republic involves keeping funding for the next election cycle. Regardless of their stated ideals, politicians are de facto slaves of the corporations whose interests they serve, unless laws exist to limit this effect. In short, laissez faire capitalism and democracy are incompatible.
I have deep, deep objections to OAuth, and I hoped this article would espouse some of them, but it doesn't. Instead, this article is basically whining about "programming is hard" which is true but also something anyone trying to solve serious problems got over somewhere between high school and getting a real job.
OAuth is hard because it's pretending to solve a hard technical problem (logging in and sharing some private data securely) while actually solving an easy business problem (how do we (a big company) get smaller companies to outsource as much of their user data as possible to us). It kind of solves the hard technical problem (because bigger companies are better at hiding the data you give them from other companies who don't pay for it, and have all your data anyway). But if that half-assed solution to the hard problem is good enough for you, you aren't actually trying to solve it, you're just trying to persuade uneducated users you've solved it. In short, OAuth For Dummies would be a tautological title for a book.
I've heard it said that it's okay and even desirable to outsource everything in your business to other companies except your core service. But what service exists without users? If you can't get users to sign up for your service and give you their information, then I'd argue you don't even have users. You might as well quit. This isn't a "what's the best way to authorize users?" question, it's a "do I even have users?" question.
Your comment is dead on. OAuth2 provides a way for people to log into your service without really telling you who they are. Instead, you outsource the task of asking them who they are to another company. But if you don't care who your customers are, why are you asking them to log in?
The whole concept makes no sense. And then there's the fact that OAuth2 isn't even a protocol, but a "framework for a protocol," whatever that means. But sites use it, and will continue to use it, because it makes signup easier for new users.
Here is what the author is missing from his "programming is hard" angle. Yes, programming is hard, but that in itself is not a bad thing, and it's even a good thing if the program itself is a good idea. But OAuth2 is not a good idea. It's not even a bad idea -- it's a framework for a bad idea.
> But if you don't care who your customers are, why are you asking them to log in?
I may not care who they are, but I do care how their identity maps to information in my system. Simple functionality such as favorites or saved preferences doesn't require any information about the user, other than them being able to identify themselves as someone who has previously used the site.
Cookies or browser fingerprinting should do the trick, with gentle reminders to register. It's how most online shopping carts work.
I'm not saying no one should ever use OAuth2, ever. Sometimes websites have to jump through weird hoops to get conversions. But as a web developer, it annoys me a lot.
Cookies and fingerprinting tell me that the same browser is coming back to my website, but I don't care about the browser I care about the user. As a user if I go to a website on my work machine, my home machine, and my phone then I'd like my data to be available on each.
I don't have strong opinions on OAuth or any other federated identity solution, I just think it's a specious claim to say that just because a system doesn't care about who a user is then that system doesn't care about the uniqueness of that user.
I do not know who the uneducated one is here, but in the case of OAuth, the other company already has the user data. What OAuth enables is to use their information to verify the user. What this post is trying to say is that the method stinks.
> I do not know who the uneducated one is here, but in the case of OAuth, the other company already has the user data.
No, they don't. Google, for example, doesn't have the entire signup list of all the users of The Old Reader, but they have a lot of The Old Reader's users, because The Old Reader outsources authorization for some of its users to Google. That's data that Google is collecting via OAuth, and you'd better believe they use that data.
> What OAuth enables is to use their information to verify the user.
That's what it enables for the OAuth consumer, but there are far easier ways of doing that. The difficulties of OAuth exist because OAuth doesn't serve the OAuth consumer's needs, it serves the OAuth provider's needs.
On the first point I think we are talking about two different things. I am not talking about the entire signup list of Old Reader, I am talking about a user of Old Reader who uses Google OAuth to access Old Reader. In this case Google already has this particular user's data.
I don't agree on the second one. What it does is serve the site owner's needs. They can choose to provide OAuth or not. Some provide it to make it easier for their users to log in, and they also provide their own authentication otherwise; other sites use OAuth only, and some just their own. Most see it as a benefit for their users to only use one login. The benefit for the OAuth providers is a stronger relationship with that particular user.
> On the first point I think we are talking about two different things. I am not talking about the entire signup list of Old Reader, I am talking about a user of Old Reader who uses Google OAuth to access Old Reader. In this case Google already has this particular user's data.
We're talking about different things because you missed my point a few posts ago when I said that the problem it solves is "how do we (a big company) get smaller companies to outsource as much of their user data as possible to us". User lists are data.
> Some provide it to make it easier for their users to login
If that's their goal, they're failing to achieve it. OAuth requires more steps than a simple username/password signup form, including going to a completely different site to give permission to log in with your data. Google/Facebook/etc. and other OAuth providers aren't stupid: they know that's not a good solution to that problem. If they really wanted to solve that problem they'd write a login library (something like Reddit's signup/login system) which would solve that problem better. The reason OAuth isn't implemented that way is that the goal of OAuth is not to make it easier to sign up and log in.
> Most see it as a benefit for their users to only use one login.
There is nothing that stops users from using one login everywhere; OAuth does not aid this in any way. I use the same login on all the sites where I don't care about the security of my account.
You have yet to make any compelling argument that users or sites which use OAuth are gaining any benefit from OAuth. The only people who benefit from OAuth are OAuth providers.
>You have yet to make any compelling argument that users or sites which use OAuth are gaining any benefit from OAuth. The only people who benefit from OAuth are OAuth providers.
I am not making a compelling argument for or against OAuth. My point is that you do not understand how OAuth works. The user is already a user of the OAuth provider. The outsourcing is not decided by the OAuth provider; it is decided by the site owner, and it is the user who decides to use this option or not.
And as stated above, this post is saying that the method stinks.
The fault in your analogy is that you seem to think that oxygen is somehow less important in situations where it's easily available. You say, "Just when I'm diving or doing something where it is equally important." But oxygen is always equally important. It's just that in most situations it's easily available to you: all you have to do is inhale.
Your analogy isn't even internally correct, but even if it were, it still doesn't prove anything about privacy, because privacy isn't easily available, at least not over technological channels. Privacy isn't oxygen, so accurate claims about oxygen don't imply anything about privacy at all.
Analogies are for explanation, not evidence. If you can't make an argument without an analogy, you may want to consider that you're wrong.
> Analogies are inexact by semantic definition, and that doesn't make them "faulty".
Well, it makes them useless as evidence. An argument by analogy simply isn't a valid argument. Don't you remember the "You wouldn't steal a car" ads?
> Privacy is readily available through https, two factor OAuth, etc.
HTTPS is broken by privileged man-in-the-middle attacks (attacks where the attacker has key signing power) and downgrade attacks. And that is when it's even available (it isn't always). And even against attackers with less power, it only provides privacy for what you send over the wire, not who you send it to. And finally, this all assumes that you're sending your data to an entity which won't simply sell it to whoever is willing to pay a few bucks (an uneducated user might think, for example, that data sent through GMail is private).
I'm not even gonna touch "two factor OAuth"; I'm not sure what kind of privacy you even think that provides.
In short, you clearly have no knowledge about what does and does not provide privacy. It would behoove you to not make claims on topics you are ignorant of.
Analogies aren't evidence, they're a tool for explanation, again by semantic definition.
It's committing a no-true-scotsman to say that "privacy isn't as easily available as oxygen" and then change it to "true privacy is really perfect privacy" when faced with HTTPS and OAuth.
All privacy & security tools are imperfect, but most of us find the right level, rather than live in a faraday cage in our mother's basements (that's the point). Unless, of course, copsarebastards, you need that level of privacy - then I'm not going to judge.
> Analogies aren't evidence, they're a tool for explanation, again by semantic definition.
Agreed, that's what I've been saying all along.
So then why did you use an analogy? Did you really think the sentences "Privacy is easily available" or "People only have to use privacy tools when they are doing something that they want to keep private" needed explanation? Perhaps I assumed you were using it as evidence when you weren't, but you have to admit that's a reasonable assumption given that the analogy is completely pointless otherwise.
> It's committing a no-true-scotsman to say that "privacy isn't as easily available as oxygen" and then change it to "true privacy is really perfect privacy" when faced with HTTPS and OAuth.
Imperfect privacy isn't privacy. Either people are able to look at your data or they aren't. If people are able to look at your data, you don't have privacy. This isn't a complicated idea or a "no true scotsman" fallacy, it's the meaning of the word "privacy".
We have plenty of evidence showing that the NSA surveils data which is "protected" by HTTPS, ergo, HTTPS does not provide privacy. And the NSA isn't the only actor with this capability.
And OAuth doesn't provide privacy. It's not even the problem that OAuth tries to solve. OAuth provides authentication, which is an element of privacy, but it takes more than simply showing that a person is who they claim to be to provide privacy.
> All privacy & security tools are imperfect, but most of us find the right level, rather than live in a faraday cage in our mother's basements (that's the point).
That's exactly not what happens. The average user simply is not informed enough to make an educated choice about what level of privacy they want and make choices to get that level of privacy. As a result, people don't find the right level of privacy. Closeted gay people get outed by their Facebook friend graph, pregnant teenagers have their pregnancies publicized by their targeted ads, celebrities have their nude photos leaked to the public, adultery website users and corporate employees have their information leaked, women are found by their jealous law enforcement exes misusing surveillance technologies. Only a fraction of these people actually knew what risk they were taking when they friended someone on Facebook, searched for goods on Amazon, texted a nude photo to a lover, put their credit card into a website, gave their info to their employers, or made a phone call.
Obviously living in a faraday cage in your mother's basement isn't the answer: that's a straw man argument.
The answer, in my opinion, is both social and technical. Socially, we need to get people to prioritize privacy and use privacy by default, we need people in power to respect and protect the right to privacy rather than actively taking it away from people. From the technical side, we need privacy tools that are faster, more secure, and easier to use, and we need decentralization so that violating people's privacy is no longer an option.