> If I program in C, I need to defend against the compiler maintainers.
> If I program in Go, the language maintainers defend me from my mistakes.
> If you’re choosing software to deploy on your systems, which language would you rather that the vendors be writing in?
I think this is my favorite take-away. I spend a lot of time programming in Ruby, where more than a few parts of the standard library declare themselves to be security vulnerabilities, but there are no code-level protections to warn a programmer. I'm thinking specifically about FileUtils.rm_r, and all the various stdlib serialization modules that provide one or more ways to eval() when loading data. Is it YAML.load(x) or YAML.parse(x).to_ruby that's safe? I use YAML.safe_load, but how can I trust third-party libraries when the stdlib makes this SO EASY to get wrong?
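For what it's worth, Psych's safe_load is the variant that refuses the dangerous tags. A minimal sketch (behavior as of Psych 3.x/4.x; the `!ruby` document below stands in for hypothetical attacker input):

```ruby
require "yaml"

# Plain data comes back as core types only.
config = YAML.safe_load("retries: 3\nhosts:\n  - a\n  - b")
p config # {"retries"=>3, "hosts"=>["a", "b"]}

# A document smuggling in a !ruby object tag is rejected instead of
# silently instantiating an arbitrary class.
begin
  YAML.safe_load("--- !ruby/object:OpenStruct {}")
rescue Psych::DisallowedClass => e
  puts "refused: #{e.message}"
end
```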
+1 for the Golang maintainers and Go's approach here. Another reason to use Go...
If you're working with attacker controlled input, at best you can avoid evals while deserializing. As soon as you use the result, a sufficiently clever and informed attacker can almost certainly own you. YAML is just too powerful for anything else.
If you're only using YAML as JSON with different syntax, that's a different story...but then you should just pass the library the deserialized data.
> As soon as you use the result, a sufficiently clever and informed attacker can almost certainly own you.
This is equivalent to saying that we should just not write any program, ever.
> YAML is just too powerful for anything else.
The problem the original author is getting at is that the library makes unsafe operations extremely easy to do. YAML the language is not inherently unsafe — it just serializes a data structure. But several YAML libraries (I believe both Ruby and Python are in this bucket) make it extremely easy to create objects of any arbitrary type, regardless of what the programmer expects, making arbitrary code execution either easy or trivial. Were the parse/load function something like,
parse(input: serialized yaml, whitelisted_types)
that only allowed reconstruction of core YAML types + whitelisted types that the user needs for their specific use case, the API would be fairly safe. (You could still shoot yourself in the foot in the whitelisted-types part, of course, but again, yes, any user-controlled input handled poorly "could" be a bug.)
This is a property of the construction of the API itself: the API encourages misuse.
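Ruby's Psych did eventually grow roughly this shape of API: safe_load restricts output to the core scalar/collection types plus an explicit allow-list. A sketch (the keyword form assumes Psych 3.1+; older versions took positional whitelist arguments):

```ruby
require "yaml"
require "date"

doc = "name: release\nshipped: 2017-01-01"

# Core types only by default: even a Date scalar is outside the set...
begin
  YAML.safe_load(doc)
rescue Psych::DisallowedClass => e
  puts e.message # e.g. "Tried to load unspecified class: Date"
end

# ...so the caller opts in explicitly, per the whitelist idea above.
data = YAML.safe_load(doc, permitted_classes: [Date])
p data["shipped"] # an actual Date instance
```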
I'm not trying to argue whether or not the API is well designed (the designers can do that if they so wish). My point is a pragmatic one: Saying the API lets you shoot yourself in the foot makes people believe that if they just use the right incantation everything will be fine and easy. YAML is just not that kind of format.
If I were designing the API today, I would name YAML::load something like YAML::unsafe_load, because loading YAML is dangerous. Guiding naive users to a highly restricted subset of YAML is good. Making them think that they just need to avoid "easy foot-guns" is not.
> I'm not trying to argue whether or not the API is well designed (the designers can do that if they so wish). My point is a pragmatic one: Saying the API lets you shoot yourself in the foot makes people believe that if they just use the right incantation everything will be fine and easy.
But that's my point: if you use the right incantation, everything should be fine and easy, even in languages like Ruby and Python. The larger point is that the library user shouldn't need to know the "right incantation"; the library should make the safe thing the default, and you should need to very explicitly shoot yourself in the foot.
> YAML is just not that kind of format.
> If I were designing the API today, I would name YAML::load something like YAML::unsafe_load, because loading YAML is dangerous. Guiding naive users to a highly restricted subset of YAML is good. Making them think that they just need to avoid "easy foot-guns" is not.
Reading your comment, I get the impression that you think YAML, as a format, is unsafe; this isn't the case in any manner that I can see, and I explained a bit of that argument in my previous comment. The security issues have been around implementations of YAML libraries that allow the deserialization of custom tags that correspond to arbitrary language-specific objects (the !ruby tags). This isn't required by the YAML specification, and a library shouldn't do it by default, because arbitrary object construction is dangerous. But YAML, as a format, doesn't require this; again, the core types + a whitelisted set of types covers 99% of use cases, and is safe.
(And I think tagging is a great feature that basically appears in no other serialization format that I'm aware of. CBOR comes close, but you have to register your types w/ IANA.)
Would it be possible to create a Ruby implementation of YAML which is compliant with the YAML v1.2 spec while also avoiding the more dangerous foot-guns? Sure. But the spec is simply a means to an end, and that end -- as per http://www.yaml.org/spec/1.2/spec.html -- is:
> In contrast, YAML's foremost design goals are human readability and support for serializing arbitrary native data structures.
In order to accomplish that goal in any sort of meaningful fashion, you need e.g. the !ruby tags. Hence my noting that the YAML format is not (generally) suitable for deserializing attacker controlled input.
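That's the trade-off in miniature: the language-specific tags are what make round-tripping native objects work, and they are exactly what a safe loader has to refuse. A sketch in Ruby (using unsafe_load where Psych provides it; older versions spell it load):

```ruby
require "yaml"

Point = Struct.new(:x, :y)

doc = YAML.dump(Point.new(1, 2))
puts doc # carries a language-specific "!ruby/struct:Point" tag

# Reviving the native object requires the full, unsafe loader...
full_load = YAML.respond_to?(:unsafe_load) ? :unsafe_load : :load
p YAML.public_send(full_load, doc) # back to a Point struct

# ...while safe_load refuses the tag outright.
begin
  YAML.safe_load(doc)
rescue Psych::DisallowedClass => e
  puts "refused: #{e.message}"
end
```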
Now here's the thing: You're totally right that the "core" functionality described in the YAML spec can be quite useful for more limited purposes, including possibly even safely deserializing and using attacker-controlled input with only a moderate amount of extra legwork. For better or worse, however, that's not the purpose that the people who designed and implemented YAML were going for. Too many people who comment on how various YAML APIs should be safer (for pragmatic reasons) ignore the truly awful (pragmatic) consequences of those comments when read by people who know far less than they do.
> 99% of the PGP-encrypted emails we get to security@golang.org are bogus security reports. Whereas “cleartext” security reports are only about 5-10% bogus.
First guess: people with a high degree of paranoia tend to both think it worthwhile to encrypt their e-mails and incorrectly diagnose security problems where none exist.
Because you can't easily run them through a spam filter (without unlocking the PGP key for the spam filter or having the passphrase in plain text somewhere).
Spam is such a huge problem that most spam filters have gotten terribly over-aggressive. Even with proper SPF and DKIM records, Google/Microsoft will ignore/silently drop most mail coming from small, personally run e-mail servers.
I've run a pair of email servers - one small (maybe 250 outgoing emails/week) and one larger (80-100k/week) - from 2007 to late 2016, and had no problems with deliverability. It's just that there are way more factors than SPF & DKIM:
- static IPs that remain the same for years (no cloud)
- IPs not in questionable subnets (no cloud)
- forward-confirmed DNS (matching A and PTR records) for all sending servers
- a valid SSL cert on MX servers (anecdotal evidence here)
- never rapidly increase email volume. +50% day over day and +100% week over week is safe
- rate limit outgoing email by recipients server/domain
I think my biggest problems are limited e-mail output (<10 a week really, because it's just me) and that I'm currently on Linode, most likely on a spammy subnet.
I do have static IPs for IPv4 and IPv6. Every spam analysis tool I've thrown at it says all my records are valid. My TLS cert is valid (paid for, but going to move to Lets Encrypt when it expires).
I'm planning on moving to another hosting service like Vultr. They do require individuals to apply to host e-mail servers (and unblock the SMTP port), so maybe I'll have better luck there.
Avoid cloud & cloud-like/mostly-vps providers. While they will most likely not have entire IP blocks straight blacklisted there might be a rule that just brings you closer to the spam score threshold.
Oddly enough medium-large enterprise is more likely to include those rules - I've recently seen 20% bounce rate on a large mailing list spanning some 300 companies with most of the bounces being due to AWS IP blocks being blacklisted. That's why using Sparkpost for that kind of recipients is a bad idea (all their sending servers are on AWS).
Going back to hosting - pick a company that:
- lets you set PTR records for your IPs
- lets you get your own IP block and assign it however you want (even if it is just a tiny one)
- offers primarily dedicated servers/colocation - even if their VPS is going to be a bit more expensive
Honestly, most people I've met with a serious interest in security wind up using some combination of Signal, XMPP OTR, IRC, and unencrypted email to communicate - it wouldn't surprise me that most of GPG's userbase is made up of hobbyist security types with little understanding of what they're doing.
I don't think they should, especially since sending and receiving PGP encrypted emails is easy when you use a non-web mail client.
IMO, it's ultimately a problem with Gmail - which makes integration with PGP hard - and not a problem with PGP email itself. If I use Thunderbird, or even Mail.app, integrating PGP is as easy as installing a plugin, after which sending or receiving a PGP-encrypted email requires a minimum of additional effort (once you have a key in your keyring, it's encrypted by default).
If Gmail made PGP integration easy, there would be no extra time wasted when viewing PGP encrypted emails. The validity signaling would even remain the same, but without the "this is hard for me so don't do it, even though we invite you to do it" idiocy.
> The handling of this issue by the Go maintainers was exemplary. They had no security problem but accepted that so many people misusing the library was a security problem and that they could do something about this.
I disagree with the author here. A poorly-designed API that encourages rampant insecure misuse is a security vulnerability.
A poorly-designed API, which they fixed, despite backwards incompatibility.
If you're going to insist that everybody gets everything security-related exactly right the first time, well, you may be morally in the right but you're signing up for a lot of disappointment.
I never said they had to get it right from the start, and I'm happy they fixed the API. My only contention was the author's (incorrect, in my view) distinction between a security vulnerability and an API design that encourages security vulnerabilities.
The latter are worse in my opinion, because they're harder to fix.
There is no sharp definition here. The relative ease or difficulty of using an API in secure ways is unavoidably imprecise with vague boundaries. It is a continuum, not a binary value.
That said, an issue like the one reported is inarguably on the wrong side of that boundary, both conceptually (not requiring host key verification by default) and empirically (by the data collected by the reporter).
Thanks, though, for your unnecessary, gratuitous, and frankly absurd editorialization.
API design is critically important to security whether or not you believe it to be — we've learned this the hard way through decades of treating keys and IVs as "stringly typed", as overwhelming numbers of developers use `rand` to generate keys and use hardcoded IVs. An API design that encourages misuse will be misused, and the onus to improve the default situation should be on the author of such APIs.
Well, I upvoted you, but I can't fully agree.
While you're right that a poorly designed API that makes my code insecure is bad design, sometimes you just want the insecure behavior.
Something like making an insecure web-service call to a legacy device that mishandles every secure request.
However, I think it should at least give a compiler warning or a runtime warning.
Also, changing the API of a stdlib is probably not cool. I like that Java doesn't change things so fast, though in Java the rate of change is too slow: after deprecating something, it should be removed some day (Java never has). In Go and Rust the rate of change is way too high (and I really like Rust). Both have a different release procedure than Java, but when I develop something, I really hope my APIs stay the same for at least three years.
Having a way to "break the seal" and void the warranty is fine, but it should require conscious decision making. Designing APIs such that the easiest, most straightforward use of the API voids any security guarantees just results in situations like the one that happened here, where everyone gets it wrong.
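As a sketch of that principle (every name here is invented for illustration, not from any real library): make verification the default and force the caller to spell out the scary option, much like the explicit InsecureIgnoreHostKey callback the Go ssh package ended up requiring:

```ruby
# Hypothetical client: host key verification is on by default, and
# opting out has to be stated loudly at the call site.
class FakeSshClient
  class HostKeyError < StandardError; end

  def initialize(host, insecure_skip_host_key_check: false)
    @host = host
    @skip_check = insecure_skip_host_key_check
  end

  # known_hosts maps hostname => pinned public key
  def connect(known_hosts)
    if @skip_check
      "connected to #{@host} (host key UNVERIFIED)"
    elsif known_hosts.key?(@host)
      "connected to #{@host}"
    else
      # Fail closed: no pinned key and no explicit opt-out.
      raise HostKeyError, "no pinned host key for #{@host}"
    end
  end
end
```

The straightforward call fails closed; `insecure_skip_host_key_check: true` still works, but nobody types that by accident, and it stands out in code review.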
>We finally have a realistic shot at migrating a lot of security-boundary services on Unix-heritage systems to code written in languages not particularly prone to buffer overflow attacks.
I'm not that optimistic. Languages have surfaced in the past that were "not particularly prone to buffer overflow attacks."
I really think that hardware bounds checking will need to be prevalent in the hardware ecosystem before we can really be rid of buffer overflows. It has to become cheap (as in almost free) to check bounds to stop buffer overflows from being a problem.
Most languages from the past 40 years have either been interpreted or JIT'ed. While this has many benefits, it also creates a walled garden where software can only talk to other software from the same ecosystem. You have to make a conscious effort to make Java code talk to Python code, for example.
Systems-level programmers are more interested in targeting the "least common denominator", writing code that runs directly on the CPU without any outside help apart from standard OS calls. This is why almost every language has an FFI bridge that can talk to plain C - C is the least common denominator on any given system, so everybody knows how to talk to it.
To have a shot at replacing C, a new language needs to operate at the same level of the stack, relying only on the raw CPU and OS. C++ can do this, but it doesn't offer much safety benefit over C. Objective-C is the same.
This is why Go and Rust are so exciting. They are the first languages in a long time that actually run at the same level as C. Rust has the additional benefit of not even requiring a garbage collector, so it can be used in embedded microcontrollers and OS kernels. Go's garbage collector is a downside, but maybe not a fatal one.
Besides, even if we had hardware bounds checking, you would still need a language that could take advantage of that. C's memory model is too loose, so there is no place to even hook the bounds-checker in. Think about this:
    struct Data {
        char name[20];
        void *next; // points to another `Data` instance
    };
If you try writing a string longer than 20 characters to `((Data *)data.next)->name`, there is no way the compiler could ever type-check that, or tell the hardware bounds checker where the limit is. This is why C has to go.
Go has the same interop issues as any other language. Indeed, what's the difference between a Go binary and an AOT compiled .NET/JVM program? Besides Go making that scenario the default?
Go and Rust aren't in the same class. Rust can call and be called from C with no overhead or issues (OK, panic across FFI isn't allowed). It's drop-in compatible with C libs, in addition to being a far better language.
I get your enthusiasm, but there are libraries for C that let you use the bounds-check instructions in the latest Intel CPUs. (I think there is a pretty good write-up about these new instructions on AnandTech.)
It's not as sexy as a new language, but it allows for incremental adoption and doesn't demand as much effort. Even if a rewrite would be a good idea, many people will adopt incremental improvements instead of going for one.
Bounds checking in hardware isn't a new phenomenon, either. The Burroughs machines had hardware bounds checking, I think.
Rust and Go are interesting, I agree. I'm just slightly more suspicious of things in tech that have such vocal "evangelism". If they help, that will be seen over time as large projects adopt them. In your effort to make your point, you claim that C is "too loose" to take advantage of hardware bounds checking. That kind of thing makes people mistrust the passion that advocates for these languages display.
I'd like to see some hardware that implements records in, well, hardware: collections of fields with counts and sizes. It'd be nice to do away with the idea that we're haphazardly (and implicitly) overlaying things like structs onto an untyped one-dimensional array of bytes (or, in some cases, machine words).
That, to me, is more important than straight bounds checking for arrays, because it implies something a little stronger.
> 99% of the PGP-encrypted emails we get to security@golang.org are bogus security reports. Whereas “cleartext” security reports are only about 5-10% bogus.
This indicates a failing of the person who reads golang security emails. This person should be using mutt or something with similarly good support for PGP. Reading encrypted emails with a correctly configured setup is only marginally more difficult than reading cleartext emails (that is, you have to enter the key's password).
You can make arguments all day about PGP usability for the average person, but not for the person reading security emails for a programming language maintained by Google.
While theoretically the case... perhaps if real security people reporting real issues were to use GPG, there'd be an incentive to set it up properly. In practice, if 99% of emails you get via GPG are no better than spam - and in fact worse, since they require that you waste your time figuring out whether there is actually a bug - then GPG mostly just makes for a pretty good spam filter.
Back in the real world, most security-minded people realise there's little chance that anything particularly awful is going to happen as a result of sending the report as an unencrypted email, and GPG is pretty dreadful to use so next to nobody uses it except for hobbyists.
Because that person is the person reading security@golang.org, which advertises a PGP key for encrypting emails sent to them. This is the security channel for golang. They should be comfortable using PGP and if they're not then they either need to learn it or they're the wrong person for the job.