OpenSSL Security Advisory (openssl.org)
329 points by arkadiyt on March 25, 2021 | 141 comments


So if I am reading this right, in addition to a null dereference, turning on strict certificate validation actually disabled the check asserting that non-CA certificates cannot issue other certificates?


Correct, but only in the rare case where the "purpose" value has been overridden to be empty.




Isn’t the vulnerability fixed in 1.1.1j, not 1.1.1d? And “j” is still in the “testing” repo.


1.1.1d is fixed if you patch that part of the code.

The version going from "+deb10u4" to "+deb10u6" indicates that there are supplementary changes that the Debian project made on top of the vendor release of 1.1.1d.

Going from 1.1.1d to e, f, g, h, i, or j may introduce other behaviour changes that could be undesirable, so Debian 10 (in this example) is frozen using 1.1.1d and any bugs are dealt with on an as-needed basis.


i.e. backported

(very common with long-term support distros, especially Red Hat)


So if I understand it correctly, the impact of these two "high" vulnerabilities is:

- if you use the non-default X509_V_FLAG_X509_STRICT flag and use the new openssl 1.1.1h feature to "disallow certificates in the chain that have explicitly encoded elliptic curve parameters" and configure a custom "purpose" value in your software then this second feature (disallowing explicit curve params) was ineffective, or

- if you run a TLS 1.2 server with renegotiation enabled, then someone can craft a packet that will crash the server (e.g. Nginx disabled renegotiation in 2009 and thus seems unaffected).


> this second feature (disallowing explicit curve params) was ineffective

Not quite. It would override the previous check about being signed by a valid CA.

I imagine the implementation was something like this:

    is_valid = true;
    is_valid &= signed_by_valid_ca(cert);
    if (check_x509_strict) {
        is_valid = !has_explicitly_encoded_curve_params(cert);
    }
    is_valid &= some_other_test(cert);
    is_valid &= yet_another_test(cert);
And it should have been:

        is_valid &= !has_explicitly_encoded_curve_params(cert);
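A self-contained sketch of that pattern in Rust (all check names hypothetical, not OpenSSL's), showing how the `=` silently discards the earlier CA-check result while `&=` accumulates it:

```rust
// Hypothetical certificate with just the two properties in question.
struct Cert {
    ca_signed: bool,
    explicit_curve_params: bool,
}

// Buggy: under the strict flag, `=` overwrites the accumulated result,
// so a cert that already failed the CA check can still come out "valid".
fn validate_buggy(cert: &Cert, strict: bool) -> bool {
    let mut is_valid = true;
    is_valid &= cert.ca_signed;
    if strict {
        is_valid = !cert.explicit_curve_params; // BUG: `=` instead of `&=`
    }
    is_valid
}

// Fixed: every check is AND-ed into the running result.
fn validate_fixed(cert: &Cert, strict: bool) -> bool {
    let mut is_valid = true;
    is_valid &= cert.ca_signed;
    if strict {
        is_valid &= !cert.explicit_curve_params;
    }
    is_valid
}

fn main() {
    // A cert that fails the CA check:
    let bad = Cert { ca_signed: false, explicit_curve_params: false };
    println!("buggy: {}", validate_buggy(&bad, true)); // true  (accepted!)
    println!("fixed: {}", validate_fixed(&bad, true)); // false (rejected)
}
```

So turning on the extra strict check ends up masking an earlier failure, which matches the advisory's description.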


Can someone translate this for us dummies. Am I at risk of the DoS attack if I have TLSv1.2 enabled in Nginx?


Afaik Nginx doesn't do TLS renegotiation so I think you're safe.

Disclaimer: I know nothing.


This was my understanding, too. I checked:

http://nginx.org/en/CHANGES

>Changes with nginx 1.13.0

>Change: SSL renegotiation is now allowed on backend connections.


https://stackoverflow.com/a/20001598/843116

Looks like it since 0.7.64 or 0.8.23.


The backend has easier ways to DoS though. Like rejecting connections.


Does anyone know about apache?


It depends on the version of apache and openssl. Check the version of openssl that your apache binary is dependent on. All versions of 1.1.1 before 1.1.1k are vulnerable.

  ubuntu:~$ dpkg -s apache2-bin | grep ^Depends | sed -e 's/, /\n/g' | grep libssl | awk '{print $1}' | xargs dpkg -s | grep ^Version
  Version: 1.1.1j-1+ubuntu18.04.1+deb.sury.org+3
First try to just upgrade openssl on your system. Check the package's changelog (ex: http://changelogs.ubuntu.com/changelogs/pool/main/o/openssl/...) to see if a fix has been backported into it; the version number may not indicate it. If you can't tell, try to install an older 1.0.x version. Then restart apache. (The magic of dynamic libraries... it'll be fun when Go's ssl library has a bug)

If that doesn't work, try configuring SSLOptions -OptRenegotiate and then point ssllabs at it to see if reneg is disabled.

If that doesn't work, recompile apache against a not-vulnerable version of openssl. Maybe the easiest way to do that is take the Dockerfile (https://github.com/docker-library/httpd/blob/master/2.4/Dock...), take out libssl-dev, compile a specific openssl version, then link against it.



> This issue was reported to OpenSSL on 18th March 2021 by Benjamin Kaduk from Akamai and was discovered by Xiang Ding and others at Akamai. The fix was developed by Tomáš Mráz.

Thank you Akamai for reporting this, and thank you Tomáš for fixing this.



That's the first fix; here's the null pointer dereference: https://github.com/openssl/openssl/commit/02b1636fe3db274497...


Not a Rust troll (never used it) but would this be one of those things Rust would prevent?


Ehh, sort of. This is a NULL pointer read; it's a crasher that can't be further weaponized. Approximately the same thing happens in Python, Javascript, Java, and Go when you mis-handle a nil value.

Rust goes through some trouble to avoid nil values altogether, and it's great. But in practice, applying matches and `if let`s to every return value everywhere makes for very noisy code (I mean, Rust is already very noisy, but bear with me), so the idiom in the language is to call "unwrap" on Option and Result values that are "known" to be safe. A mishandled "unwrap" will do approximately the same thing to your program as dereferencing a NULL pointer will.

Rust helps here a lot, more than most other languages. It does not foreclose on this kind of bug, though, the way it (and other memory safe languages) foreclose on other memory corruption vulnerabilities.

(This is notably not the case for NULL pointer _writes_, which occasionally can be weaponized. Use Rust in preference to C.)
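A minimal Rust sketch of the parallel (hypothetical lookup function): calling `unwrap` on a `None` panics much like a NULL read crashes, while handling both cases explicitly avoids it:

```rust
// Hypothetical lookup that may legitimately find nothing.
fn find_user(id: u32) -> Option<&'static str> {
    match id {
        1 => Some("alice"),
        _ => None,
    }
}

fn main() {
    // Safe: handle both cases explicitly.
    match find_user(42) {
        Some(name) => println!("found {}", name),
        None => println!("no such user"),
    }

    // Or provide a fallback instead of asserting.
    let name = find_user(42).unwrap_or("anonymous");
    println!("{}", name);

    // This would panic ("called `Option::unwrap()` on a `None` value"),
    // the moral equivalent of the NULL-deref crash:
    // let _ = find_user(42).unwrap();
}
```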


Dereferencing null is UB; there is no guarantee that it will compile to something that segfaults or that it won't be exploitable. This is unlike None.unwrap(), which is a guaranteed panic.

Exploitable null dereference, for anyone who needs a reminder: https://lwn.net/Articles/342330/


The example you've provided isn't simply a NULL pointer dereference. The attacker had control over memory mapping! NULL pointers can (uncommonly) be exploitable --- especially in the kernel, where the 0 address can be mapped --- but you can't generally exploit them simply by attempting to read from them. The most common general pattern I'm aware of is a write, through a NULL pointer, that includes an unbounded offset.

This isn't that.


`?` would be the natural thing to do in Rust, rather than unwrap.
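For instance (hypothetical helper, just to illustrate the operator), `?` early-returns the `None` to the caller instead of panicking the way `.unwrap()` would:

```rust
// Hypothetical config helper: `?` propagates the missing value to the
// caller as None; a malformed value also becomes None via `.ok()`.
fn parse_port(raw: Option<&str>) -> Option<u16> {
    let s = raw?;            // early-return None if the value is missing
    s.parse::<u16>().ok()    // None if it isn't a valid port number
}

fn main() {
    println!("{:?}", parse_port(Some("8080"))); // Some(8080)
    println!("{:?}", parse_port(None));         // None
    println!("{:?}", parse_port(Some("oops"))); // None
}
```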


I know that's the case, but I looked at a bunch of Rust codebases that I had lying around (exa, Firecracker, netlink, Servo), and I see a ton of `unwrap`s, too; not just in test code.


Yeah, I made this mistake in trust-dns. Went back and thought better on it, and replaced all unwraps with proper checks. After that no more random crashes, but you're correct that it's easy even in Rust to say, "I'm sure this is never None in this case", and be very wrong.

The Rust linter (clippy) does perform a bunch of checks on unwrap usage now. For example, a common mistake before TryFrom was added to the stdlib was to implement From between two types and do an unwrap on conversion (think String -> enum variant), and clippy will now suggest TryFrom as a replacement since the unwrap is hiding a fallible case.
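That suggestion amounts to something like this sketch (hypothetical `Color` enum; illustrative, not taken from clippy's docs): the `From` version would have to hide the fallible case behind an internal unwrap, while `TryFrom` surfaces it in the signature.

```rust
use std::convert::TryFrom;

#[derive(Debug, PartialEq)]
enum Color {
    Red,
    Green,
}

// The honest conversion: the fallible case is in the signature,
// so callers must decide what to do with an unknown string.
impl TryFrom<&str> for Color {
    type Error = String;

    fn try_from(s: &str) -> Result<Self, Self::Error> {
        match s {
            "red" => Ok(Color::Red),
            "green" => Ok(Color::Green),
            other => Err(format!("unknown color: {}", other)),
        }
    }
}

fn main() {
    println!("{:?}", Color::try_from("red"));   // Ok(Red)
    println!("{:?}", Color::try_from("mauve")); // Err(..)
    // A `From<&str> for Color` impl would have no Error type,
    // and would have to panic here instead.
}
```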

The fact that folks use unwrap so much in tests and examples can also mislead people new to the language that this is a common practice.


I thought about that --- particularly for Firecracker, which has so much test code --- and no, it's in the actual code too.

I think Rust does more than any other mainstream language to mitigate this problem. I'm just saying, it still exists in Rust; it's just called a panic on unwrap/expect, instead of "null pointer exception".


Using unwraps is often the “proper check”. Trying to come up with error handling in cases where invariants are not met is usually done wrong.


That's not a check at that point, though, it's an assertion that the None case is invalid.

I'm not saying you should never use unwrap, but I've been burned by it when treating it a little too nonchalantly. If you're implementing a lower level library, like a dns stub resolver, panicking will bring down the software using your library... which is definitely not something people are generally happy about.

Error handling isn't always the right thing, sometimes just returning Option is the correct thing.


Yeah, it may be valid, subject to careful consideration of the exact situation. My canonical example is a web server.

In the context of handling a web request, never unwrap, because a web server should never die no matter what kind of weird or crazy thing a client somewhere sends it. Check every error for everything in that context, always at the very least drop back to a code path that returns a 500 and keeps the server running.

In the context of server init though, unwrap is good. If anything in the server init process goes wrong, that means I as the developer / deployment engineer did something wrong. The server should then blow up to let me know that something is very wrong and needs fixing. The server definitely should not try to rattle to life anyways and run in some sort of degraded state where it can't handle requests correctly.
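A sketch of that split (server pieces hypothetical): `expect` at init, where dying loudly with a clear message is correct, and a total `Result`-handling path per request, where it isn't.

```rust
// Hypothetical config loader: at startup, a missing or bad value is an
// operator error, so panicking with a descriptive message is the right call.
fn load_config(raw: Option<&str>) -> u16 {
    raw.expect("config missing: no listen port set")
        .parse::<u16>()
        .expect("config invalid: listen port is not a number")
}

// Hypothetical request handler: never panic on client input;
// every failure path drops back to an error response and the
// server keeps running.
fn handle_request(body: &str) -> (u16, String) {
    match body.parse::<u16>() {
        Ok(n) => (200, format!("squared: {}", u32::from(n) * u32::from(n))),
        Err(_) => (500, "internal error".to_string()),
    }
}

fn main() {
    let port = load_config(Some("8080")); // would panic at boot if None
    println!("listening on {}", port);
    println!("{:?}", handle_request("12"));      // (200, "squared: 144")
    println!("{:?}", handle_request("garbage")); // (500, "internal error")
}
```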


Usually `.expect` is better, as you can explain the error.


It doesn't appear they pre-notified the major distributions - at least I'm not seeing updates already available as normally happens when it is coordinated.

Ubuntu 20.04 just dropped openssl 1.1.1f-1ubuntu2.3


https://launchpad.net/ubuntu/+source/openssl/1.1.1f-1ubuntu2...

It addresses CVE-2021-3449, so they backported the fix to 1.1.1f.



If the cloud companies just paid a team of five people 200K each to spend a year rewriting OpenSSL from scratch, they would save multiple millions in scrambling to deploy bug fixes.


Nope, they'd just create new software with different bugs that have not been discovered yet. Then we'd all be scrambling to fix those bugs.


Not if the implementation is formally verified, like miTLS [0] and EverCrypt [1]. Parts of the latter were integrated into Firefox, which even provided a performance boost (10x in one case) [2].

I think what is needed is something like EverCrypt but for TLS. Or in other words, something like miTLS but which extracts to C and/or Assembly, to avoid garbage collection and for easy interoperation with different programming languages (preferably including an OpenSSL-compatible API for backwards compatibility).

[0]: https://mitls.org/

[1]: https://hacl-star.github.io/HaclValeEverCrypt.html

[2]: https://blog.mozilla.org/security/2020/07/06/performance-imp...


Formally verified against what? And what assumptions were made?

miTLS is formally verified against the TLS spec's handshaking, assuming the lower-level crypto routines are good. It is not even free from timing attacks.

EverCrypt has stronger proofs, but it is only safe as in not crashing, and correct as in matching the spec on all valid input. It is not proved to be free from DoS or invalid-input attacks.

OpenSSL does more than TLS. Lots of the interesting things are in the cert format parsing and management.


> Formally verified against what? And what assumptions were made?

Well, tell me again, what is OpenSSL formally verified against? What assumptions were made in OpenSSL?

Formal verification doesn't eliminate the very largest classes of bugs unless absolutely everything, including timing behaviour, is formally verified. But it consistently produces more reliable software even when only certain basic properties are proved (such as memory safety, or the absence of integer overflows and division by zero).

Formal verification can be a continuum, like testing, but proven to work for all inputs (that possibly meet certain conditions, under certain assumptions). The assumptions and properties that are proven can always be strengthened later, as seen in multiple real-world projects (such as seL4 and others).

The result is code that is almost always a lot more bug-free than code that is not formally verified. And as I said, more properties can be proven over time, especially as new classes of attacks are discovered (e.g. timing attacks, speculation attacks in CPUs, etc).


It's also not like you'd just give up on fuzzers because you have formally verified code. All the same tools should still be available, plus one more.


To formally verify an implementation you need a... formal description of what is to be implemented and verified.

Constructing a formal specification from RFC8446 is possible.

Constructing a formal specification for PKIX... is not. PKIX is specified by a large number of RFCs and ITU-T/ISO specs, some of which are more formal than others. E.g., constructing a formal specification for ASN.1 should be possible (though a lot of work), while constructing a formal specification for certificate validation is really hard, especially if you must support complex PKIs like DoD's. Checking CRLs and OCSP, among other things, requires support for HTTP, and even LDAP, so now you have... a bunch more RFCs to construct formal descriptions of.

And there had better be no bugs in the formal descriptions you construct from all these specs! Recall, they're mostly not formal specs at all -- they're written in English, with smatterings of ASN.1 (which is formal, even though the specs for it, very good as they are, mostly aren't).

The CVE in question is in the PKIX part of OpenSSL, not the TLS implementation.

What you're asking for is not 5 man-years worth of work, but tens of man-decades. The number of people with deep knowledge of all this stuff is minute as it is -- maybe just a handful, tens at most. The number of people with deep knowledge of a lot of this stuff is larger, but still minute. So you're asking to spend decades' worth of a tiny band of people's time on this project, when there are other valuable things for them to do.

The number of people who can do dev work in this space is much larger, of course -- in the thousands. But very few of them have the right expertise to work on a formal, verified implementation of PKIX.

Plus, it's all a moving target.

Sure, we could... train a lot of people just for such a project, but it takes time to do that, and it still takes time from that tiny band of people who know this stuff really well.

I'm afraid you're asking for unobtanium.

EDIT: Plus, there's probably tens of millions of current dollars' (if not more) worth of development embodied in OpenSSL as it stands. It would cost at least that much to replace it with a verified implementation, and probably much more, because the value of programmers expert enough to do it is much more than the $200K/year suggested above (even if you train new ones, it would take years of training, and then they would be just as valuable). I think a proper, formally verified replacement of OpenSSL would probably run into the hundreds of millions, especially if it's one huge project, since those tend to fail.


Well, sure, if you start with those premises, then I'm not surprised that you reach the conclusion that the goal is unachievable.

First of all, if constructing a formal specification for PKIX is not possible, then that should be telling you that it either needs to be simplified, better specified or scrapped altogether for something better (the latter would require an extremely large transition period, I'm imagining, so the first two are much preferred in this situation).

Otherwise, how can you be sure that any implementation in fact implements it correctly?

> And there had better be no bugs in the formal descriptions you construct from all these specs!

Well, I don't think that is true. You should in fact allow bugs in the formal description, otherwise how will you ever get anything done in such a project?

You see, having a formal description with a bug is much better than having no formal description at all.

If you have no formal description, you can't really tell if your code has bugs. If you have a buggy formal description, then you are able to catch some bugs in the implementation and the implementation can also catch some bugs in the formal description.

Also, some parts of the formal description can catch bugs in other parts of the formal description.

So the end result can be strictly better than the status quo.

> Plus, it's all a moving target.

Sure, but hopefully it's a moving target moving in the direction of simplification rather than getting more complex, otherwise things will just get worse rather than get better, regardless if we keep with the status quo or not. I'm not a TLS expert by any means but I think TLS 1.3 moved in that direction for some parts of the protocol, at least (if I'm not mistaken).

Also, I think you are not fully appreciating that formal verification can be done incrementally.

You can start by doing the minimum possible, i.e. simply verifying that your code is free of runtime errors, which would eliminate all memory safety-related bugs, including Heartbleed.

This would already be better than reimplementing in Rust, because the latter can protect from memory safety bugs but not other runtime bugs (such as division by zero, unexpected panics, etc).

BTW, this minimal verification effort would already eliminate the second bug in this security advisory.

You can then verify other simple properties, even function by function, no complicated models are even necessary at first.

For example, you could verify that your function that verifies a CA certificate, when passed the STRICT flag, is really more strict than when not passed the STRICT flag.

This would eliminate the first bug in this security advisory and all other similar bugs in the same function call-chain.

BTW, what I just said is really easy to specify, and I'm guessing it's also very easy to prove, since I'm guessing that the strict checks are just additional checks, while the normal checks are shared between the two verification modes.

Many other such properties are also easy to prove. The more difficult properties/models, or even full functional verification, can be implemented by more expert developers or even mathematicians.

I think the problem is also that OpenSSL devs, like the vast majority of devs, probably have no desire/intention/motivation/ability to do formal verification, otherwise you could even do this in the OpenSSL code base itself (although that is not ideal because verifying a program written in C requires more manual work than verifying it in a simpler language whose code extracts to C).

I'm also guessing that your budget estimate is an exaggeration, since miTLS and EverCrypt, although admittedly projects whose scope has still not reached your ambitious goals (i.e. full functional verification of all layers in the stack), were probably done with a much smaller budget.

And it's not like you can't build on top of that and incrementally verify more properties over time, e.g. more layers of the stack or whatever.

You don't need a huge mega-project, just a starting point, and sufficient motivation.


> Well, sure, if you start with those premises, then I'm not surprised that you reach the conclusion that the goal is unachievable.

I didn't say unachievable. I said costly.

> First of all, if constructing a formal specification for PKIX is not possible, then that should be telling you that it either needs to be simplified, better specified or scrapped altogether for something better (the latter would require an extremely large transition period, I'm imagining, so the first two are much preferred in this situation).

https://xkcd.com/927/

The specs for the new thing would basically have to be written in Coq or similar. Even if you try real hard to keep it small and not make the... many mistakes made in PKIX's history... it would still be huge. And it would be even less accessible than PKIX already is.

> For example, you could verify that your function that verifies a CA certificate, when passed the STRICT flag, is really more strict than when not passed the STRICT flag.

That's just an argument for better testing. That's not implementation verification.

> This would eliminate the first bug in this security advisory and all other similar bugs in the same function call-chain.

Only if you thought to write that test to begin with. Writing a test for everything is... not really possible. SQLite3, one of the most tested codebases in the world, has a private testsuite that gets 100% branch coverage, and even that is not the same as testing every possible combination of branches.

> BTW, what I just said is really easy to specify, and I'm guessing it's also very easy to prove, since I'm guessing that the strict checks are just additional checks, while the normal checks are shared between the two verification modes.

It's not. The reason the strictness flag was added was that OpenSSL was historically less strict than the spec demanded. It turns out that when you're dealing with more than 30 years of history, you get kinks in the works. It wouldn't be different for whatever thing replaces PKIX.

> I think the problem is also that OpenSSL devs, like the vast majority of devs, probably have no desire/intention/motivation/ability to do formal verification, otherwise you could even do this in the OpenSSL code base itself [...]

You must be very popular at parties.

> I'm also guessing that your budget estimate is an exaggeration since miTLS and EverCrypt, [...]

Looking at miTLS, it only claims to be an implementation of TLS, not PKIX. Not surprising. EverCrypt is a cryptography library, not a PKIX library.


> That's just an argument for better testing. That's not implementation verification.

No, I'm not talking about testing, I'm talking about really basic formal verification:

    forall (x: Certificate),
        verify_certificate(x, flags = 0) == invalid
            ==> verify_certificate(x, flags = STRICT) == invalid

This is trivial to specify and almost as trivial to prove to be correct for all inputs. Testing can't do that, no matter how good your testing.
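Even before reaching for a prover, the property is mechanically checkable. A Rust sketch (all names and checks hypothetical, a toy model rather than OpenSSL) of the statement as an exhaustive check over a tiny certificate space:

```rust
// Toy certificate model; names hypothetical. The property: anything
// rejected in default mode must also be rejected in strict mode
// (i.e. strict only ever adds checks).
#[derive(Clone, Copy)]
struct Cert {
    ca_signed: bool,
    explicit_curve_params: bool,
}

fn verify_certificate(cert: Cert, strict: bool) -> bool {
    let mut ok = cert.ca_signed;           // shared baseline check
    if strict {
        ok &= !cert.explicit_curve_params; // strict-only extra check
    }
    ok
}

// forall certs: !verify(c, default) ==> !verify(c, strict)
fn strictness_is_monotone() -> bool {
    let bools = [false, true];
    for &ca in &bools {
        for &curve in &bools {
            let c = Cert { ca_signed: ca, explicit_curve_params: curve };
            if !verify_certificate(c, false) && verify_certificate(c, true) {
                return false; // strict accepted something default rejected
            }
        }
    }
    true
}

fn main() {
    println!("property holds: {}", strictness_is_monotone()); // true
}
```

In a real proof the quantifier ranges over all certificates rather than a toy enumeration, but the statement being checked is the same one.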

> Only if you thought to write that test to begin with. Writing a test for everything is... not really possible.

I agree, but I'm not talking about testing. I'm talking about formal verification.

> It's not. The reason the strictness flag was added was that OpenSSL was historically less strict than the spec demanded. It turns out that when you're dealing with more than 30 years of history, you get kinks in the works. It wouldn't be different for whatever thing replaces PKIX.

That's totally fine and wouldn't affect the verification of that function at all. This is a very simple property to verify and it would avoid this bug and all similar bugs in that function (even if that function calls other functions, no matter how large or complex they are).

Other functions also have such easy to verify properties, this is not hard at all to come up with (although sure, some more difficult properties might be harder to verify).

> You must be very popular at parties.

I didn't want to be dismissive of OpenSSL devs or other devs in general, I just find it frustrating that there are so many myths surrounding this topic, and a lot less education, interest and investment than I think there should be, nowadays.


You are confusing verification as in "certificate" with verification as in "theorem proving", and you are still assuming a formal description of what to verify (which I'll remind you: doesn't exist). And then you go on to talk about myths and uneducated and uninterested devs. Your approach is like tilting at windmills, and will achieve exactly as much.


If I understood you correctly, then I am not confusing those two things.

Maybe you haven't noticed but I actually wrote a theorem about a hypothetical verify_certificate() function in my previous comment. Maybe you also haven't noticed, but I didn't need a formal description of how certificate validation needs to be done to write that theorem.

And I assure you, if the implementation of verify_certificate() is anything except absolute garbage, it would be very easy to prove the theorem to be correct. I actually have some experience doing this, you know? I'm not just parroting something I read, I've proved code to be 100% correct multiple times using formal verification (i.e. theorem proving) tools. That is, according to certain basic and reasonable assumptions, of course, like e.g. the hardware is not faulty or buggy while running the code, and that the compiler itself is not buggy, which would be a lot more rare than a bug in my code -- and even then, note that compilers and CPUs can be (and have been) formally verified.

Maybe you don't agree with my approach but I think it's the most realistic one for a project such as this and I am absolutely confident it would be practical, would completely eliminate most (i.e. more than 50%, at least) existing bugs with minimal verification effort (which almost anyone would be capable of doing with minimal training), and would steadily become more and more bug-free with additional, incremental verification (i.e. theorem proving) and refactoring effort.


https://project-everest.github.io/ :

> Focusing on the HTTPS ecosystem, including components such as the TLS protocol and its underlying cryptographic algorithms, Project Everest began in 2016 aiming to build and deploy formally verified implementations of several of these components in the F* proof assistant.

> […] Code from HACL*, ValeCrypt and EverCrypt is deployed in several production systems, including Mozilla Firefox, Azure Confidential Consortium Framework, the Wireguard VPN, the upcoming Zinc crypto library for the Linux kernel, the MirageOS unikernel, the ElectionGuard electronic voting SDK, and in the Tezos and Concordium blockchains.

S2n is Amazon's formally verified TLS library. https://en.wikipedia.org/wiki/S2n

IDK about a formally proven PKIX. https://www.google.com/search?q=formally+verified+pkix lists a few things.

A formally verified stack for Certificate Transparency would be a good way to secure key distribution (and revocation); where we currently depend upon a TLS library (typically OpenSSL), GPG + HKP (HTTP Key Protocol).

Fuzzing on actual hardware - with stochastic things that persist bits between points in spacetime - is a different thing.


Funny, the first hit for that search you linked is... my comment above. The "few things" other than that are for alternatives to PKIX, which is fine and good, but PKIX will be with us for a long time yet. As for Everest, it jibes with what I wrote above, that verified implementations of TLS are feasible (Everest also implements QUIC and similar), but -surprise!- not listed is PKIX.

I know, it sounds crazy, really crazy, but PKIX is much bigger than TLS. It's big. It's just big.

The crypto, you can verify. The session and presentation layers, you can verify. Heck, maybe you can verify your app. PKIX implementations of course can be verified in principle, but in fact it would require a serious amount of resources -- it would be really expensive. I hope someone does it, to be sure.

I suppose the first step would be to come up with a small profile of PKIX that's just enough for the WebPKI. Though don't be fooled, that's not really enough because people do use "mTLS" and they do use PKINIT, and they do use IPsec (mostly just for remote access) with user certificates, and DoD has special needs and they're not the only ones. But a small profile would be a start -- a formal specification for that is within the realm of the achievable in reasonably short order, though still, it's not small.


Both a gap and an opportunity; someone like an agency or a FAANG with a budget for something like this might do well to invest in the formal-methods talent pipeline, and to interface closely with e.g. Everest about PKIX as a core component in need of formal methods.

"The SSL landscape: a thorough analysis of the X.509 PKI using active and passive measurements" (2011) ... "Analysis of the HTTPS certificate ecosystem" (2013) https://scholar.google.com/scholar?oi=bibs&hl=en&cites=16545...

TIL about "Frankencerts": Using Frankencerts for Automated Adversarial Testing of Certificate Validation in SSL/TLS Implementations (2014) https://scholar.google.com/scholar?cites=3525044230307445257... :

> Our first ingredient is "frankencerts," synthetic certificates that are randomly mutated from parts of real certificates and thus include unusual combinations of extensions and constraints. Our second ingredient is differential testing: if one SSL/TLS implementation accepts a certificate while another rejects the same certificate, we use the discrepancy as an oracle for finding flaws in individual implementations.

> Differential testing with frankencerts uncovered 208 discrepancies between popular SSL/TLS implementations such as OpenSSL, NSS, CyaSSL, GnuTLS, PolarSSL, MatrixSSL, etc.

W3C ld-signatures / Linked Data Proofs, and MerkleProof2017: https://w3c-ccg.github.io/lds-merkleproof2017/

"Linked Data Cryptographic Suite Registry" https://w3c-ccg.github.io/ld-cryptosuite-registry/

ld-proofs: https://w3c-ccg.github.io/ld-proofs/

W3C DID: Decentralized Identifiers don't solve for all of PKIX (x.509)?

"W3C DID x.509" https://www.google.com/search?q=w3c+did+x509


Thanks for the link about frankencerts!


Doing security is hard; it wouldn't be as easy as you describe to replace it! OpenSSL has some issues from time to time, but so does any code that exists. However, it also benefits from billions of real-world use cases and deployments, and from research and analysis that would be very hard to reproduce on an entirely new code base.


There have been attempts at repairing OpenSSL. Every time it's a big problem due to lack of adoption.

LibreSSL would solve a lot of the surface area problems OpenSSL has, but people cling to OpenSSL under the guise that "a lot has improved" and of course that change is hard.

The point I'm trying to make is that there _are_ better SSL libraries, but none have the adoption of openssl, and adoption has inertia; nobody is going to go out of their way to avoid OpenSSL because a bug in openssl is not their fault.

Just like AWS's availability, people work on the premise that if AWS is down (or OpenSSL has bug!) then everyone is affected and therefore they can't be blamed.


I've been making personal moves towards libressl and openbsd in general. there's just way less friction involved in maintaining those servers vs my linux ones running stuff like this that constantly leaves me hitting up my clients like 'yeah we need to do this at midnight'

If libressl is affected by this too I would be incredibly surprised. They really did do a great job of removing all the bad stuff for something that just works as a drop-in replacement compatible with scripts and everything.


> a drop-in replacement

It is not. Many cert-attribute-related functions are missing.


Damn, that sucks. It works for me on literally all my clients' sites otherwise; I feel happy as a clam... If it doesn't work for you, I'm really curious what's missing though; could you point out specifics?


A variant of

> No one has ever been fired for choosing IBM/Microsoft/Oracle/etc.


Serious question: who are the decision makers on "using OpenSSL" vs other stuff? It feels like most application level people will be using whatever gets packaged with whatever framework/HTTP server they use, so shouldn't there be relatively few people to convince here? Or is that a big misread of the situation


It's an integral part of operating systems, often pre-installed these days, so I think you could probably find distros that have tried and failed to replace this component they have to ship. I think a few distros tried LibreSSL and ended up going back for reasons I'm not entirely sure of, but I suspect enterprise interest in not breaking compat (and not causing extra work) has something to do with it.


For most people that's kinda like asking who's the decision maker on what brand of bearings gets used in your car's engine.

It’s just not even on most people’s radar, it’s just something that comes along with the web hosting/Linux disto/AMI/SaaS they buy.

Most people don’t even think about whether Ford or Toyota make the bearings in their engines (they don’t); very, very few people tear down their engines and replace bearings, and most of them will just order new ones from the car’s manufacturer. A tiny percentage of people tearing down engines will actively decide whether Ford/Toyota’s bearings are right for their project, and maybe if they’re drag racers or rally car builders they’ll choose different bearings. There’s an even smaller number of people who build engines from scratch, who’ll decide for themselves the right bearings for them. And there are maybe a few dozen people on the planet, working at major car manufacturers, who specify 99.999+% of all the engine bearings used in current passenger cars.

FAANG are kinda like Formula 1 teams here: they work very closely with their engine manufacturers to ensure they have the best possible bearings/SSL implementations for their specific use cases. Red Hat/Canonical/Microsoft/et al. are the car manufacturers, who choose the bearings/SSL libraries most generally suitable for the expected use and lifespans of their products, without optimising for any one specific use case.

(And most HN commenters, like me, are the armchair quarterbacks second guessing and speculating endlessly about why a particular racer's engine expired during yesterday's race on internet forums every Monday morning... :-) )


> who are the decision makers on "using OpenSSL" vs other stuff?

Since you said you were being serious then I hope you don't find the answer flippant; The decision makers are usually the people who originally wrote the software.

Once software is written it is unusual to alter dependencies unless there is a very solid reason.

For a great example, check how many projects moved from MySQL to PostgreSQL (or vice-versa) despite fundamental issues in MySQL regarding safety and (previously) performance issues in PostgreSQL.


That is not what's happening with OpenSSL.


> There have been attempts of repairing openssl. Every time it's a big problem due to lack of adoption.

If the repair is a drop-in replacement then the issue shouldn't be too bad. If the new one does prove to be more stable in terms of needing security updates rolled out, then people will start to adopt it to save the time.

Of course, the problem with that is that the new version will need to be entirely feature-complete from day one, surprisingly bug-free for something of that size (or confidence will be too low), and it will have to track changes in OpenSSL for long enough to become the de facto standard instead.


OpenSSL's API has a lot of surface area.

As someone who has used it on a project, once you start trying to do "advanced" things like optimize round trips in conjunction with your application layer on top of it, you have to start poking around some obscure parts of that API, where the only documentation is "read the source code". Even some basic stuff like validating that the certificate corresponds to the domain you connected to (like, pretty important!) was historically not done by default and required interfacing with a bunch of low-level stuff. The things you used to have to do to use the Windows system certificate store were also pretty hideous. I believe these things are handled better in more recent versions, but of course much software was written in the past, including mine, and if you wanted it to work (and continue to work, with older versions), you had to do the hideous things.

So a drop-in replacement is really a tall order, and would also require repeating many of OpenSSL's mistakes to achieve true compatibility. Ironically OpenSSL 1.2 itself broke a lot of these APIs in ways that affected my project (and were not, in my opinion, always strictly better).

It's a mess.


OpenSSL's APIs are awful.


LibreSSL hasn't improved anything. Deleting all the code that was #ifdef'd out for old platforms might make you feel good, but it doesn't actually help security because none of the code was compiled anyway.


This does a great disservice to the work done by the OpenBSD guys.

For one thing they removed the home-grown memory allocator, which prevented a lot of issues and allowed debugging tools to notice memory corruption issues.


They also added a nice API, but they don't have the manpower for a substantial refactoring.


> allowed debugging tools to notice memory corruption issues

You mean like the bug debian introduced here[1]?

[1] https://www.debian.org/security/2008/dsa-1571


They did more than that. See e.g. this slide: If you use normal coding patterns, then normal linting tools can notice the bugs. The openssl code was hiding the truth from these tools, for no reason.

https://www.openbsd.org/papers/bsdcan14-libressl/mgp00014.ht...


Honest question: does LibreSSL have this same vulnerability?

If they do, fair comment. If not, then they obviously changed something, ipso facto.


Good question. I found this, released 17-03-2021, which seems to be the same bug. So yes, they had the same vulnerability.

https://ftp.openbsd.org/pub/OpenBSD/LibreSSL/libressl-3.2.5-...

Update: AAAACHCHC!! Dates! Always the dates. OK, Very well. For your viewing pleasure:

* Americans: Patch was released on 03-17-2021

* Europeans: Patch was released on 17-03-2021

* World inhabitants: Patch was released on 2021-03-17


This is the fix for the LibreSSL issue [1]: https://github.com/libressl-portable/openbsd/commit/5f00b800...

This is the fix for the OpenSSL issue: https://github.com/openssl/openssl/commit/02b1636fe3db274497...

They don't appear to be related to me. One is a UAF, the other is a NULL pointer dereference.

[1] The LibreSSL issue was found by HAProxy's continuous integration pipeline: https://github.com/haproxy/haproxy/issues/1115. Disclosure: I'm a community contributor to HAProxy; I help maintain the issue tracker and I took part in debugging the issue.


A complete rewrite of decades-old software that’s powering all of the world is many orders of magnitude more expensive than what you suggest. And the only thing you’re guaranteed is bazillions of incompatibilities, bugs, security issues and other horrible stuff.

Especially with such a sensitive thing as crypto. Just off the top of my head: you have to be really careful with the implementation to prevent timing attacks. And that’s just one of many hundreds of things you need to care about.


If you were going to rewrite OpenSSL, it would make sense to break it into three pieces: ciphers (crypto), protocol, and certificates.

OpenSSL ciphers are generally good, and there's no need to rewrite that.

Certificate and protocol are where most of the tricky bugs would be found.

Protocol is actually not that much code, and a from scratch rewrite that only did TLS 1.2 and 1.3, and only allowed for currently deemed reasonable options wouldn't be too bad to do. Of course, that would eliminate renegotiation, so today's OpenSSL bug wouldn't be possible. I've done a TLS 1.3 client from scratch and it's about a monthish of prototyping, followed by six months to do it right and test and deploy and find bugs and have it audited by a 3rd party and eventually be happy with it. I did the prototyping, another developer did the production code; we did have other developers familiar with TLS to review and assist. I imagine TLS 1.2 would be similar, but having experience from 1.3 would help it along. Server side wouldn't add much, once the session is established everything is the same, it's just having to flip parsing and generating of the handshake pieces.

Certificate/x.509 handling is a beast. I imagine rewriting and testing that would take a long time. You would probably find bugs in other verifiers while you're doing it if you cross test though. You could save a little by limiting to commercially available certificate options and limiting verification modes, but it's still a lot. ASN.1 parsing might be significantly nicer not in C though.


s2n is a good rewrite of just the TLS part.

I concur with everything you wrote.

Re: ASN.1, you really want a compiler for it, not hand-rolled codecs -- I don't care how good your library and macros for hand-rolling ASN.1 DER codecs might be, it's going to yield very difficult to maintain code full of bugs.

ASN.1 is not the biggest part. Certificate validation, revocation status checking, and everything to do with that, and all the many certificate extensions and subject alternative name forms, and... all of that requires a very large body of code, and there's no formal specifications for any of it.


But what if we rewrite it all in Rust bugs become impossible! /s


You jest, but at least in this case looking at both of the changes, these bugs wouldn't have happened in Rust.

Better options for standard data structures and initialization: https://github.com/openssl/openssl/commit/fb9fa6b51defd48157...

Error handling: https://github.com/openssl/openssl/commit/2a40b7bc7b94dd7de8...


I don't think there's a strong argument for the second one in Rust. It's not an error handling bug as much as a simple logic bug; it's aggregating the results of multiple checks.


Perhaps not a strong argument, but in idiomatic Rust (i.e. if you ran clippy on it and it passes checks) I'm fairly certain it would be less likely to happen.

That said, of course you can end up with a similar bug in Rust, there are just nice ways the language encourages practices that would avoid it.


I don't want to belabor this too much. Rust has a lot of exhaustive checking tools that other languages don't, so I don't want to claim that it would do nothing to reduce the likelihood of these kinds of mistakes. But if I go to my 3p directory and grep across all the Rust dependencies I've got, I can find plenty of code that has, like, `let mut checked = false` and that goes on to compute `checked`. That's what basically happened here. None of the individual checks were missed; they were just aggregated into a result improperly.
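To illustrate that pattern, here's a hypothetical C sketch of the bug class (not OpenSSL's actual code; the function names and the `int cert` handle are made up): each check writes into a shared result variable, and the strict-mode check assigns instead of AND-ing, clobbering the earlier CA-check result.

```c
#include <stdbool.h>

/* Stand-ins for the real checks; `cert` is a dummy handle.
   Pretend cert 1 is CA-signed and cert 3 carries explicit EC params. */
bool signed_by_valid_ca(int cert) { return cert == 1; }
bool has_explicit_ec_params(int cert) { return cert == 3; }

/* Buggy: the strict-mode check overwrites the CA-check result. */
bool verify_buggy(int cert, bool strict) {
    bool ok = signed_by_valid_ca(cert);
    if (strict)
        ok = !has_explicit_ec_params(cert);  /* clobbers `ok` */
    return ok;
}

/* Fixed: aggregate with AND so every check must pass. */
bool verify_fixed(int cert, bool strict) {
    bool ok = signed_by_valid_ca(cert);
    if (strict)
        ok = ok && !has_explicit_ec_params(cert);
    return ok;
}
```

With `cert = 2` (not CA-signed, no explicit EC params), the buggy version accepts in strict mode while the fixed one rejects, which is the shape of CVE-2021-3450: none of the checks is missing, they're just combined wrongly.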


Moving all ASN/X509 handling to Rust might be worth it though. That stuff is a minefield and has had plenty of memory-safety issues in the past.


Genuine question as I'm not a Rust professional, but if one was able to maintain the C API and headers to produce a drop in replacement for the current openssl implementation, would rewriting it in Rust produce a safer variant?

This would be based off using the #[no_mangle] extensively, I presume, probably with some amount of "unsafe" usage? At which point has the primary use case of Rust in this situation been lost?


What you're probably looking for is this project, which is a C API (I think ABI compatible) wrapper around Rustls that can be a "drop-in" replacement for OpenSSL: https://mesalink.io/

(note: not commenting on whether one should or should not use it, only mentioning its existence)


You're joking, but the audit of rustls last year by Cure53 concluded with "We were unable to uncover any application-breaking security flaw" and "the team of auditors considered the general code quality to be exceptional".

https://github.com/ctz/rustls/blob/master/audit/TLS-01-repor...


Seems like everyone has forgotten the rewrite of Apache that led to the bug flood that was 2.0


I understand the sentiment - but this may not be the case.

A lot of software was written a long time ago with as-of-yet-unestablished good practices.

Almost no concern for security or anything else.

I wonder: if they actually used really good best practices, both for security and software in general, used something like Rust, ran a very open dev process with a lot of eyes on it, and somehow avoided the feature-creep / political-feature-orientation trap... could it not only be rewritten, but made simpler?

Maybe it would take 5 years (1 design 1 develop 3 deploy and test) but it would still be worth it.

I believe that security is a much worse problem than it needs to be, because everything we depend on today was designed without much security in mind.


Sure, there’s tons of issues with legacy software like OpenSSL. But in almost all cases, the only way to deal with it is to slowly refactor and modernize it, not throw it away and start from scratch. Especially if you have lots of things depending on it.


Many of our systems are existentially flawed and can't be truly fixed.

Our current OSs for example, were designed in an era where security was no concern. Any app can do 'anything'.

If Linux, Mac and Windows were to have been designed from the start with good containerization, no direct memory pointers (unless special case), if we were using ATM-like networking with identity etc. ... the world would be an entirely different place.

It's obviously unlikely to happen, but if the powers that be actually wanted it to happen, it would.

I would hope that the US military/DARPA designs a system that is hardened in a way that could be made use of in the civilian sector as well. Though it's obviously very unlikely.



That doesn't replace all the PKIX bits from OpenSSL.


BoringSSL is now very widely used. There is also rustls (fully safe rust code) with OpenSSL API available.


Yeah. Thinking about it seriously, the problem is that if you do a “drop in replacement” you’ve just recreated all the existing bugs. If you don’t do a drop in replacement, adoption will be poor, see BoringSSL etc. It’s sort of no-win.


Considering that BoringSSL is the library in Chrome, Android, iOS, and Google's GFE, I imagine its adoption is not "poor". It is probably the most widespread TLS implementation, by traffic.


Okay, but today I upgraded my Ubuntu box which had the OpenSSL bug on it. Why does my Ubuntu box use OpenSSL? I haven't looked into it, but I imagine Python is using it and God knows what else. Can those things move to BoringSSL? Probably, long term, but they haven't yet because it's non-trivial. No easy answer here. It will probably just take a decade of slowly moving everything using OpenSSL onto something else, piece by piece.


Also cloudflare


Just pay a bunch of talented security researchers to find the bugs. Also donate 200k/year to the OpenSSL team so they have the capacity to fix the bugs whenever found.


If only the billionaires would listen to you... (Mind you, I fully agree with you here.)


My biggest gripe is the inability (or my lack of knowledge of how) to enable observability (metrics and events) on the failure cases. Maybe it is possible by recompiling the code, but this won’t help if OpenSSL comes packaged in something like Envoy.

I want to be able to see metrics of how often which rejection occurred. Preferably even security events for each failed tls connection including peer identity, and the failure reason.

(I’m silently rooting that someone will tell me I’m wrong and shows that it is possible to do the above...)


If you had something with a lot of traffic logging TLS issues I wonder if you would see noise from packets corrupted by hardware that were not caught by the TCP checksum. That or some low, interesting number of users with very badly misconfigured peers, where it's not your responsibility to fix.

I can't recall the exact details. But I remember once logging crypto failures on a high-usage app, and it seemed noisy for something like that. Similar to collecting logs for I/O failures in the field: if you have legit bugs they are often drowned out by issues caused by bad disks.


Ok. From our basic metrics on failed/successful TLS connections established, I can conclude we are only dealing with incorrectly configured clients and not with random glitches. Peers keep connecting properly after they have figured it out the first time.

I would expect the events to be rich enough in information to distinguish checksum-failure corruption from peers really sending incompatible parameters/certs.


Envoy comes with ssl support and it uses the google variant of openssl (boringssl).


Ah indeed, I was misled because the error messages are similar.

But it has the same downside of being hard to troubleshoot.


Changes between 1.1.1j and 1.1.1k [25 Mar 2021]

https://www.openssl.org/news/cl111.txt


> An OpenSSL TLS server may crash if sent a maliciously crafted renegotiation ClientHello message from a client. If a TLSv1.2 renegotiation ClientHello omits the signature_algorithms extension (where it was present in the initial ClientHello), but includes a signature_algorithms_cert extension then a NULL pointer dereference will result, leading to a crash and a denial of service attack.

> A server is only vulnerable if it has TLSv1.2 and renegotiation enabled (which is the default configuration). OpenSSL TLS clients are not impacted by this issue.

TLS Renegotiation strikes again!

One of the best parts of TLS 1.3 is that they completely scrapped it. So, IIUC, if you're running only TLS 1.3 then the NULL dereference in the (enabled-by-default) TLS 1.2 renegotiation feature will not impact you.

Also, a lot of end-user-facing software such as nginx has renegotiation explicitly disabled, so hopefully this won't be particularly far-reaching.
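For instance, here's a minimal nginx server-block sketch that sidesteps client-initiated renegotiation entirely by offering only TLS 1.3 (assumes nginx built against OpenSSL 1.1.1+; the hostname and certificate paths are placeholders, and dropping 1.2 may cut off older clients):

```nginx
# Sketch: serve TLS 1.3 only; renegotiation does not exist in TLS 1.3.
server {
    listen 443 ssl;
    server_name example.com;                      # placeholder

    ssl_certificate     /etc/ssl/example.com.pem; # placeholder
    ssl_certificate_key /etc/ssl/example.com.key; # placeholder
    ssl_protocols       TLSv1.3;
}
```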


"An OpenSSL TLS server may crash if sent a maliciously crafted renegotiation ClientHello message from a client. If a TLSv1.2 renegotiation ClientHello omits the signature_algorithms extension (where it was present in the initial ClientHello), but includes a signature_algorithms_cert extension then a NULL pointer dereference will result, leading to a crash and a denial of service attack."

That sounds like enough info that exploits will be out pretty soon. And it sounds like most software that uses OpenSSL would crash. So, guess I'm expecting broad DoS incidents soon.


> Also a lot of end user facing software such as nginx has renegotiation explicitly disabled so hopefully this won't be particularly far reaching.

Do you know if the same is true for HAProxy?


According to the link below: "All major software disabled renegotiation by default since as far as 2009 (nginx, haproxy, etc...)."

https://security.stackexchange.com/questions/24554/should-i-...


Wonder if LibreSSL is affected


LibreSSL is a fork of 1.0.1, so you can read this security advisory and get a good guess.


Probably not affected, since this bug was introduced after version 1.0.1, which is the version where the fork happened.

> Starting from OpenSSL version 1.1.1h a check to disallow certificates in the chain that have explicitly encoded elliptic curve parameters was added as an additional strict check.

> An error in the implementation of this check meant that the result of a previous check to confirm that certificates in the chain are valid CA certificates was overwritten. This effectively bypasses the check that non-CA certificates must not be able to issue other certificates.


Does anyone have a PoC? Someone posted this on Github but the git log is squashed and doesn't show the changes they made. https://github.com/terorie/cve-2021-3449


Both of these vulnerabilities are problematic, though it’s some comfort that only one of them exists with the default OpenSSL configuration.

DoS attacks are problematic, though I’m wondering what creative uses attackers will find for buggy certificate chain verification.


Is it possible to test with openssl s_client if TLS renegotiation is on or off?

I’m also not sure if it is about legacy or secure renegotiation (or both).


Send a line consisting of a single 'R' character.

https://www.openssl.org/docs/man1.1.1/man1/openssl-s_client....
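As a sketch of how that looks non-interactively (`example.com:443` is a placeholder and this needs network access):

```shell
# In s_client's interactive mode, a line starting with 'R' requests a
# renegotiation (TLSv1.2 and below only). A server with renegotiation
# disabled will refuse it or drop the connection; watch the output.
printf 'R\n' | openssl s_client -connect example.com:443 -tls1_2
```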


Does anyone know the best way to update a mac’s OpenSSL from 1.1.1j to 1.1.1k?

Homebrew seems to still be using 1.1.1j


Mac's OpenSSL should be LibreSSL 2.8.3 as of Big Sur 11.2.3; for brew openssl you'll have to wait till they update it.


Update: as of now, homebrew has updated to openssl 1.1.1k.


How can I check if X509_STRICT mode is enabled or not?


Any idea if this impacts OpenWRT as well?

I haven't seen anything on their security advisory feed yet.


The path to greater Security is bumpy.


TLDR - upgrade to 1.1.1k.


How was this not found till now? Doesn't OpenSSL have negative tests? Even if it doesn't, don't companies have negative tests? Does not even one company that relies on OpenSSL have such a test?


OpenSSL is extensively tested. These bugs are just hard to find.


I mean, branch coverage could have fingered this hole in the unit tests, couldn't it? The problem is X509_V_FLAG_X509_STRICT was added in 2004 without any corresponding tests. The project still has decades of debt from past poor practices to pay down.


I 100% buy that OpenSSL hasn't dug itself out of the hole it was in back in 2012. But building a new TLS library puts you in the same hole with respect to exhaustive unit test coverage, right? You're better off starting from a place where you already have a fairly extensive (if incomplete) unit test infrastructure than one where you're starting from scratch.

This is a weakly held opinion. I'm mostly motivated by the knee-jerk response OpenSSL gets on threads like these; I think those responses are based on a reputation that doesn't capture the level of work that has gone into the project in the years since 2012. That doesn't mean it's the apogee of what we can do with a TLS library.


I agree that it is in much better shape today and any competing project starting from scratch would be at a disadvantage with respect to the size of their test suite. However, I also like the BoringSSL approach of forking the project and deleting all the dumb features that nobody needs. They deleted the option flag involved in this advisory, years ago. By deleting it they avoided having to add tests for it.


Ultimately I agree with you about what the right approach is, but the route there is convoluted. BoringSSL is able to be simpler by defining pretty tight criteria for what it cares about and everything else either doesn't work or isn't guaranteed to exist at all.

I happen to think the safe way forward is to agree that OK then, we can't do all these other weird things, too bad. But this is an extremely unpopular position because lots of people have at least one weird thing they very much want to do.


Ideally at least the integration tests are interchangeable between implementations of the same thing.


I don't see how it could. If they had a test that X509_V_FLAG_X509_STRICT carried out the intended additional checks, and a test that non-CA certificates could not issue other certificates, and a test that valid certificate chains were accepted, wouldn't that cover all the branches involved without ever detecting the particular problematic combination involved in this bug?


I dunno, at least the new release slightly uh, changes?, the coverage.

  ~/openssl-1.1.1i % gcov crypto/x509/x509_vfy.c -bm
  File 'crypto/x509/x509_vfy.c'
  Lines executed:63.20% of 1625
  Branches executed:73.70% of 1194
  Taken at least once:53.85% of 1194
  Calls executed:58.47% of 496
  Creating 'x509_vfy.c.gcov'

  ~/openssl-1.1.1k % gcov crypto/x509/x509_vfy.c -bm
  File 'crypto/x509/x509_vfy.c'
  Lines executed:63.21% of 1628
  Branches executed:73.75% of 1196
  Taken at least once:53.93% of 1196
  Calls executed:58.47% of 496
  Creating 'x509_vfy.c.gcov'
There are still hundreds of never-taken branches in this file alone.

  ~/openssl-1.1.1k % grep -c never\ executed x509_vfy.c.gcov
  520


...the easy ones have been found and fixed. What remains are the hard ones.


Is it really extensively tested?

Should one even write these kinds of libraries in C?


In an ideal world, if you could get the same calibre of talent patrolling the Rust port of OpenSSL and everyone using it, of course you'd want the Rust version.

We live in a fallen world. For a lot of software, the Rust-Go/legacy-C decision is trivial; you'd always take the Rust or Go. But OpenSSL is a tricky case and you can see why in today's announcement: there's a crasher memory corruption problem (but you can crash Rust programs too! today's bug is like a bad "unwrap"). But the first bug is an X.509 logic problem and I don't believe any practical language forecloses on those.

Meanwhile: OpenSSL gets a lot of attention and has a lot of built-up institutional knowledge that other libraries don't. I probably trust it more than I trust any of the other C-language TLS libraries.


The problem seems deeper. From my overview of the hardware/software interface, deep down below the infrastructure we have built on top of it: modern CPUs, assembly, systems programming languages and compilers make it extra hard to do cryptography right. As I understand it, most of the stuff is written with an "in our experience it is constant time" attitude, but I don't think it is guaranteed. CPUs can internally reorder stuff (basically the whole Spectre/Meltdown family and other timing-related attacks use this) and we cannot be 100% sure about basically anything. Introduce compilers into the stack, and some new optimization might make a previously constant-time function variable-time under some circumstances.

Of course OpenSSL is a huge project and mixes many concerns, hard-to-use-right APIs and support for many platforms. This is basically the opposite of what cryptographers and security-minded engineers strive for. On the other hand, I am not a cryptographer myself, I am not a C/Rust/assembly or whatever master, and I am very far from really understanding modern CPUs, though I probably understand a lot more than the average programmer or sysadmin.

I just have the feeling with security, that it is a game of whack-a-mole rather than systematic engineering at times and that doesn't seem to be a sustainable approach.
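To make the timing-attack point above concrete, here is a minimal sketch (not taken from any real library) of the branch-free comparison idiom constant-time code relies on; real code should use a vetted primitive such as OpenSSL's CRYPTO_memcmp rather than rolling its own:

```c
#include <stddef.h>
#include <stdint.h>

/* Compare n bytes without an early exit, so the running time does not
   leak the position of the first mismatch. Returns 0 if equal, 1 if not. */
int ct_memcmp(const void *a, const void *b, size_t n) {
    const uint8_t *pa = a;
    const uint8_t *pb = b;
    uint8_t diff = 0;
    for (size_t i = 0; i < n; i++)
        diff |= pa[i] ^ pb[i];   /* accumulate differences, never branch */
    return diff != 0;
}
```

Even this is only best-effort: as the comment above notes, nothing in the C standard stops a future compiler optimization from reintroducing a data-dependent branch, which is why such primitives get audited per platform and per compiler.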


"But the first bug is an X.509 logic problem and I don't believe any practical language forecloses on those."

I'm likely wrong (this isn't my field), but isn't this particular logic bug avoided by exhaustive enums?

I understand this bug took the form of

(1) a var was assigned an enum type, encoding a logic state ("which X.509 error")

(2) the var gets passed around to different functions, which handle the enum values case-by-case

(3) an update adds a new logic state at (1) (explicit ECC params), but doesn't update the case handling at (2) to handle the new value. Plus, since the error types overwrite each other into the same field, the new error type can clobber an older error type so that they both go unhandled.

If you use strict, exhaustive enums, this type of bug is excluded (isn't it?) The type of the enum will propagate to every place it's read, and the compiler will flag the non-exhaustive case handling in the stale code.


I don't know about that. Look at the diff, it looks like a counting bug. Rust programs also use counting logic, including in places where you could in theory model with exhaustive enums, but nobody does.


In Rust, you'd just make a `ValidCert` type and then it can never be invalid.


Crashing is fine. Continuing with unknown behavior is bad. With Rust you might find some obscure bug which lets you crash the software, and then docker/k8s/systemd will restart it automatically. The worst you could do is spam the bug and DoS the server. But at least no user data is compromised.


There are some issues with cryptographic code in high-level languages, so I'm not sure what is better. Even C is considered too high-level sometimes.


Oh great, here we go again... I mean, which OS is available in something other than C?

Frankly, C has gotten us quite far in the systems engineering space. No matter what language, bugs will surface. Starting over on the hype-train isn’t always the best idea!



