Yeah, thanks for saying this; I agree. And as clichéd as it is to look for a technical solution to a social problem, I also think better tools could help a lot here.
The current situation is ridiculous - if I pull in a compression library from npm, cargo or Python, why can that package interact with my network, make syscalls (as me) and read and write files on my computer? Leftpad shouldn’t be able to install crypto ransomware on my computer.
To solve that, package managers should include capability-based security. I want to say “use this package from cargo, but refuse to compile or link into my binary any function which makes any syscall except for read and write. No open - if I want to compress or decompress a file, I’ll open the file myself and pass it in.” No messing with my filesystem. No network access. No raw asm, no trusted build scripts and no exec. What I allow is all you get.
The capability should be transitive. All dependencies of the package should be brought in under the same restriction.
In dynamic languages like (server side) JavaScript, I think this would have to be handled at runtime. We could add a capability parameter to all functions which issue syscalls (or do anything else that’s security sensitive). When the program starts, it gets an “everything” capability. That capability can be cloned and reduced down to just the capabilities needed. (Think OpenBSD’s pledge.) If I want to talk to redis using a 3rd party library, I pass the redis package a capability which only allows it to open network connections - and only to this specific host on this specific port.
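Here’s a rough sketch of what that could look like, in TypeScript. To be clear, the Capability API and connectToRedis are made up - nothing like this ships in any JS runtime today:

    // Hypothetical capability API - invented for illustration.
    type NetRule = { host: string; port: number };

    interface Capability {
      // Returns a new, strictly narrower capability. The runtime would
      // reject any attempt to broaden permissions.
      restrict(perms: { net?: NetRule[] }): Capability;
    }

    // Hypothetical capability-aware redis client.
    declare function connectToRedis(
      cap: Capability,
      addr: string
    ): Promise<{ set(key: string, value: string): Promise<void> }>;

    // The runtime hands main() an "everything" capability at startup.
    async function main(root: Capability) {
      // Narrow it to: outbound connections to one host on one port, nothing else.
      const redisCap = root.restrict({ net: [{ host: "127.0.0.1", port: 6379 }] });

      // The 3rd party library can only act through the capability we pass it.
      const redis = await connectToRedis(redisCap, "127.0.0.1:6379");
      await redis.set("greeting", "hello");
    }

The key property is that restrict() can only ever drop permissions, never add them.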
It wouldn’t stop all security problems. It might not even stop this one. But it would dramatically reduce the attack surface of badly behaving libraries.
The problem we have right now is that any linked code can do anything, both at build time and at runtime. A good capability system should be able to stop xz from issuing network requests even if other parts of the process do interact with the network. It certainly shouldn't have permission to replace crc32_resolve() and crc64_resolve() via ifunc.
Another way of thinking about the problem is that right now every line of code within a process runs with the same permissions. If we could restrict what 3rd party libraries can do - via checks either at build time or runtime - then supply chain attacks like this would be much harder to pull off.
I'm not convinced this is such a cure-all, as any library must necessarily have the ability to "taint" its output. Like, consider this library. It's a compression library. You would presumably trust it to decompress things, right? Like programs? And then you run those programs with full permissions? Oops.
It’s not a cure-all. I mean, we’re talking about infosec - so nothing is. But that said, barely any programs need the ability to execute arbitrary binaries. I can’t remember the last time I used eval() in JavaScript.
I agree that it wouldn’t stop this library from injecting backdoors into decompressed executables. But I still think it would be a big help anyway. It would have stopped this particular attack from working.
Looking at the big picture, we need to acknowledge that we can’t implicitly trust open source libraries on the internet. They are written by strangers, and if you wouldn’t invite them into your home, you shouldn’t give them permission to execute arbitrary code with user-level permissions on your computer.
I don’t think there are any one-size-fits-all answers here. And I can’t see a way to make your “tainted output” idea work. But even so, cutting down the trusted surface area from “Leftpad can cryptolocker your computer” to “Leftpad could return bad output” sounds like it would move us in the right direction.
Of course we need to trust people to some degree. There's an old Jewish saying - put your trust in God, but your money in the bank. I think it's like that. I'm all for trusting people - but I still like how my web browser sandboxes every website I visit. That is a good idea.
We (obviously) put too much trust in little libraries like xz. I don't see a world in which people start using fewer dependencies in their projects. So given that, I think anything which makes 3rd party dependencies safer than they are now is a good thing. Hence the proposal.
The downside is it adds more complexity. Is that complexity worth it? Hard to say. That's still worth talking about.
I guess the big open source community should put a little more trust in statistics, or integrate statistical evaluation into its decision making about which specific products to use in its supply chains.
This approach could work for dynamic libraries, but a lot of modern ecosystems (Go, Rust, Swift) prefer to distribute packages as source code that gets compiled into the consuming executable or library.
The goal is to restrict what included libraries can do. As you say, in languages like Rust, Go or Swift, the mechanism to do this would also need to work with statically linked code. And that's quite tricky, because there are no isolation boundaries between functions within an executable.
It should still be possible to build something like this. It would just be inconvenient. In Rust, Swift and Go you'd probably want to implement it at compile time.
In Rust, I'd start by banning unsafe in dependencies. (Or whitelisting which projects are allowed to use unsafe code.) Then add special annotations on all the methods in the standard library which need special permissions to run - for example File::open, fork, exec, networking, and so on. In Cargo.toml, add a way to specify which permissions your child libraries get: "Import serde, but give it no OS permissions". When you compile your program, the compiler can look at the call tree of each function to see what actually gets called, and make sure the permissions match up. If you call a function in serde which in turn calls File::open (directly or indirectly), and you didn't explicitly allow that, the program should fail to compile.
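To make that concrete, the Cargo.toml side could look something like this. This is invented syntax - cargo has nothing like it today:

    [dependencies]
    serde = "1.0"

    # Hypothetical - cargo doesn't support a section like this today.
    [capabilities]
    serde = []    # pure computation: no filesystem, no network, no exec
    # Anything not listed here gets no OS capabilities at all.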
It should be fine for serde to contain some utility function that calls the banned File::open, so long as the utility function isn't called.
Permissions should form a tree. As you get further out in the dependency tree, libraries get fewer permissions. If I pass permissions {X,Y} to serde, serde can pass permission {X} to one of its dependencies in turn. But serde can't pass permission {Q} to its dependency - since it doesn't have that capability itself.
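Whether the compiler enforces this or a runtime does, the invariant is the same. A tiny sketch (in TypeScript just for brevity - the idea is language agnostic):

    type Permission = "fs.read" | "fs.write" | "net" | "exec";

    class CapabilitySet {
      constructor(private readonly perms: ReadonlySet<Permission>) {}

      // A child capability must be a subset of its parent. There's no
      // operation that adds permissions, so grants only shrink down the tree.
      derive(requested: Permission[]): CapabilitySet {
        for (const p of requested) {
          if (!this.perms.has(p)) {
            throw new Error(`cannot grant "${p}": parent doesn't hold it`);
          }
        }
        return new CapabilitySet(new Set(requested));
      }
    }

    // I hold {fs.read, net} and grant serde {net}. serde can pass {net} or {}
    // along to its own dependencies - but it can never mint {exec}.
    const mine = new CapabilitySet(new Set<Permission>(["fs.read", "net"]));
    const forSerde = mine.derive(["net"]);
    const forSerdesDep = forSerde.derive([]); // fine
    // forSerde.derive(["exec"]);             // throws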
Any libraries which use unsafe are sort of trusted to do everything. You might need to insist that any package which calls unsafe code is explicitly whitelisted in the Cargo.toml file at the project root.
>It should still be possible to build something like this. It would just be inconvenient.
Inconvenient is quite the understatement. Designing and implementing something like this for each and every language compiler/runtime requires hugely more effort than doing it on the OS level. The likelihood of mistakes is also far greater.
Perhaps it's worth exploring whether it can be done on the LLVM level so that at least some languages can share an implementation.
A process can do little to defend itself from a library it's using which has full access to its same memory. There is no security boundary there. This kind of backdoor doesn't hinge on IFUNC's existence.
Honestly, I don't have a lot of hope that we can fix this problem for C on Linux. There's just so much historical cruft present, spread between autotools, configure, make, glibc, gcc and C itself, that would need to be modified to support capabilities.
The rule we need is "If I pull in library X with some capability set, then X can't do anything not explicitly allowed by the passed set of capabilities". The problem in C is that there is currently no straightforward way to firewall off different parts of a Linux process from each other. And dynamic linking on Linux is done by gluing together compiled artifacts - with no way to check or understand what assembly instructions any of those parts contain.
I see two ways to solve this generally:
- Statically - i.e. at compile time, the compiler annotates every method with the set of permissions it (recursively) requires. The program fails to compile if a method is called which requires permissions that the caller does not pass it. In Rust, for example, I could imagine cargo enforcing this for Rust programs. But I think it would require some changes to the C language itself if we want to add capabilities there. Maybe some compiler extensions would be enough - but probably not, given a C program could obfuscate which functions call which other functions.
- Dynamically. In this case, every Linux system call is replaced with a new version which takes a capability object as a parameter. When the program starts, it is given a capability by the OS, which it can then use to mint child capabilities to pass to different libraries (sketch below). I could imagine this working in Python or JavaScript. But for this to work in C, we need to stop libraries from just scanning the process's memory and stealing capabilities from elsewhere in the program.
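Here's roughly what those capability-checked wrappers could look like, sketched in TypeScript (all of these names are hypothetical):

    type Permission = "fs.read" | "fs.write" | "net" | "exec";

    interface Capability {
      has(p: Permission): boolean;
      derive(perms: Permission[]): Capability; // subset-only, never broader
    }

    // A capability-aware replacement for open(): the capability comes first,
    // and nothing happens until it has been checked.
    function open(cap: Capability, path: string, mode: "r" | "w"): void {
      const needed: Permission = mode === "r" ? "fs.read" : "fs.write";
      if (!cap.has(needed)) {
        throw new Error(`missing capability: ${needed}`);
      }
      // ... issue the real open() syscall on `path` here ...
    }

The hard part isn't the wrapper - it's making the capability object unforgeable, so a library can't just conjure one up or steal one from elsewhere in the process.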
Or take the Chrome / original Go approach: load that code in a different process and use some kind of RPC. With all the context-switch penalty... sigh. I think it's the only way, as MMU permissions work at page granularity.
Firefox also has its solution (RLBox) of compiling dependencies to wasm, then compiling the wasm back into C code and linking that. It’s super weird, but the effect is that each dependency ends up isolated in bounds-checked memory. No context-switch penalty, but instead the code runs significantly slower.