Yeah, thanks for saying this; I agree. And as clichéd as it is to look for a technical solution to a social problem, I also think better tools could help a lot here.
The current situation is ridiculous - if I pull in a compression library from npm, cargo or Python, why can that package interact with my network, make syscalls (as me) and read and write files on my computer? Leftpad shouldn’t be able to install crypto ransomware on my computer.
To solve that, package managers should include capability-based security. I want to say “use this package from cargo, but refuse to compile or link into my binary any function which makes any syscall except for read and write. No open - if I want to compress or decompress a file, I’ll open the file myself and pass it in.” No messing with my filesystem. No network access. No raw asm, no trusted build scripts and no exec. What I allow is all you get.
The capability should be transitive. All dependencies of the package should be brought in under the same restriction.
In dynamic languages like (server side) JavaScript, I think this would have to be handled at runtime. We could add a capability parameter to all functions which issue syscalls (or do anything else that’s security sensitive). When the program starts, it gets an “everything” capability. That capability can be cloned and reduced down to just the capabilities needed. (Think OpenBSD’s pledge.) If I want to talk to redis using a 3rd party library, I pass the redis package a capability which only allows it to open network connections - and only to this specific host on this specific port.
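Here’s a rough sketch of what that could look like, in TypeScript. To be clear, the Capability API and connectToRedis are made up - nothing like this ships in any JS runtime today:

    // Hypothetical capability API - invented for illustration.
    type NetRule = { host: string; port: number };

    interface Capability {
      // Returns a new, strictly narrower capability. The runtime would
      // reject any attempt to broaden permissions.
      restrict(perms: { net?: NetRule[] }): Capability;
    }

    // Hypothetical capability-aware redis client.
    declare function connectToRedis(
      cap: Capability,
      addr: string
    ): Promise<{ set(key: string, value: string): Promise<void> }>;

    // The runtime hands main() an "everything" capability at startup.
    async function main(root: Capability) {
      // Narrow it to: outbound connections to one host on one port, nothing else.
      const redisCap = root.restrict({ net: [{ host: "127.0.0.1", port: 6379 }] });

      // The 3rd party library can only act through the capability we pass it.
      const redis = await connectToRedis(redisCap, "127.0.0.1:6379");
      await redis.set("greeting", "hello");
    }

The key property is that restrict() can only ever drop permissions, never add them.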
It wouldn’t stop all security problems. It might not even stop this one. But it would dramatically reduce the attack surface of badly behaving libraries.
The problem we have right now is that any linked code can do anything, both at build time and at runtime. A good capability system should be able to stop xz from issuing network requests even if other parts of the process do interact with the network. It certainly shouldn't have permission to replace crc32_resolve() and crc64_resolve() via ifunc.
Another way of thinking about the problem is that right now every line of code within a process runs with the same permissions. If we could restrict what 3rd party libraries can do - via checks either at build time or runtime - then supply chain attacks like this would be much harder to pull off.
I'm not convinced this is such a cure-all, as any library must necessarily have the ability to "taint" its output. Like, consider this library. It's a compression library. You would presumably trust it to decompress things, right? Like programs? And then you run those programs with full permissions? Oops.
It’s not a cure-all. I mean, we’re talking about infosec - so nothing is. But that said, barely any programs need the ability to execute arbitrary binaries. I can’t remember the last time I used eval() in JavaScript.
I agree that it wouldn’t stop this library from injecting backdoors into decompressed executables. But I still think it would be a big help anyway. It would have stopped this particular attack from working.
Looking at the big picture, we need to acknowledge that we can’t implicitly trust open source libraries on the internet. They are written by strangers, and if you wouldn’t invite them into your home, you shouldn’t give them permission to execute arbitrary code with user-level permissions on your computer.
I don’t think there are any one-size-fits-all answers here. And I can’t see a way to make your “tainted output” idea work. But even so, cutting down the trusted surface area from “Leftpad can cryptolocker your computer” to “Leftpad could return bad output” sounds like it would move us in the right direction.
Of course we need to trust people to some degree. There's an old Jewish saying - put your trust in God, but your money in the bank. I think it's like that. I'm all for trusting people - but I still like how my web browser sandboxes every website I visit. That is a good idea.
We (obviously) put too much trust in little libraries like xz. I don't see a world in which people start using fewer dependencies in their projects. So given that, I think anything which makes 3rd party dependencies safer than they are now is a good thing. Hence the proposal.
The downside is it adds more complexity. Is that complexity worth it? Hard to say. That's still worth talking about.
I guess the big open source community should put a little more trust in statistics, or integrate statistical evaluation into its decision making about which specific products to use in its supply chains.
This approach could work for dynamic libraries, but a lot of modern ecosystems (Go, Rust, Swift) prefer to distribute packages as source code that gets compiled into the consuming executable or library.
The goal is to restrict what included libraries can do. As you say, in languages like Rust, Go or Swift, the mechanism to do this would also need to work with statically linked code. And that's quite tricky, because there are no isolation boundaries between functions within an executable.
It should still be possible to build something like this. It would just be inconvenient. In Rust, Swift and Go you'd probably want to implement it at compile time.
In Rust, I'd start by banning unsafe in dependencies. (Or whitelisting which projects are allowed to use unsafe code.) Then add special annotations on all the methods in the standard library which need special permissions to run - for example File::open, fork, exec, networking, and so on. In Cargo.toml, add a way to specify which permissions your child libraries get: "Import serde, but give it no OS permissions". When you compile your program, the compiler can look at the call tree of each function to see what actually gets called, and make sure the permissions match up. If you call a function in serde which in turn calls File::open (directly or indirectly), and you didn't explicitly allow that, the program should fail to compile.
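To make that concrete, the Cargo.toml side could look something like this. This is invented syntax - cargo has nothing like it today:

    [dependencies]
    serde = "1.0"

    # Hypothetical - cargo doesn't support a section like this today.
    [capabilities]
    serde = []    # pure computation: no filesystem, no network, no exec
    # Anything not listed here gets no OS capabilities at all.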
It should be fine for serde to contain some utility function that calls the banned File::open, so long as the utility function isn't called.
Permissions should form a tree. As you get further out in the dependency tree, libraries get fewer permissions. If I pass permissions {X,Y} to serde, serde can pass permission {X} to one of its dependencies in turn. But serde can't pass permission {Q} to its dependency - since it doesn't have that capability itself.
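Whether the compiler enforces this or a runtime does, the invariant is the same. A tiny sketch (in TypeScript just for brevity - the idea is language agnostic):

    type Permission = "fs.read" | "fs.write" | "net" | "exec";

    class CapabilitySet {
      constructor(private readonly perms: ReadonlySet<Permission>) {}

      // A child capability must be a subset of its parent. There's no
      // operation that adds permissions, so grants only shrink down the tree.
      derive(requested: Permission[]): CapabilitySet {
        for (const p of requested) {
          if (!this.perms.has(p)) {
            throw new Error(`cannot grant "${p}": parent doesn't hold it`);
          }
        }
        return new CapabilitySet(new Set(requested));
      }
    }

    // I hold {fs.read, net} and grant serde {net}. serde can pass {net} or {}
    // along to its own dependencies - but it can never mint {exec}.
    const mine = new CapabilitySet(new Set<Permission>(["fs.read", "net"]));
    const forSerde = mine.derive(["net"]);
    const forSerdesDep = forSerde.derive([]); // fine
    // forSerde.derive(["exec"]);             // throws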
Any libraries which use unsafe are sort of trusted to do everything. You might need to insist that any package which calls unsafe code is explicitly whitelisted in the Cargo.toml file at the project root.
>It should still be possible to build something like this. It would just be inconvenient.
Inconvenient is quite the understatement. Designing and implementing something like this for each and every language compiler/runtime requires hugely more effort than doing it on the OS level. The likelihood of mistakes is also far greater.
Perhaps it's worth exploring whether it can be done on the LLVM level so that at least some languages can share an implementation.
A process can do little to defend itself from a library it's using which has full access to its same memory. There is no security boundary there. This kind of backdoor doesn't hinge on IFUNC's existence.
Honestly, I don't have a lot of hope that we can fix this problem for C on Linux. There's just so much historical cruft present, spread between autotools, configure, make, glibc, gcc and C itself, that would need to be modified to support capabilities.
The rule we need is "If I pull in library X with some capability set, then X can't do anything not explicitly allowed by the passed set of capabilities". The problem in C is that there is currently no straightforward way to firewall off different parts of a Linux process from each other. And dynamic linking on Linux is done by gluing together compiled artifacts - with no way to check or understand what assembly instructions any of those parts contain.
I see two ways to solve this generally:
- Statically - i.e. at compile time, the compiler annotates every method with the set of permissions it (recursively) requires. The program fails to compile if a method is called which requires permissions that the caller does not pass it. In Rust, for example, I could imagine cargo enforcing this for Rust programs. But I think it would require some changes to the C language itself if we want to add capabilities there. Maybe some compiler extensions would be enough - but probably not, given a C program could obfuscate which functions call which other functions.
- Dynamically. In this case, every Linux system call is replaced with a new version which takes a capability object as a parameter. When the program starts, it is given a capability by the OS, which it can then use to mint child capabilities to pass to different libraries (sketch below). I could imagine this working in Python or JavaScript. But for this to work in C, we need to stop libraries from just scanning the process's memory and stealing capabilities from elsewhere in the program.
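Here's roughly what those capability-checked wrappers could look like, sketched in TypeScript (all of these names are hypothetical):

    type Permission = "fs.read" | "fs.write" | "net" | "exec";

    interface Capability {
      has(p: Permission): boolean;
      derive(perms: Permission[]): Capability; // subset-only, never broader
    }

    // A capability-aware replacement for open(): the capability comes first,
    // and nothing happens until it has been checked.
    function open(cap: Capability, path: string, mode: "r" | "w"): void {
      const needed: Permission = mode === "r" ? "fs.read" : "fs.write";
      if (!cap.has(needed)) {
        throw new Error(`missing capability: ${needed}`);
      }
      // ... issue the real open() syscall on `path` here ...
    }

The hard part isn't the wrapper - it's making the capability object unforgeable, so a library can't just conjure one up or steal one from elsewhere in the process.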
Or take the Chrome / original Go approach: load that code in a different process and use some kind of RPC. With all the context-switch penalty... sigh. I think it's the only way, as MMU permissions work at page granularity.
Firefox also has its solution (RLBox) of compiling dependencies to wasm, then compiling the wasm back into C code and linking that. It’s super weird, but the effect is that each dependency ends up isolated in bounds-checked memory. No context-switch penalty, but instead the code runs significantly slower.