This is neat. But. Docker isn't sandboxing in a security sense. It's sandboxing ...

jeswin · on Feb 22, 2015

> Given an unfriendly app, Docker is no different from running the unfriendly app directly.

I understand the risk of a kernel zero-day. But for desktop apps, when used in conjunction with SELinux, can't Docker be considered to provide a level of sandboxing? In fact, I have not heard of any container breakouts. AFAIK, the only incident that came close (but only worked on unpatched Docker) was https://news.ycombinator.com/item?id=7909622 and this was before their 1.0 release.

I'd like to hear your thoughts, since a project I'm working on assumes that in a year's time containers will provide a sandboxing alternative for server-side use. I can see some companies doing this already.

EDIT: X11 issues aside, as mentioned above by alexlarsson

geofft · on Feb 22, 2015

Without user namespaces (CLONE_NEWUSER), which Docker currently doesn't use, uid 0 inside a container is the same thing as uid 0 outside it. If you let Docker run apps as root, which seems to be fairly common, then root inside the container is, in a strong sense, the same as the root user outside the container. That's why Jessie's gparted process can partition her disk: as long as it can get at the device node, it has full permissions on it.
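
To make the uid 0 point concrete, here is a minimal Python sketch that parses a table in the format of /proc/<pid>/uid_map (three columns: ID inside the namespace, ID outside, length of the range) and reports whether uid 0 in that namespace is also uid 0 in the parent. The function names are made up for illustration and are not part of Docker or any library, and the sketch assumes a single level of nesting, so "parent" means the host.

```python
def parse_uid_map(text):
    """Parse uid_map-style text into (inside, outside, length) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        inside, outside, length = (int(f) for f in fields)
        entries.append((inside, outside, length))
    return entries


def container_root_is_host_root(uid_map_text):
    """Return True if uid 0 in this namespace maps to uid 0 in the parent.

    Assumption: the parent namespace is the host (one level of nesting).
    """
    for inside, outside, length in parse_uid_map(uid_map_text):
        if inside <= 0 < inside + length:
            return outside + (0 - inside) == 0
    return False  # uid 0 is not mapped at all


if __name__ == "__main__":
    # On Linux, inspect the current process's own mapping.
    with open("/proc/self/uid_map") as f:
        print(container_root_is_host_root(f.read()))
```

A process that was never put in a new user namespace sees the identity map (a single line `0 0 4294967295`), so the check returns True: root in the container is root on the host.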

Apart from things that you've explicitly given it access to (like device nodes), the risk of zero-days is higher because these sorts of things aren't quite zero-days: it's not fundamentally a violation of the kernel's security model for uid 0 to be able to do root-y things. You might want it to be unable to, and you might mostly succeed by not exposing certain device nodes, using a process namespace (CLONE_NEWPID) so it can't attach itself as a debugger to other things on the system, and so on. But there's no intent in the kernel to make this safe. It's mostly an emergent feature of other things, and emergent features make bad security features.

What you can do is run as a non-root user, which still makes you the same as some other UID on the system, but guarantees that you're not risking increased privileges. User namespaces give you a few additional features here: first, even the process of entering a chroot / container doesn't require root or a setuid binary, which is neat. Second, even if you're root inside the container, you're not root on the host system in the same sense. Only kernel code that still checks for uid == 0 directly will think you're root, and those checks should almost all be gone, since they changed the uid_t type in userns-enabled kernels.
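
As a rough illustration of "root inside, not root outside", the sketch below translates a UID inside a user namespace into the UID the parent namespace sees. The mapping used here (inside 0 maps to outside 100000, for 65536 IDs) is a hypothetical example, not something Docker or Sandstorm is known to use, and `to_host_uid` is an illustrative helper rather than a real API.

```python
# Hypothetical mapping in uid_map order: (inside, outside, length).
EXAMPLE_UID_MAP = [(0, 100000, 65536)]


def to_host_uid(uid, uid_map):
    """Translate a UID inside the namespace to the parent's view of it.

    Returns None when the UID falls outside every mapped range.
    """
    for inside, outside, length in uid_map:
        if inside <= uid < inside + length:
            return outside + (uid - inside)
    return None


if __name__ == "__main__":
    for uid in (0, 1000, 70000):
        print(uid, "->", to_host_uid(uid, EXAMPLE_UID_MAP))
```

Under this assumed mapping, uid 0 in the container is uid 100000 on the host, an ordinary unprivileged user, which is why file and device permission checks on the host no longer treat it as root.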

SELinux might be able to help you here, but I'm not really familiar with what the standard recommendations for SELinux and Docker are. I'd basically consider applying it as if the container didn't exist: if you're comfortable with something running as root with SELinux confinement, then it's definitely fine to run as root inside Docker with SELinux confinement. If not, I wouldn't risk it.

For server containment, you should take a look at https://sandstorm.io/ , which uses user namespaces and runs apps as uid 1000 inside the namespace. This means that it's running with no more privilege than the host user in the worst case.

kentonv · on Feb 22, 2015

The reason you don't hear about Docker breakouts is that Docker never claimed to be a secure sandbox in the first place, so a breakout is a non-event.

Note that local privilege escalation exploits in Linux are found regularly, like on a monthly basis, and every one is likely a Docker breakout.

geofft mentioned Sandstorm -- my project -- which actually does claim to be a sandbox, and isn't affected by most of these kernel exploits. Here's a blog post discussing the differences:

https://blog.sandstorm.io/news/2014-08-13-sandbox-security.h...