Docker isn't sandboxing in a security sense. It's sandboxing in a deployment sense: given a friendly app and a friendly host, the app can get an environment it wants without bothering the host to adapt too much. Given two friendly apps and a friendly host, the two apps can see different environments.
Given an unfriendly app, Docker is no different from running the unfriendly app directly.
I think the really cool thing about this is that, given how straightforward these examples look, you can use this as a deployment platform: go use whatever weird Linux distro you want, and still be able to run software that's only supported on an Ubuntu LTS.
But I think the comparison to Apple's sandbox is misleading, and also vaguely unfair to the good work that Apple has done in building a security sandbox.
> Given an unfriendly app, Docker is no different from running the unfriendly app directly.
I understand the risk of a kernel zero-day. But for desktop apps, when used in conjunction with SELinux, can't Docker be considered to provide a level of sandboxing? In fact, I have not heard of any container breakouts. AFAIK, the only incident that came close (and it only worked on unpatched Docker) was https://news.ycombinator.com/item?id=7909622 and this was before their 1.0 release.
I'd like to hear your thoughts, since a project I'm working on assumes that in a year's time containers will provide a sandboxing alternative for server-side use. I can see some companies doing this already.
EDIT: X11 issues aside, as mentioned above by alexlarsson
Without user namespaces (CLONE_NEWUSER), which Docker currently doesn't use, uid 0 inside a container is the same thing as uid 0 outside it. If you let Docker run apps as root, which seems to be not uncommon, then that root is, in a strong sense, the same as the root user outside the container. That's why Jessie's gparted process can partition her disk: as long as it can get at the device node, it has full permissions on it.
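To make the "uid 0 inside is uid 0 outside" point concrete, here's a rough sketch that reads a uid map in the format the kernel exposes at /proc/<pid>/uid_map (three columns: first uid inside the namespace, first uid it maps to outside, range length). The function names are mine, not anything from Docker, and I'm assuming a single level of nesting with the map read from the host side.

```python
def parse_uid_map(text):
    """Parse uid_map text: one 'inside_start outside_start length' triple per line."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            inside, outside, length = (int(x) for x in fields)
            entries.append((inside, outside, length))
    return entries


def host_uid(entries, container_uid):
    """Translate a uid inside the namespace to the uid outside, or None if unmapped."""
    for inside, outside, length in entries:
        if inside <= container_uid < inside + length:
            return outside + (container_uid - inside)
    return None


def container_root_is_host_root(text):
    """True when uid 0 inside the namespace is also uid 0 outside it."""
    return host_uid(parse_uid_map(text), 0) == 0
```

A process that never left the initial user namespace typically shows the identity map `0 0 4294967295`, so the check returns True. With a shifted map like `0 100000 65536` (an illustrative value), root inside is just uid 100000 outside.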
Apart from things that you've explicitly given it access to (like device nodes), the risk of zero-days is higher because these sorts of things aren't quite zero-days: it's not fundamentally a violation of the kernel's security model for uid 0 to be able to do root-y things. You might want it to be unable to, and you might mostly succeed by not exposing certain device nodes, using a process namespace (CLONE_NEWPID) so it can't attach itself as a debugger to other things on the system, etc. etc. But there's no intent in the kernel to make this safe. It's mostly an emergent feature of other things, and emergent features make bad security features.
What you can do is run as non-root, which still makes you the same as some other UID on the system, but guarantees that you're not risking increased privileges. User namespaces give you a few additional features here: first, even the process of entering a chroot / container doesn't require root or a setuid binary, which is neat. Second, even if you're root inside the container, you're not root on the host system in the same sense; only kernel code that still checks for uid == 0 directly (which should almost all be gone, since they changed the uid_t type in userns-enabled kernels) thinks you're root.
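For the "doesn't require root or a setuid binary" part, a minimal sketch of what an unprivileged process does to become root inside a fresh user namespace. This assumes Linux with unprivileged user namespaces enabled and Python 3.12+ (for `os.unshare`); the helper names are mine, and this is not how Docker or any particular tool does it.

```python
import os


def uid_map_line(inside_id, outside_id, count=1):
    """Build one uid_map/gid_map line: 'inside_start outside_start length'."""
    return f"{inside_id} {outside_id} {count}\n"


def become_namespace_root():
    """Enter a new user namespace and map our own uid/gid to 0 inside it.

    Must be called from a single-threaded process. Returns the effective
    uid as seen afterwards, which should be 0 inside the namespace.
    """
    outside_uid = os.geteuid()
    outside_gid = os.getegid()

    os.unshare(os.CLONE_NEWUSER)  # no root or setuid helper involved

    # An unprivileged process may only map its own uid, as a single line.
    with open("/proc/self/uid_map", "w") as f:
        f.write(uid_map_line(0, outside_uid))

    # setgroups must be denied before an unprivileged gid_map write.
    with open("/proc/self/setgroups", "w") as f:
        f.write("deny")
    with open("/proc/self/gid_map", "w") as f:
        f.write(uid_map_line(0, outside_gid))

    return os.geteuid()
```

You'd normally call `become_namespace_root()` in a forked child, since the change applies to the calling process. The "root" it gets only has capabilities over things owned by that namespace; files and devices on the host are still checked against the original uid.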
SELinux might be able to help you here, but I'm not really familiar with what the standard recommendations for SELinux and Docker are. I'd basically consider applying it as if the container didn't exist: if you're comfortable with something running as root with SELinux confinement, then it's definitely fine to run as root inside Docker with SELinux confinement. If not, I wouldn't risk it.
For server containment, you should take a look at https://sandstorm.io/, which uses user namespaces and runs apps as uid 1000 inside the namespace. This means that it's running with no more privilege than the host user in the worst case.
The reason you don't hear about Docker breakouts is that Docker never claimed to be a secure sandbox in the first place, so a breakout is a non-event.
Note that local privilege escalation exploits in Linux are found regularly, like on a monthly basis, and every one is likely a Docker breakout.
geofft mentioned Sandstorm -- my project -- which actually does claim to be a sandbox, and isn't affected by most of these kernel exploits. Here's a blog post discussing the differences: