Monday, February 8, 2016

Adventures in Containerization - Part 1: What are Containers?

Hey everyone - We have been talking about databases and data security for the past few weeks.  I thought it would be nice to switch gears a little bit and talk about another aspect of information security: containerization.  Containerization is the process of deploying applications and services in isolated environments.  It is similar to virtualization, but does not require installing a new instance of the operating system (and all of the overhead that comes along with that).  Containers in Linux are similar to jails in BSD and zones in Solaris.  I have not worked with containers and containerization extensively, so this will be a learning experience for me as well.  Time to get started.

Why do containers matter to a security professional?

You might be wondering what place containerization has on an information security blog.  I think this is a useful topic for several reasons:
  • Containerization is becoming more popular as organizations shift to the cloud.  Containers are easy to spin up and destroy, and many of the popular cloud platforms (Amazon EC2, Rackspace, and Azure) support containers.
  • In order to protect (or attack) the data that transits through or is stored on these containers, you need to understand how containers work.
  • Containers are different from virtual machines (VMs), so there are different considerations to take into account before choosing them as a solution.
  • Containerization is relatively new, so there are likely security vulnerabilities that are currently unknown.

What is the difference between containers and VMs?

At a high level, containers and VMs perform the same function: they provide isolation for applications and services.  From a security perspective, this is great because compromise of one application does not necessarily lead to compromise of all applications and services in the organization.  For example, if you have separate VMs for your web server and your mail server, compromise of your web server does not mean that your mail server is necessarily compromised.

Isolation is also good because it allows you to be more granular in your approach to securing the various applications and services in your organization.  The security needs of a server holding super-sensitive information may be a bit different than your externally facing website which does not connect with your internal network at all.  Isolating these services allows you to tailor the security approach you take to different services which could save money in labor and software to maintain the desired level of security.

Containers and VMs differ on a few key points:
  • To deploy a VM, you have to deploy a new instance of the operating system.  That involves additional overhead which containers do not have since containers are all deployed on the same instance of the operating system.  The containerization features (which we will talk about) are part of the kernel on the host machine.  When you deploy a VM, there is the potential for a much larger attack surface that needs to be locked down.  With a container, you only have to lock down the few things running in that container (usually the application or service and its dependencies).
  • Containers are meant to be temporary instances that are spun up when needed and destroyed when not needed.  VMs are intended to be a bit more permanent.  They can be spun up and destroyed with relative ease, but this usually takes more time.
  • Right now, you can deploy the container infrastructure on Windows, OS X, and Linux (you can use software such as Vagrant on Windows and OS X since containerization is not built into those operating systems).  However, as of right now, you can only deploy Linux containers since the Windows kernel does not support containers.  This is going to change once Windows Server 2016 hits the streets.  Software like Vagrant uses a virtual machine running Linux to give you containers on other operating systems.  Therefore, if you have a Windows application or service you want to containerize, you are out of luck for now.
  • Since containerization is still relatively new (it was experimentally added to the Linux kernel around 2.6.24), there may be security issues that have not been found yet.  There is still more work to be done when it comes to isolation in containers.  The biggest example of this is user namespace isolation.  We will talk about this in a bit.
  • Patching and system management is different when you have a containerized environment.  With VMs, you can treat those like other systems on the network when it comes to patch management because they act like another system for better or worse.  Since containers are tightly integrated with the host and deployment is relatively easy, patches can be integrated with container deployment. 

Security Concerns with Containers

We touched on a few of these points already, but I wanted to dive a little deeper into how containers provide isolation.  Isolation and a smaller attack surface are the main security benefits of using containers.  We talked about where the smaller attack surface comes from, but the isolation aspect of containers deserves a bit more discussion.

On a system, each container gets its own file system, network stack, and process namespace.  That means a container cannot access the files, ports, and processes of another container.  The Linux kernel accomplishes this with control groups (also called cgroups) and namespaces, which have been around since kernel 2.6.24, with improvements integrated ever since.  Control groups allow an administrator to define the parameters and limits of a set of processes (like a container).  The parameters for a number of subsystems (like memory, network, and CPU usage) can be defined for a control group.  Control groups set up the general sandbox that containers play in.  To make sure the individual containers play nice with each other, they are further restricted using namespaces.
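If you want to poke at this yourself, the kernel exposes cgroup membership through /proc.  Here is a quick look, plus a sketch of capping memory for a group of processes through the cgroup filesystem (this assumes a cgroup v1 memory controller mounted at the usual path, and the "demo" group name is my own; the privileged parts are commented out since they need root):

```shell
# Every process's cgroup membership is visible under /proc.  Print the
# cgroup hierarchies the current shell belongs to:
cat /proc/self/cgroup

# Sketch: create a cgroup, cap its memory at 256 MB, and move the
# current shell into it (requires root, cgroup v1 memory controller):
# mkdir /sys/fs/cgroup/memory/demo
# echo $((256 * 1024 * 1024)) > /sys/fs/cgroup/memory/demo/memory.limit_in_bytes
# echo $$ > /sys/fs/cgroup/memory/demo/cgroup.procs
```

Every process descended from that shell would then be counted against the 256 MB limit, which is exactly the kind of fence a container platform puts around a container.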

There are six main namespaces that the Linux kernel supports as of this writing:
  • Network Namespace: Isolation of firewall rules, ports, network traffic, routing tables, and network interfaces.
  • Process / PID Namespace: Isolation of process identifiers (PIDs).  That means that PIDs inside of a container are separate from the PIDs of the host and of other containers on that host.  An example of this is PID 1 (init), which provides system initialization.  Each container can have its own PID 1 that is separate from the PID 1 of the host, so if PID 1 in the container is compromised somehow, it will not affect the init process on the host.
  • Mount / Filesystem Namespace: Isolation of mounted filesystems
  • Inter-process Call (IPC) Namespace: Isolation of inter-process communication
  • Unix Time-Sharing System (UTS) Namespace: The name for this one derives from computing history.  Back in the day (circa 1970), multi-user time-sharing systems were a big deal: when multiple people use a computer at the same time, it can be utilized more effectively than when a single person has it to themselves.  The UTS namespace means that each container can have its own hostname (and NIS domain name, but NIS is not used much nowadays).
  • User Namespace: User and group IDs are isolated.  For example, user 1000 in Container A is isolated and not the same as user 1000 on the host or in Container B.
These namespaces need to be implemented by whatever wants to use them.  The kernel supports them, but the container software must provide the implementation that manages these for the various containers you want to deploy.  Currently, one of the more popular Linux containerization platforms (Docker) implements all of these except the user namespace.  This means that a user in your container is mapped to a user on the host system.  To be fair, user namespaces are relatively new (much of the work was completed in kernel 3.8).  That means there could be security vulnerabilities that we do not know about yet.
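You can see these namespaces for yourself: the kernel exposes each process's namespace memberships under /proc.  Two processes are in the same namespace when their links point at the same inode, which is how tools tell containers apart:

```shell
# Each process's namespace memberships appear as symlinks under
# /proc/<pid>/ns (net, pid, mnt, ipc, uts, user, ...):
ls -l /proc/self/ns

# On systems with a recent util-linux, lsns summarizes namespaces
# system-wide (shown commented out since not every system has it):
# lsns
```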

Why does the lack of full user namespace support in Docker matter?

Without user namespace support, users inside of the container map to users outside of the container (on the host).  For example, if you are root in a container, you are root when it comes to the kernel of the host.  That means that if a user is able to break out of the container where he or she is root, he or she will have root privileges outside of the container.  Fortunately, as of Docker 1.9, the daemon can map the root user in a container to a non-root user on the host.  Extending this further to allow for per-container user mapping or mapping of users other than root is still being worked on as of this writing.  That means when deploying containers, you should use user IDs that do not already exist on your host.  This is because user 1000 in your container will have the same privileges as user 1000 on your host, at least when it comes to the kernel.  Further discussion on user namespaces and Docker is available from this presentation from ContainerCon 2015 (PDF).
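For the curious, this remapping is driven by subordinate ID ranges defined on the host.  A sketch of what that looks like (the dockremap user name and the range here are the conventional defaults, but your setup may differ):

```
# /etc/subuid on the host: subordinate UID ranges in user:start:count form.
# With this entry, UID 0 (root) inside a remapped container becomes UID
# 100000 on the host, UID 1 becomes 100001, and so on, for 65536 UIDs.
dockremap:100000:65536
```

A matching range in /etc/subgid covers group IDs.  The practical effect is that "root" inside the container is an unprivileged, high-numbered user as far as the host kernel is concerned.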

What choices are there for Containers (on Linux)?

There are two popular choices for containers on Linux: LXC (LinuX Containers) and Docker.  The major difference is that Docker containers are intended to run a single process, whereas containers in LXC are more flexible.  For example, if you want to deploy a Django application (which may need a web server, a database server, and Django itself), you would need separate Docker containers for each component.  With LXC, you could have one container that holds all of the components.  A container that has everything included is very portable.  It is arguably easier to work with LXC containers because you can package everything up in a single container (all of the moving parts are together).  However, if one component in the container is compromised, the other components are available for compromise.  With separate containers for each component, the compromise of one component does not necessarily affect the components in other containers.

One thing about LXC is that it supports the user namespaces we talked about before.  There are benefits and drawbacks to each approach, so the solution you choose depends on how you want to use containers in your environment (your use case).  If you want to use a container as a lightweight, very portable VM of sorts, LXC would be a good choice.  Docker's popularity has spawned a large community and lots of official and community-made images to choose from.  This might be an easier way for some to start with containers.

Despite the fact that containers are relatively new and there are known and unknown security concerns associated with them, it is likely that they are here to stay, so we should make an effort to understand them.  Docker was originally built on top of LXC (it has since moved to its own libcontainer library), so there is a lot of similarity in how the two work.  The difference is in the container philosophy (a more application-centric separation of processes) and how they are administered.

Many of the components in LXC have analogous Docker components (or vice versa).  For easy reference:
Linux Kernel Namespace | LXC Component | Docker Component
Process Namespace | Templates deployed as containers | Images deployed as containers
Networking Namespace | Host-only NAT, port forwarding using iptables (or firewalld) | Host-only NAT, port forwarding using links
Mount Namespace | Aufs or mounts from host | Overlayfs or mounts from host

LXC

LXC is a set of tools that allow you to take advantage of the cgroup and namespace features we talked about before.  We will talk about the main components of LXC: templates, containers, networking, and storage.

Templates

LXC templates are similar to VM templates.  Containers are deployed from templates.  Any changes you make to the container are specific to that container.  That way, you can deploy the same template multiple times and customize it for the task you need it for.

Containers

Containers are deployed instances of a template.  They actually run the applications you want to deploy.

Networking

LXC containers are deployed on a network that is only visible to the host they are on.  Containers can talk to each other through this network but not to anyone else beyond the confines of the host they are running on. That means that if you want to talk to a service running in a container, you will need to use iptables (or firewalld) to forward the port you want to be visible externally to the port you want to talk to on the container.  For example, if you have a web server running on port 8080 in your container that you want to be visible to your LAN on port 80, you could set up a firewall rule to forward traffic on port 80 to port 8080 on the container.
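A sketch of that forwarding rule with iptables (the container address here is an example; 10.0.3.0/24 is the default range of the lxcbr0 host-only bridge, and both commands need root):

```shell
# Rewrite LAN traffic arriving on host port 80 so it lands on port 8080
# of the container at 10.0.3.10:
iptables -t nat -A PREROUTING -p tcp --dport 80 \
    -j DNAT --to-destination 10.0.3.10:8080
# Allow the rewritten traffic through the FORWARD chain as well:
iptables -A FORWARD -p tcp -d 10.0.3.10 --dport 8080 -j ACCEPT
```

To clients on the LAN, the service looks like it is running on port 80 of the host; the container itself is never directly reachable.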

Storage

Storage in LXC is similar to snapshots with virtual machines.  The changes from the template specific to that container are stored with that container, and when the container is destroyed, those changes are destroyed.  The template is never changed.  If you need more persistent storage, you can create a mount point on the host (that could point to a NAS or something similar) and expose that mount point to the container (or containers) you want to use it with.

Configuration Files

LXC containers can be deployed with the assistance of configuration files.  The syntax for these files is available in the manpages (lxc.container.conf).  A mirror is available here.  The configuration file allows you to specify a number of different kinds of parameters, such as storage and networking.
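To make that concrete, here is a minimal configuration sketch.  The key names come from lxc.container.conf(5) (as of the LXC 1.x series), but the hostname, bridge, and paths are examples of my own:

```
# Container hostname (UTS namespace):
lxc.utsname = web01
# Attach a virtual ethernet pair to the host-only bridge:
lxc.network.type = veth
lxc.network.link = lxcbr0
lxc.network.flags = up
# Bind-mount persistent storage from the host into the container:
lxc.mount.entry = /srv/web01-data srv/data none bind 0 0
```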

Docker

Before we dive in, we should talk about the components of Docker.  There are four main parts we will talk about: images, containers, links, and volumes.  Images and containers are associated with the operation of the container, links are for networking within your containers and to and from them, and volumes are associated with the storage for the container.

Images

Docker images are similar to VM templates.  Images are a baseline of what will be run in the container.  For example, you can have an image that contains the software your application needs to run (but not the application itself).  Images are identified by a name and tag pair (like debian:latest or django:1.8).  When you want to deploy your application, you deploy one or more of these images to create separate containers for what your applications need.

Containers (Process Namespace Separation)

Containers are similar to deploying a new VM from a template.  They are separate instances spawned from a given image.  Your application actually runs from a container, not the base image it was spawned from.  Docker containers are designed to be disposable, so you will need somewhere else to store data if you need persistence (which we will talk about in a bit).

Links (Networking Namespace Separation)

Networking with containers is similar to 'host-only' networking with traditional VMs.  Containers are assigned private IPs and they can talk to each other and the host, but they are not exposed directly to the network beyond the host.  Linking containers is when you refer to one container from within another.  An alias for the linked container (call it B) is created within the container you create the link in (call it A).  Then, you can refer to container B by its name in container A.   As a side note, when you link B to A, you need to specify which ports on B are available.
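A sketch of what linking looks like on the command line (the image and container names are examples, and this needs a running Docker daemon):

```shell
# Start container B, named "db":
docker run -d --name db postgres
# Start container A with a link to B; the alias after the colon is the
# name A will use to reach B:
docker run -d --name app --link db:db myapp
# Inside "app", the hostname "db" now resolves to the db container's
# private IP, and the ports EXPOSEd by the db image are reachable there.
```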

Volumes (Mount Namespace Separation)

Volumes are the storage underpinnings of Docker containers.  Storage with containers is similar to snapshots with virtual machines.  When you deploy an image, the storage for that container is specific to that instance.  As an example, pretend you have an Ubuntu image and you deploy a database and a Django container from it.  The storage for the database container will be separate from the Django container even though they are based on the same image.  The storage for the database container will contain only the additional or modified files (relative to the original image) that the database container needs to do its job.  Similarly, the Django container will contain only the additional or modified files that differ from the base image that Django needs.  These changes (or deltas) are called layers.

If you destroy the container, that data is gone.  If you want persistent data with your containers, you can tell Docker what storage is for the application and what storage is for data.  These are defined in volumes.  Volumes live on the filesystem of the host but appear to the container as a traditional drive.  For example, with container A, /home will appear to be /home, but on the host, it might be stored as /containers/A/home.

Volumes are specific to each container, but these volumes can be shared between containers.  That way, you could have some storage set aside for data that you want to use in your containers, and no matter what happens to those containers (destroy them, upgrade them, change them out), the data will not be affected.
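A sketch of both patterns (the names and paths here are examples, and a running Docker daemon is assumed):

```shell
# Mount the host directory /srv/appdata at /data inside the container;
# whatever the app writes to /data survives the container:
docker run -d --name app -v /srv/appdata:/data myapp
# Share the "app" container's volumes with a second container, e.g. to
# back the data up without touching the app:
docker run --rm --volumes-from app mybackup
```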

Dockerfiles

All of this is defined in a Dockerfile.  Dockerfiles are like VMX files in VMware or the XML files used by VirtualBox to describe the parameters of a virtual machine.  Dockerfiles specify the parameters for images.  These images are then used to deploy containers.  For example, your Dockerfile can specify what to build on top of an Ubuntu image to create a Django image which can be used to deploy Django containers.  Here is the documentation for Dockerfiles.

Here is a sample one from the link above:
# Firefox over VNC
# VERSION               0.3

FROM ubuntu

# Install vnc, xvfb in order to create a 'fake' display and firefox
RUN apt-get update && apt-get install -y x11vnc xvfb firefox
RUN mkdir ~/.vnc
# Setup a password
RUN x11vnc -storepasswd 1234 ~/.vnc/passwd
# Autostart firefox (might not be the best way, but it does the trick)
RUN bash -c 'echo "firefox" >> /.bashrc'

# Expose the VNC port
EXPOSE 5900

CMD    ["x11vnc", "-forever", "-usepw", "-create"]

This Dockerfile builds on the ubuntu image (FROM ubuntu), installs a few programs (RUN apt-get ...), exposes the VNC port (EXPOSE 5900), and starts the VNC server (CMD).  One point worth noting: the RUN commands are executed once, when the image is built, not each time a container is deployed.  The software in the image is as fresh as the last build (because the package lists are updated before any software is installed thanks to the apt-get update command).

You could spin a container like this up to visit a website and then promptly destroy it.  The next time you deploy a container using this dockerfile and the associated image, you will have a fresh install of firefox and VNC as if the other container never existed.
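A sketch of that workflow (this needs a Docker daemon, should be run from the directory containing the Dockerfile, and the image and container names are my own):

```shell
# Build the image from the Dockerfile in the current directory:
docker build -t firefox-vnc .
# Run it, publishing the container's VNC port on the host:
docker run -d --name browser -p 5900:5900 firefox-vnc
# Point a VNC client at port 5900 on the host (password 1234).  When you
# are done, throw the container away:
docker rm -f browser
```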

Summary and Next Time

Next time, we will try working with LXC containers.  Since unprivileged containers are only partially supported in Docker, I want to see where that goes before talking about using Docker.  Personally, I do not like the idea of running containers with root privileges in an environment where they will be exposed to untrusted users.  For small groups testing things, privileged containers are fine, but for any sort of production deployment, I am hesitant to recommend them over VMs.  I suppose if you took adequate steps to secure the host running the containers (and maybe made the host a VM with only containers on it), it would be okay.  However, I think it is better to wait for the implementation to mature a bit before relying on containers for everything (at least from a security perspective).

If you are dead set on containers, I would make a VM for your Linux containers and run them in that.  In case of a compromise, the containers are at risk, but the rest of the system should be harder to compromise.  For Windows containers, I would run Hyper-V containers because they are isolated from the host in their own virtual machines.  I tried to get an install of Windows Server 2016 Technical Preview 4 going, but there were hardware incompatibilities with the platform I tried it on.  If you want to try it for yourself, you can download it here.  It is good until July 15, 2016.  This article is good for getting started with containers in Hyper-V.  I think it is pretty cool that Windows supports Docker images, but keep in mind that these are Windows Docker images.  Your existing Linux Docker images will not work here.

What are your thoughts?  Am I off base with security concerns and containers?

Thanks for reading!
