I’ve been toying around with containers using LXC, and I decided to use this technology to do some performance testing for a PHP web application. So I set up a VM with Ubuntu 14.04 LTS and decided to use containers to test various stacks, e.g. MySQL 5.5 + PHP 5.6 + Nginx 1.4.6 vs MariaDB 10 + PHP 7 RC3 + Nginx 1.9 (and all possible combinations), as well as HTTP/2, latencies, etc. On top of this experiment, I wanted to learn more about Ansible.
Therefore my needs were the following:
- One account to rule them all: one user account on the VM with sudo privileges. Used by Ansible to administer the VM and its containers.
- Run each container as an unprivileged one.
- Each container is bound to one unique user.
As of Ansible 1.8, LXC containers are supported, but as second-class citizens: the module is an extra that needs some dependencies to work (which are not in the default repositories), and it would hide from me how the LXC containers are configured, so I discarded this solution.
Unprivileged containers using the LXC command line UI
Standard containers usually run as root and the root user within that container maps to the root user outside of the container. This is not exactly how I think of a container.
In my view, a container is meant to hold something which I don’t want to leak out of the container: a chroot on steroids! So although I was really looking forward to containers on Linux, the first implementations did not match my expectations or use cases.
Being able to run unprivileged containers is one of the great things that finally convinced me to look at containers on Linux: they finally cut the mapping between the privileged users on the host and those in the container.
Unprivileged containers are not as easy to set up as normal, fully privileged containers, and you have to accept downloading an image from some website you need to trust. Not ideal, but resources on how to build an image for an unprivileged container are really scarce online, so I decided to first try it with the images and then, once I master LXC, to try building my own.
Setting up LXC unprivileged containers requires a few more packages (especially for the user and group ID mapping), some preliminary account setup, and giving LXC proper access to where you store the containers’ data in your home folder. I did all of this, including setting up ACLs for the access. But when I hit lxc-start (...), it all failed!!!
What I did after all preliminary configurations was:
$ sudo -H -u userdb lxc-create -t download -n mysql55 -- --dist ubuntu --release trusty --arch amd64
$ sudo -H -u userdb lxc-start -n mysql55 -d
lxc-start: lxc_start.c: main: 344 The container failed to start.
lxc-start: lxc_start.c: main: 346 To get more details, run the container in foreground mode.
lxc-start: lxc_start.c: main: 348 Additional information can be obtained by setting the --logfile and --logpriority options
The lxc-start command failed miserably.
Why? A bit of background first: I will try to describe at a high level how containers “contain” on Linux. LXC should not really be compared to Solaris Zones or FreeBSD Jails. LXC uses Linux Control Groups (cgroups) to contain (allow, restrict or limit) access by some processes to some resources (e.g. CPU, memory, etc.). When one creates a “local” session on Ubuntu 14.04 LTS (and this is still valid on the upcoming Ubuntu 15.10 as of writing), such as a login through the console or via SSH, Ubuntu allocates the user different control group controllers, which can be viewed by doing:
$ cat /proc/self/cgroup
12:name=systemd:/user/1000.user/4.session
11:perf_event:/user/1000.user/4.session
10:net_prio:/user/1000.user/4.session
9:net_cls:/user/1000.user/4.session
8:memory:/user/1000.user/4.session
7:hugetlb:/user/1000.user/4.session
6:freezer:/user/masteen/0
5:devices:/user/1000.user/4.session
4:cpuset:/user/1000.user/4.session
3:cpuacct:/user/1000.user/4.session
2:cpu:/user/1000.user/4.session
1:blkio:/user/1000.user/4.session
Each line is a type of control group controller assigned to the user. For example, the line about memory concerns the Memory Resource Controller, which can be used for things such as limiting the amount of memory a group of processes can use.
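To make the format of that listing concrete, here is a minimal shell sketch that splits each line into its controller and its cgroup path. The function names `list_cgroups` and `cgroup_path_for` are made up for this illustration; the only thing assumed about the input is the `ID:controller:path` layout shown above.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# Print "controller<TAB>path" for every line of a cgroup listing.
# $1 = file to read (defaults to /proc/self/cgroup).
# Each input line looks like  ID:controller:path
list_cgroups() {
    awk '{
        first  = index($0, ":")            # end of the hierarchy ID
        rest   = substr($0, first + 1)     # "controller:path"
        second = index(rest, ":")          # end of the controller name
        print substr(rest, 1, second - 1) "\t" substr(rest, second + 1)
    }' "${1:-/proc/self/cgroup}"
}

# Print the cgroup path of a single controller.
# $1 = controller name (e.g. memory), $2 = optional file to read.
cgroup_path_for() {
    list_cgroups "$2" | awk -F'\t' -v c="$1" '$1 == c { print $2 }'
}
```

With the listing above saved to a file, `cgroup_path_for memory thatfile` would print the path of the memory controller; without a file argument, the helpers read the cgroups of the current shell.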
So when I run my script via Ansible, Ansible first establishes an SSH session using my “rule-them-all” user. An SSH session is considered “local”, and PAM triggers systemd-logind to automatically create cgroups for my user’s shell process. Yes, even though Ubuntu 14.04 LTS still uses upstart as its init system, it already has a few dependencies on systemd! Now, when my “rule-them-all” user uses sudo to execute commands as another user (be it root or one of my container users), the executed command is not considered a “local” session for the sudo-ed user. So no cgroups are created for the new process, and it actually inherits the cgroups of the caller. This is easily visible by running the following; you can see that the username and UID in the paths did not change despite the command being run as another user:
$ sudo -u userdb cat /proc/self/cgroup
12:name=systemd:/user/1000.user/4.session
11:perf_event:/user/1000.user/4.session
10:net_prio:/user/1000.user/4.session
9:net_cls:/user/1000.user/4.session
8:memory:/user/1000.user/4.session
7:hugetlb:/user/1000.user/4.session
6:freezer:/user/masteen/0
5:devices:/user/1000.user/4.session
4:cpuset:/user/1000.user/4.session
3:cpuacct:/user/1000.user/4.session
2:cpu:/user/1000.user/4.session
1:blkio:/user/1000.user/4.session
This usually does not really matter, unless you are LXC and you use cgroups heavily!! So what happened is that lxc-start wanted to write to the various cgroups to create the container. But lxc-start was called by the user userdb (via the sudo command), whereas the cgroups it inherited were those of my “rule-them-all” user, and obviously (and thankfully) userdb does not have the right to change the cgroups of my “rule-them-all” user. So lxc-start failed with what boils down to a permission problem:
lxc_container: cgmanager.c: lxc_cgmanager_create: 299 call to cgmanager_create_sync failed: invalid request
lxc_container: cgmanager.c: lxc_cgmanager_create: 301 Failed to create hugetlb:mysql55
lxc_container: cgmanager.c: cgm_create: 646 Error creating cgroup hugetlb:mysql55
lxc_container: start.c: lxc_spawn: 861 failed creating cgroups
lxc_container: start.c: __lxc_start: 1080 failed to spawn 'mysql55'
lxc_container: lxc_start.c: main: 342 The container failed to start.
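A quick way to spot this situation before lxc-start trips over it is to look for controllers whose path is not under the calling user’s own `/user/<UID>.user/` tree. The sketch below is only a heuristic: it assumes the path layout seen in the listings above, and the function name `cgroups_outside_uid` is invented for this example.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only.

# Read a /proc/<pid>/cgroup style listing on stdin and print the
# controllers whose path does NOT start with /user/<UID>.user/ .
# $1 = numeric UID expected to own the cgroups.
# Assumption: the "/user/<UID>.user/..." layout shown in this post.
cgroups_outside_uid() {
    awk -F: -v prefix="/user/$1.user/" '
        index($3, prefix) != 1 { print $2 }
    '
}
```

For instance, `sudo -u userdb cat /proc/self/cgroup | cgroups_outside_uid "$(id -u userdb)"` would list every controller in the situation described above, since all the inherited paths still belong to the calling user.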
Getting Ubuntu 14.04 LTS ready to run our Unprivileged Containers
I did not manage to solve the problem on Ubuntu 14.04 LTS on my first attempt. But after some extra steps (which I will detail below), I made it work on the upcoming Ubuntu 15.10 release (still in alpha)! So, revisiting my Ubuntu 14.04 LTS setup, I identified the packages needing an upgrade: kernel, lxc and cgmanager (and a few dependencies). For the kernel, I used the latest Ubuntu LTS Enablement Stack and upgraded to kernel 3.19. For the other packages, despite my aversion to 3rd-party repositories, I decided to trust the Ubuntu LXC team (they are the ones who do the work to get LXC/LXD into Ubuntu in the first place) and their LXC PPA. So the commands were:
$ sudo apt-get install --install-recommends linux-generic-lts-vivid
$ sudo apt-add-repository ppa:ubuntu-lxc/lxc-stable
$ sudo apt update
$ sudo apt full-upgrade
$ sudo apt-get install --no-install-recommends lxc lxc-templates uidmap libpam-cgm
The lxc-templates package is necessary for me as I’m using the “download” template. The uidmap package is mandatory for unprivileged containers. And libpam-cgm is necessary to resolve my problem: it is a PAM module for cgmanager, the Control Group Manager daemon (installed as a dependency of LXC on Ubuntu).
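The uidmap tools rely on subordinate ID ranges declared in /etc/subuid and /etc/subgid, one `name:start:count` entry per line. These files are not discussed in this post, so take the following as a hedged sketch of a pre-flight check and not as part of the setup described here; the function name `has_subid_range` is made up.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only.

# Succeed if the given user has at least one non-empty range in a
# subordinate ID file (format: name:start:count).
# $1 = user name, $2 = file, e.g. /etc/subuid or /etc/subgid
has_subid_range() {
    awk -F: -v u="$1" '
        $1 == u && $3 > 0 { found = 1 }
        END { exit !found }
    ' "$2"
}
```

A possible use before creating a container: `has_subid_range userdb /etc/subuid && has_subid_range userdb /etc/subgid || echo "no subordinate IDs for userdb"`.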
Now, armed with this updated kernel and container stack, it is time to present the extra setup steps I was talking about. We first need to update the PAM configuration for sudo or su (or both, depending on which method you use). I will show the extra lines for the former; they should apply to the latter if you wish to use it. As root, edit the file /etc/pam.d/sudo and add the following lines after the line ‘…’:
session required pam_loginuid.so
session required pam_systemd.so class=user
The above 2 lines register the new session created by sudo with the systemd login manager (systemd-logind). That’s the guy we wanted notified, so that the creation of the cgroups for our user can now work. I’m still scouring the internet for the exact explanation of how this works. If I find it, I will probably write another post with the information.
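Since a typo in a PAM file is easy to miss, a small check that both session lines are really present can save some head-scratching. This is only a sketch: it verifies that the lines exist, not where they sit in the file, and the function name `pam_session_lines_present` is invented for the example.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only.

# Succeed if a PAM service file contains both session lines above.
# $1 = PAM service file, e.g. /etc/pam.d/sudo
# Note: this checks presence only, not the position in the stack.
pam_session_lines_present() {
    grep -Eq '^[[:space:]]*session[[:space:]]+required[[:space:]]+pam_loginuid\.so' "$1" &&
    grep -Eq '^[[:space:]]*session[[:space:]]+required[[:space:]]+pam_systemd\.so[[:space:]]+(.*[[:space:]])?class=user' "$1"
}
```

For example: `pam_session_lines_present /etc/pam.d/sudo && echo "sudo sessions will be registered"`.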
If I simply used su -l userdb or sudo -u userdb -H -s, I would just have to execute the following:
$ sudo cgm create all $USER
$ sudo cgm chown all $USER $(id -u) $(id -g)
$ cgm movepid all $USER $$
This creates all cgroups under the name $USER, which is userdb. It then sets the owner of these new cgroups to the UID and GID of userdb. And the last command moves the shell process ($$) into these new cgroups. Then, the rest is trivial:
$ lxc-start -n mysql55 -d
And it works. And if you want to run those commands using sudo, this is how you do it:
$ sudo -H -i -u userdb bash -c 'sudo cgm create all $USER; sudo cgm chown all $USER $(id -u) $(id -g)'
$ sudo -H -i -u userdb bash -c 'cgm movepid all $USER $$; lxc-start -n mysql55 -d'
If you later close the SSH session and reconnect, you have to run the 2 commands again, even though they might display some warnings that the paths or what-not already exist. I still get those pesky warnings and still need those extra commands, and I’m not sure why.
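Because these steps have to be repeated after every reconnection, they can be wrapped in a small helper. The sketch below simply chains the cgm and lxc-start commands shown above; the names `run` and `start_unprivileged` and the `DRY_RUN` switch are inventions of this example, and the status of `cgm create` is deliberately ignored on the assumption that it only warns when the cgroups already exist.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create the per-user cgroups, move this shell into them, then start
# the container so that it inherits them.
# $1 = container name, e.g. mysql55
start_unprivileged() {
    su_container=$1
    su_cgname=${USER:-$(id -un)}
    # May warn if the cgroups already exist, so its status is not checked.
    run sudo cgm create all "$su_cgname"
    run sudo cgm chown all "$su_cgname" "$(id -u)" "$(id -g)" &&
    run cgm movepid all "$su_cgname" "$$" &&
    run lxc-start -n "$su_container" -d
}
```

Running it with `DRY_RUN=1` set first only prints the four commands, which is a cheap way to review them before letting the helper touch cgmanager for real.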
While trying to run unprivileged containers on Ubuntu 14.04 LTS, I met several problems which I could only solve by using the latest LXC and CGManager packages and some special PAM configuration whose impact I’m not 100% sure of. So unprivileged LXC containers on Ubuntu 14.04 LTS are still quite rough.
But during my journey with LXC, I’ve found an even easier way to create unprivileged LXC containers, without touching PAM, but still requiring the latest LXC and CGManager packages. This solution is based on LXD, the Linux Containers Daemon. There is a really good getting started guide by Stéphane Graber, the man behind LXD. I’m exploring this avenue at the moment and will report soon on this very blog.
All of this makes me really look forward to the next Ubuntu LTS, due next spring: the newer LXC and the under-heavy-development LXD will be part of Ubuntu 16.04 LTS, and this could give a really great out-of-the-box experience with Linux containerisation powers.