Make LXC great again

Published in OpsOps, Aug 19, 2019

Warning: This post contains a debugging log, which is messy, verbose, and hard to read.

My playbook for creating a staging environment on my laptop with LXC has been broken since Trump became president of the United States. I don’t know whom to blame: Trump, fake news, or upstream changes in LXC. Since I can neither fight the fake news nor impeach Trump, my only option is to fix my playbook to cope with the breaking changes upstream (though I still insist that impeachment would be the easier solution).

The problem

The LXC container creation task fails with the following message:

lxc-create: foobar: conf.c: chown_mapped_root: 3226 lxc-usernsexec failed: No such file or directory - Failed to open ttyNo such file or directory - Failed to open tt
lxc-create: foobar: tools/lxc_create.c: main: 327 Failed to create container foobar

Initial struggle

I’ll skip the Ansible part here, as the problem was quickly narrowed down to this call:

/usr/bin/lxc-create --name foobar --template download --bdev dir -- -d ubuntu -r xenial -a amd64

This bug hints at the source of the problem.

Unfortunately, I already have kernel.unprivileged_userns_clone = 1, and I have both /etc/subuid and /etc/subgid properly configured.

Oh, wait, it’s a different line number. My case is here. They ask to run lxc-usernsexec:

$ lxc-usernsexec
Failed to find subuid or subgid allocation

Good! Finally, something complains about something specific. Google is no help: the question is unanswered.

Trace it!

strace -e openat -f lxc-usernsexec
...
openat(AT_FDCWD, "/proc/filesystems", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/sys/module/apparmor/parameters/enabled", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/etc/nsswitch.conf", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libnss_compat.so.2", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libnss_nis.so.2", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libnsl.so.1", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libnss_files.so.2", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/etc/passwd", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/etc/subuid", O_RDONLY) = 3
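Picking the opened paths out of a trace like this is easy to script. A minimal Python sketch, assuming the strace -e openat line format shown above; the helper name opened_paths is made up for this post and is not part of any tool:

```python
import re

# Matches lines such as:
#   openat(AT_FDCWD, "/etc/subuid", O_RDONLY) = 3
# and captures the quoted path. Assumes the strace output format shown above.
OPENAT_RE = re.compile(r'openat\([^,]+,\s*"([^"]+)"')


def opened_paths(trace_text):
    """Return the paths passed to openat(), in the order they appear."""
    paths = []
    for line in trace_text.splitlines():
        match = OPENAT_RE.search(line)
        if match:
            paths.append(match.group(1))
    return paths


sample = '''\
openat(AT_FDCWD, "/etc/passwd", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/etc/subuid", O_RDONLY) = 3
'''
print(opened_paths(sample)[-1])  # the last file opened in the trace
```

The last path in the list is the interesting one: it is the file the tool was reading when it gave up.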

Stop. I’m going to read the source code: src/lxc/cmd/lxc_usernsexec.c, if anyone is interested.

ret = read_default_map(subuidfile, ID_TYPE_UID, pwent.pw_name);
if (ret < 0)
    goto out;

Looks like we are here, and it matches the trace.

conf.h says that subuidfile is /etc/subuid. As I said earlier, mine is present:

df:100000:65536
auser:165536:65536
steam:231072:65536

I have no idea why steam and df are here (shame on me, I have both Steam and Dwarf Fortress running under separate user accounts).

Before reading all that, I want to be sure no dwarves were harmed. I trimmed those files to a single auser line. Nah, I got the same error.

Now, let’s read read_default_map aloud.

… no, I can’t. Too much C there. I can’t.

Meanwhile, I peeked into the manual page for lxc-usernsexec. There is something unusual there about the format of those files.

Each map consists of four colon-separate values. First a character 'u', 'g' or 'b' to specify whether this map pertains to user ids, group ids, or both; next the first userid in the user namespace; next the first userid as seen on the host; and finally the number of ids to be mapped.

WUT?

I changed the file content to:

b:user:165536:65536

And it started to show different messages. Meeeh. Thank you very much. It’s the second silent breaking change in a configuration file format from LXC in the last 3 years.

The previous one was in ~/.config/lxc/default.conf (the change from lxc.id_map to lxc.idmap).

Now I need to understand when this happened. I need to support everything from Xenial up to current releases.

I wasn’t able to find anything related to my issue. Was my original format wrong? I think I need to test this manually on all my versions (eoan+, bionic, xenial).

But who writes those configs? I checked my playbook: I don’t. I use them to find my id ranges, but nothing else. I found that the login.postinst script touches those files, but that’s all.

Oh. There is a newgidmap utility that has something to do with it. Its man page pointed me to man user_namespaces. I feel like I need to read it through… It has a section called ‘User and group ID mappings: uid_map and gid_map’. No, that’s kernel stuff.

Something else turns up.

man subuid

DESCRIPTION
 Each line in /etc/subuid contains a user name and a range of subordinate user ids
 that user is allowed to use. This is specified with three fields delimited by
 colons (“:”). These fields are:
 • login name or UID
 • numerical subordinate user ID
 • numerical subordinate user ID count

What, wait…

Should I have ‘b’/‘u’/‘g’ at the start of each line in /etc/subuid, or shouldn’t I?
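Going by man subuid alone, the answer would be no: the three-field form is what the file holds, while the four-field u/g/b form in the lxc-usernsexec man page describes, as far as I can tell, a map passed on the command line. A small Python sketch of a checker for the three-field form; parse_subid_line and SubidRange are hypothetical names, not part of shadow or LXC:

```python
import re
from typing import NamedTuple


class SubidRange(NamedTuple):
    owner: str   # login name or UID (first field)
    start: int   # first subordinate ID
    count: int   # number of subordinate IDs


def parse_subid_line(line):
    """Parse one /etc/subuid or /etc/subgid line in the three-field form."""
    fields = line.strip().split(":")
    if len(fields) != 3:
        raise ValueError(
            f"expected 3 colon-separated fields, got {len(fields)}: {line!r}")
    owner, start, count = fields
    # Both numeric fields must be plain decimal digits.
    if not owner or not re.fullmatch(r"[0-9]+", start) \
            or not re.fullmatch(r"[0-9]+", count):
        raise ValueError(f"malformed subid line: {line!r}")
    return SubidRange(owner, int(start), int(count))


print(parse_subid_line("auser:165536:65536"))
try:
    parse_subid_line("b:user:165536:65536")
except ValueError as exc:
    print("rejected:", exc)
```

The second call is rejected because a line with a leading b: has four fields, not three.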

I feel I’m forced to read this C aloud. I don’t want to. The best idea I came up with is to put as many printfs into that damn file as I can and see what happens and where.

I put tons of printfs into the source code, and what did I get?

read_default_map called
. while
. p1 ok
. p2 ok
. newmap ok
. p1 starts with 165536:65536
lxc_safe_ulong return -22
lxc_safe_ulong for p1 is failed

Booo!

lxc_safe_ulong does not like the string 165536. Why?

int lxc_safe_ulong(const char *numstr, unsigned long *converted)
{
    char *err = NULL;
    unsigned long int uli;

    while (isspace(*numstr))
        numstr++;

    if (*numstr == '-')
        return -EINVAL;

    errno = 0;
    uli = strtoul(numstr, &err, 0);
    if (errno == ERANGE && uli == ULONG_MAX)
        return -ERANGE;

    if (err == numstr || *err != '\0')
        return -EINVAL;

    *converted = uli;
    return 0;
}
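Judging by the debug output, p1 still pointed at 165536:65536 when it was handed to the converter. If that is right, strtoul would stop at the colon and the final check (*err != '\0') would return -EINVAL, which is 22 on Linux and matches the -22 above. Here is a rough Python mimic of the function to show the effect. It is a simplification that accepts decimal digits only (the C code uses base 0), and safe_ulong_sketch is a name made up for this post:

```python
import errno
import re

ULONG_MAX = 2**64 - 1  # assumes a 64-bit unsigned long, as on x86_64 Linux


def safe_ulong_sketch(numstr):
    """Rough mimic of lxc_safe_ulong(); returns (status, value)."""
    s = numstr.lstrip()                 # like the isspace() loop
    if s.startswith("-"):
        return -errno.EINVAL, None      # negative numbers are rejected
    digits = re.match(r"[0-9]+", s)     # simplification: decimal only
    if digits is not None and int(digits.group()) > ULONG_MAX:
        return -errno.ERANGE, None      # value does not fit
    if digits is None or digits.end() != len(s):
        # nothing converted (err == numstr) or trailing characters
        # left after the number (*err != '\0')
        return -errno.EINVAL, None
    return 0, int(digits.group())


print(safe_ulong_sketch("165536"))
print(safe_ulong_sketch("165536:65536"))
```

The first call succeeds; the second fails only because of the :65536 left after the number.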

My worry was that the conversion of p1 leaves a non-empty string in err. I patched that and found something else:

/tmp/lxc-3.0.3$ newuidmap $$ 0 94912564338704 65536
newuidmap: subuid overflow detected.

which is true:

newuidmap $$ 0 94912564338704 65536
newuidmap: subuid overflow detected.

Why is loweruid 94912564338704?

I checked newmap value at the end of read_default_map, and it was reasonable:

read_default_map's newmap:
 : idtype 0
 : hostid 165536
 : nsid 0
 : range 65536

… about half an hour later, somewhere in src/lxc/utils.c:run_command

. before 1st snprintf newuidmap 2666
: map->nsid: 0
: map->hostid: 94259564703760
: map->range: 65536
. after 1st snprintf newuidmap 2666 0 94259564703760 65536

Yes. What is hostid, and why is it so big? Hostid was 165536 (as far as I understand, this is the ‘start id’ for mapped users; here they recommend making it 100000).

… and it keeps changing on each run.

* 94742822752272
* 94003596795920
* 94355804921872
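One thing stands out once these numbers are printed in hex. On x86_64 Linux with address-space randomization, position-independent executables and their heap typically land at addresses starting with 0x55 or 0x56, so a value in that range that changes on every run looks more like a pointer (or uninitialized memory) than a UID. That is only a guess, not a diagnosis; the Python snippet below just does the arithmetic on the values from the debug output:

```python
# hostid / loweruid values copied from the debug output above
observed = [
    94912564338704,
    94259564703760,
    94742822752272,
    94003596795920,
    94355804921872,
]

for value in observed:
    print(f"{value:>16d} = {value:#x}")

# Collect bits 40 and up of each value (the leading hex byte).
top_bytes = sorted({value >> 40 for value in observed})
print([hex(b) for b in top_bytes])  # ['0x55', '0x56']
```

All five values fit in 47 bits, which is the size of the user address space on x86_64 with 4-level paging.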

Looks like a huge bug. … Is it?

Am I really doing it right? I don’t want to debug something I don’t understand.

Let me step back to a working config on Bionic and compare results.

… This is an absolute mystery. My code stopped working even with the old OSes. Did they ‘fix’ something? I reported a bug; unfortunately, it’s August, so I have had no response in the last three days.

Day Two

As I have nothing important to do, I will indulge myself with this problem. The proper business decision would be to just switch containerization technology, or to give up on “user containers” and switch to ‘root’ ones, but that would expose my precious workstations to the perils of a rogue sudoed Ansible, which I do not fancy at all.

Nevertheless, root containers are a good thing to check. Do they work? Not on my workstation, of course, but on some spare bionic VM…

A few pieces were missing for root containers:

gnupg to download images
apparmor package to control apparmor

I got another error:

lxc-start u1 20190819080821.946 ERROR    lsm - lsm/lsm.c:lsm_process_label_set_at:174 - No such file or directory - Failed to set AppArmor label "lxc-container-default-cgns"

I had to use apt-file to search for lxc-container-default-cgns. Nothing! Not funny at all. Should I continue to read source code? Let’s google first.

Oh, thanks, google. /etc/apparmor.d/lxc-containers.

There is such file in the repo.
There isn’t such file in any ubuntu package.

Oh, bionic is old. That stuff is called lxc-default-cgns there, and it is in the liblxc-common package, which I have installed.

Before trying to mess with those files, I will try something simpler. How about rolling back security updates?

apt-cache policy liblxc-common
liblxc-common:
  Installed: 3.0.3-0ubuntu1~18.04.1
  Candidate: 3.0.3-0ubuntu1~18.04.1
  Version table:
 *** 3.0.3-0ubuntu1~18.04.1 500
        500 http://mirror.servers.com/ubuntu bionic-updates/main amd64 Packages
        100 /var/lib/dpkg/status
     3.0.1-0ubuntu1~18.04.2 500
        500 http://mirror.servers.com/ubuntu bionic-security/main amd64 Packages
     3.0.0-0ubuntu2 500
        500 http://mirror.servers.com/ubuntu bionic/main amd64 Packages
...
sudo apt install liblxc-common=3.0.0-0ubuntu2 liblxc1=3.0.0-0ubuntu2 lxc=3.0.0-0ubuntu2 lxc-utils=3.0.0-0ubuntu2 lxcfs=3.0.0-0ubuntu1

(I downgraded them and installed an older version of lxcfs.)

… and, forget the root containers, I can do lxc-usernsexec!

So, here it is: 3.0.3-0ubuntu1~18.04.1 is broken, 3.0.0-0ubuntu2 is not.

I still have plenty of free time, so I can bisect at my leisure.

My plan:

Create a new vanilla Ubuntu Bionic
Install lxc=3.0.0-0ubuntu2
Check if it works
Try upgrading to other versions and see when it breaks
Bisect in the source code
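The last steps of the plan amount to a search for the first broken version. A minimal Python sketch, assuming the versions are sorted oldest to newest and that once a version is broken all later ones are broken too; first_bad and the is_bad callback are hypothetical names, and in practice is_bad would install the version and run lxc-usernsexec:

```python
def first_bad(versions, is_bad):
    """Return the first version for which is_bad() is true, or None.

    versions must be ordered oldest to newest, and is_bad must be
    monotonic: once it returns True it stays True for later versions.
    """
    lo, hi = 0, len(versions)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid          # first bad version is at mid or earlier
        else:
            lo = mid + 1      # first bad version is after mid
    return versions[lo] if lo < len(versions) else None


# Stand-in data: pretend everything from "v4" on is broken.
candidates = ["v1", "v2", "v3", "v4", "v5"]
print(first_bad(candidates, lambda v: v >= "v4"))  # prints v4
```

With only a handful of package versions a linear walk is just as good; the halving pays off when bisecting commits in the source tree.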

Fresh installation:

apt install lxc=3.0.0-0ubuntu2

… a new .config/lxc/default.conf, and… the same error. Why?

OK, the one thing I did besides the downgrade was a reboot. So, reboot. … and the same error: Failed to find subuid or subgid allocation.

Good thing I hadn’t reinstalled the Ubuntu where this command had worked.

Let’s even out the installed packages.

Aha! I asked for lxc=3.0.0-0ubuntu2, but the other packages were installed at the new version… So, downgrade them… Nah.

Install apparmor and gnupg: nah… Copy subuid/gid/default.conf from the working machine: nah.

Something more subtle is involved here. The error message, meanwhile, has changed:

lxc 20190819085909.735 ERROR    lxc_conf - conf.c:write_id_mapping:2707 - Operation not permitted - Failed to write uid mapping to "/proc/696/uid_map"

Good, I fixed that last one by installing uidmap.

So, something is causing it. Before repeating the fresh installation, I want to try to find the core problem in the dirty state.

Set a simpler subuid/gid config. Good, a simple cloud-user:100000:65536 does work, both before and after a reboot.
Upgrade the lxc packages one by one: liblxc-common -> 3.0.3-0ubuntu1~18.04.1: BOOM! Now I have the bug.

The installation actually pulled in several packages:

The following additional packages will be installed:
  liblxc1 lxc-utils
Suggested packages:
  btrfs-tools lvm2 lxc-templates lxctl
Recommended packages:
  libpam-cgfs
The following packages will be upgraded:
  liblxc-common liblxc1 lxc-utils

Something here is broken. Meanwhile, the downgrade does help. I can remove the lxcfs package with no issues, so we are down to these:

ii  liblxc-common                         3.0.0-0ubuntu2                    
ii  liblxc1                               3.0.0-0ubuntu2                    
ii  lxc                                   3.0.0-0ubuntu2                    
ii  lxc-utils                             3.0.0-0ubuntu2

And lxc is a transitional package for lxc-utils. Can I remove it without breaking anything?

Yep. Three packages left: liblxc-common, liblxc1 and lxc-utils. All three have the same source package:

Source: lxc

That means we can bisect the problem within a single source package. Good!

Now I need to find the minimal version delta. So far I have three candidates to install:

     3.0.3-0ubuntu1~18.04.1 500
        500 http://mirror.servers.com/ubuntu bionic-updates/main amd64 Packages
     3.0.1-0ubuntu1~18.04.2 500
        500 http://mirror.servers.com/ubuntu bionic-security/main amd64 Packages
 *** 3.0.0-0ubuntu2 500
        500 http://mirror.servers.com/ubuntu bionic/main amd64 Packages
        100 /var/lib/dpkg/status

Upgrading to 3.0.1-0ubuntu1~18.04.2

It works both before and after a reboot.

So, Something Horrible has happened between 3.0.1-0ubuntu1~18.04.2 and 3.0.3-0ubuntu1~18.04.1.

But wait! Let me ask you, which version is newer: 3.0.1 or 3.0.3? Looks like 3.0.3. But look at the tail of the version. Is ~18.04.1 newer than ~18.04.2? I don’t believe so. So it’s some creepy Ubuntu-style versioning that breaks things.
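For what it is worth, Debian-style versions have the form [epoch:]upstream[-revision], and dpkg compares the upstream part before it ever looks at the revision. Under those rules the ~18.04.x tail only matters when the upstream versions are equal, so 3.0.3-0ubuntu1~18.04.1 does sort as newer than 3.0.1-0ubuntu1~18.04.2. The sketch below is a simplified Python rendering of that ordering. The function names are made up for this post, and on a real system dpkg --compare-versions is the authority:

```python
import re
from itertools import zip_longest


def _char_order(ch):
    """Sort weight of one character in a non-digit run."""
    if ch is None:
        return 0               # end of the run
    if ch == "~":
        return -1              # tilde sorts before everything, even the end
    if ch.isalpha():
        return ord(ch)         # letters sort before other characters
    return ord(ch) + 256


def _compare_part(a, b):
    """Compare two upstream or revision strings; returns -1, 0 or 1."""
    # Split each string into alternating (non-digit run, digit run) pairs.
    pairs_a = re.findall(r"(\D*)(\d*)", a)
    pairs_b = re.findall(r"(\D*)(\d*)", b)
    for (text_a, num_a), (text_b, num_b) in zip_longest(
            pairs_a, pairs_b, fillvalue=("", "")):
        for ch_a, ch_b in zip_longest(text_a, text_b):
            wa, wb = _char_order(ch_a), _char_order(ch_b)
            if wa != wb:
                return -1 if wa < wb else 1
        na, nb = int(num_a or 0), int(num_b or 0)
        if na != nb:
            return -1 if na < nb else 1
    return 0


def _split(version):
    """Split '[epoch:]upstream[-revision]' into its three parts."""
    epoch, sep, rest = version.partition(":")
    if not sep:
        epoch, rest = "0", version
    upstream, sep, revision = rest.rpartition("-")
    if not sep:
        upstream, revision = rest, ""
    return int(epoch), upstream, revision


def compare_versions(a, b):
    """Return -1, 0 or 1 when a is older than, equal to, or newer than b."""
    epoch_a, up_a, rev_a = _split(a)
    epoch_b, up_b, rev_b = _split(b)
    if epoch_a != epoch_b:
        return -1 if epoch_a < epoch_b else 1
    return _compare_part(up_a, up_b) or _compare_part(rev_a, rev_b)


print(compare_versions("3.0.3-0ubuntu1~18.04.1", "3.0.1-0ubuntu1~18.04.2"))  # 1
```

The tilde rule is what lets a backport such as 0ubuntu1~18.04.1 sort just below the plain 0ubuntu1 it was built from.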

Let’s look into the changelogs… They are useless, though there is something about ‘bugfixes’ for lxc-usernsexec. Should I try to bisect this manually down to the source? Unfortunately, there are tons of changes, so I won’t go that way.

I posted an update to the bug report to help the afflicted.

Back to my machine

I’m out of luck, as my machine has only one option: lxc=3.0.3, and the bug is there.

What to do? Well, my first simple attempt would be to install bionic’s version of 3.0.1.

I’ll simply copy those three packages from the remote bionic to my machine… Now I need to be a bit more careful to avoid breaking My Precious…

I did this. I got another error:

lxc 20190819094533.418 ERROR    lxc_conf - conf.c:lxc_map_ids:2919 - newgidmap failed to write mapping "newgidmap: write to gid_map failed: Invalid argument": newgidmap 14003

That’s funny. I even checked the uidmap version: it’s the same.

The thing they called:

newuidmap $$ 0 165536 65536

It works neither on my machine nor on the remote one. But the remote lxc-usernsexec works. Let’s look at the working strace… Their exec from strace does not work either, so I think there is something fishy about ‘$$’. Let’s test with an external sleep in the background. That does not work with sleep on the working machine either… I’m doing something wrong…

newuidmap pid uid loweruid count
...
newuidmap verifies that the caller is the owner of the process indicated by pid and that for each of the above sets, each of the
       UIDs in the range [loweruid, loweruid+count] is allowed to the caller according to /etc/subuid before setting
       /proc/[pid]/uid_map
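The range check described in that man page excerpt can be sketched in a few lines of Python. This is an illustration, not newuidmap's actual code: allowed_by_subuid is a made-up name, the ranges are treated as half-open (start up to but not including start + count), and the sample data is the /etc/subuid content shown earlier:

```python
def allowed_by_subuid(subuid_text, user, lower, count):
    """True if [lower, lower + count) lies inside one of user's ranges."""
    for line in subuid_text.splitlines():
        fields = line.strip().split(":")
        if len(fields) != 3 or fields[0] != user:
            continue                      # not a three-field line for this user
        try:
            start, size = int(fields[1]), int(fields[2])
        except ValueError:
            continue                      # skip lines with non-numeric fields
        if start <= lower and lower + count <= start + size:
            return True
    return False


subuid = """\
df:100000:65536
auser:165536:65536
steam:231072:65536
"""
print(allowed_by_subuid(subuid, "auser", 165536, 65536))  # True
print(allowed_by_subuid(subuid, "auser", 100000, 65536))  # False
```

Note that a line with an extra leading field never matches the user name here, so the whole allocation silently disappears.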

I’m doing something wrong. Let me make newuidmap work… Why can’t I write this into my sleep? I do own it!

newuidmap 1121 0 231072 30
newuidmap: write to uid_map failed: Operation not permitted

Eh… Forget about it. I found the reason. My /etc/subgid was mangled by my earlier attempts (it had ‘b’ before the username).

Now it works! MUhahahahaha.

What is left is to hardcode the downgrade/hold into my playbook, and we are ready to go. I really hope this bug will be fixed shortly.
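A rough idea of what the downgrade/hold step boils down to, written as a Python helper that only builds the command lines and runs nothing. It assumes the wanted version is installable from a configured repository, which was not the case on the workstation described above, where the packages were copied over by hand. The helper name is made up for this post:

```python
def downgrade_and_hold_commands(packages, version):
    """Build argv lists that pin packages to version and hold them there."""
    pinned = [f"{name}={version}" for name in packages]
    return [
        # install the exact version, allowing apt to go backwards
        ["apt-get", "install", "-y", "--allow-downgrades", *pinned],
        # keep later upgrades from pulling the newer version back in
        ["apt-mark", "hold", *packages],
    ]


lxc_packages = ["liblxc-common", "liblxc1", "lxc-utils"]
for argv in downgrade_and_hold_commands(lxc_packages, "3.0.1-0ubuntu1~18.04.2"):
    print(" ".join(argv))
```

Once a fixed version is published, apt-mark unhold on the same packages undoes the hold.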

Well, I’ve reported it to Ubuntu: https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/1840639

Finale

If you found this article and have trouble reading it through, the short answer on the probable cause is:

Try downgrading the lxc packages to version 3.0.1, as 3.0.3 looks broken.