Snapd fails to function in L2 LXD container (HOST -> L1 LXD -> L2 LXD) #14770

Open

flotter opened this issue Jan 13, 2025 · 4 comments

flotter commented Jan 13, 2025

Why would this help us

Not all Canonical-hosted runner infrastructure on GitHub supports VM nesting today. This means that, for a consistent Spread test matrix, we cannot simply fall back to Multipass backends in our Spread test infrastructure; some architectures simply will not support it.

Nesting is needed because:

Spread -> Test Suite runs in LXD / Multipass instance -> Craft tools build using Multipass or LXD backend.

So no matter how you spin it, nesting is needed either at the VM level or at the LXD level. An LXD solution is preferred as it is less processing-intensive and leaner in general.

I believe the entire company can benefit from this, since many products (most?) have or will have a craft tool. All craft tools share the same software architecture, so they have the same requirements, more or less.

Required information

  • Distribution: Ubuntu Desktop
  • Distribution version: 24.04
  • The output of "lxc info" or if that fails:

root@test1:~# lxc info --project=Xcraft --show-log base-instance-Xcraft-buildd-base-v7-c-a839ea97c42df2065713
Name: base-instance-Xcraft-buildd-base-v7-c-a839ea97c42df2065713
Status: RUNNING
Type: container
Architecture: x86_64
PID: 925
Created: 2025/01/08 15:47 UTC
Last Used: 2025/01/12 15:52 UTC

Resources:
  Processes: 8
  CPU usage:
    CPU usage (in seconds): 9
  Memory usage:
    Memory (current): 173.43MiB
  Network usage:
    eth0:
      Type: broadcast
      State: UP
      Host interface: veth4e343162
      MAC address: 00:16:3e:d3:01:b1
      MTU: 1500
      Bytes received: 34.45kB
      Bytes sent: 37.40kB
      Packets received: 229
      Packets sent: 340
      IP addresses:
        inet:  10.128.136.230/24 (global)
        inet6: fd42:57fd:66d2:84e7:216:3eff:fed3:1b1/64 (global)
        inet6: fe80::216:3eff:fed3:1b1/64 (link)
    lo:
      Type: loopback
      State: UP
      MTU: 65536
      Bytes received: 0B
      Bytes sent: 0B
      Packets received: 0
      Packets sent: 0
      IP addresses:
        inet:  127.0.0.1/8 (local)
        inet6: ::1/128 (local)

Log:

lxc Xcraft_base-instance-Xcraft-buildd-base-v7-c-a839ea97c42df2065713 20250112155231.414 ERROR cgroup2_devices - ../src/src/lxc/cgroups/cgroup2_devices.c:bpf_program_load_kernel:332 - Operation not permitted - Failed to load bpf program: (null)
lxc Xcraft_base-instance-Xcraft-buildd-base-v7-c-a839ea97c42df2065713 20250112155231.427 WARN idmap_utils - ../src/src/lxc/idmap_utils.c:lxc_map_ids:165 - newuidmap binary is missing
lxc Xcraft_base-instance-Xcraft-buildd-base-v7-c-a839ea97c42df2065713 20250112155231.427 WARN idmap_utils - ../src/src/lxc/idmap_utils.c:lxc_map_ids:171 - newgidmap binary is missing
lxc Xcraft_base-instance-Xcraft-buildd-base-v7-c-a839ea97c42df2065713 20250112155231.428 WARN idmap_utils - ../src/src/lxc/idmap_utils.c:lxc_map_ids:165 - newuidmap binary is missing
lxc Xcraft_base-instance-Xcraft-buildd-base-v7-c-a839ea97c42df2065713 20250112155231.428 WARN idmap_utils - ../src/src/lxc/idmap_utils.c:lxc_map_ids:171 - newgidmap binary is missing
lxc Xcraft_base-instance-Xcraft-buildd-base-v7-c-a839ea97c42df2065713 20250113065423.514 WARN idmap_utils - ../src/src/lxc/idmap_utils.c:lxc_map_ids:165 - newuidmap binary is missing
lxc Xcraft_base-instance-Xcraft-buildd-base-v7-c-a839ea97c42df2065713 20250113065423.514 WARN idmap_utils - ../src/src/lxc/idmap_utils.c:lxc_map_ids:171 - newgidmap binary is missing
lxc Xcraft_base-instance-Xcraft-buildd-base-v7-c-a839ea97c42df2065713 20250113065833.819 WARN idmap_utils - ../src/src/lxc/idmap_utils.c:lxc_map_ids:165 - newuidmap binary is missing
lxc Xcraft_base-instance-Xcraft-buildd-base-v7-c-a839ea97c42df2065713 20250113065833.820 WARN idmap_utils - ../src/src/lxc/idmap_utils.c:lxc_map_ids:171 - newgidmap binary is missing
lxc Xcraft_base-instance-Xcraft-buildd-base-v7-c-a839ea97c42df2065713 20250113065933.617 WARN idmap_utils - ../src/src/lxc/idmap_utils.c:lxc_map_ids:165 - newuidmap binary is missing
lxc Xcraft_base-instance-Xcraft-buildd-base-v7-c-a839ea97c42df2065713 20250113065933.617 WARN idmap_utils - ../src/src/lxc/idmap_utils.c:lxc_map_ids:171 - newgidmap binary is missing
lxc Xcraft_base-instance-Xcraft-buildd-base-v7-c-a839ea97c42df2065713 20250113070344.811 WARN idmap_utils - ../src/src/lxc/idmap_utils.c:lxc_map_ids:165 - newuidmap binary is missing
lxc Xcraft_base-instance-Xcraft-buildd-base-v7-c-a839ea97c42df2065713 20250113070344.812 WARN idmap_utils - ../src/src/lxc/idmap_utils.c:lxc_map_ids:171 - newgidmap binary is missing

Issue description


https://github.com/canonical/lxd-ci/blob/19fab31a94862a6eb24f33994839d26c8e778a19/tests/container#L41

Steps to reproduce

  1. Create an L1 LXD instance on the host
  2. Install a craft tool like snapcraft or rockcraft on it
  3. Create a minimal YAML file
  4. Use the LXD backend to build the artefact with --use-lxd (a command sketch follows this list)
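
A minimal command sketch of these steps, assuming snapcraft as the craft tool; the instance name (l1), project name (hello-world) and snapcraft.yaml contents below are hypothetical, and security.nesting must be enabled on the L1 container so that snapcraft's LXD backend can create the L2 container:

HOST   ~/> lxc launch ubuntu:24.04 l1 -c security.nesting=true
HOST   ~/> lxc exec l1 -- snap install snapcraft --classic
HOST   ~/> lxc exec l1 -- snap install lxd
HOST   ~/> lxc exec l1 -- lxd init --auto
HOST   ~/> lxc exec l1 -- bash

L1 LXD ~/> mkdir hello-world && cd hello-world
L1 LXD ~/> cat > snapcraft.yaml <<'EOF'
name: hello-world
base: core24
version: '0.1'
summary: Minimal test snap
description: Minimal test snap used to trigger a nested LXD build
grade: devel
confinement: strict
parts:
  nil:
    plugin: nil
EOF
L1 LXD ~/> snapcraft --use-lxd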

L1 LXD ~/> lxc --project Xcraft exec local:base-instance-Xcraft-buildd-base-v7-c-a839ea97c42df2065713 -- env CRAFT_MANAGED_MODE=1 DEBIAN_FRONTEND=noninteractive DEBCONF_NONINTERACTIVE_SEEN=true DEBIAN_PRIORITY=critical systemctl restart snapd.service

Job for snapd.service failed because the control process exited with error code.
See "systemctl status snapd.service" and "journalctl -xeu snapd.service" for details.

L1 LXD ~/> lxc exec --project=Xcraft base-instance-Xcraft-buildd-base-v7-c-a839ea97c42df2065713 -- /bin/bash

L2 LXD ~/> journalctl -xeu snapd.service

Jan 13 06:59:35 Xcraft-hello-world-on-amd64-for-amd64-38803601 snapd[836]: overlord.go:274: Acquiring state lock file
Jan 13 06:59:35 Xcraft-hello-world-on-amd64-for-amd64-38803601 snapd[836]: overlord.go:279: Acquired state lock file
Jan 13 06:59:35 Xcraft-hello-world-on-amd64-for-amd64-38803601 snapd[836]: daemon.go:250: started snapd/2.66.1+24.04 (series 16; classic) ubuntu/24.04 (amd64) linux/6.8.0-51-generic.
Jan 13 06:59:35 Xcraft-hello-world-on-amd64-for-amd64-38803601 snapd[836]: main.go:142: system does not fully support snapd: apparmor detected but insufficient permissions to use it
Jan 13 06:59:35 Xcraft-hello-world-on-amd64-for-amd64-38803601 snapd[836]: daemon.go:353: adjusting startup timeout by 30s (pessimistic estimate of 30s plus 5s per snap)
Jan 13 06:59:35 Xcraft-hello-world-on-amd64-for-amd64-38803601 snapd[836]: backends.go:58: AppArmor status: apparmor is enabled and all features are available
Jan 13 06:59:35 Xcraft-hello-world-on-amd64-for-amd64-38803601 snapd[836]: cannot run daemon: state startup errors: [cannot reload snap-confine apparmor profile: cannot load apparmor profiles: exit status 243
Jan 13 06:59:35 Xcraft-hello-world-on-amd64-for-amd64-38803601 snapd[836]: apparmor_parser output:
Jan 13 06:59:35 Xcraft-hello-world-on-amd64-for-amd64-38803601 snapd[836]: /usr/sbin/apparmor_parser: Unable to replace "mount-namespace-capture-helper". /usr/sbin/apparmor_parser: Access denied. You need policy admin privileges to manage profiles.
Jan 13 06:59:35 Xcraft-hello-world-on-amd64-for-amd64-38803601 snapd[836]: /usr/sbin/apparmor_parser: Unable to replace "/usr/lib/snapd/snap-confine". /usr/sbin/apparmor_parser: Access denied. You need policy admin privileges to manage profiles.
Jan 13 06:59:35 Xcraft-hello-world-on-amd64-for-amd64-38803601 snapd[836]: ]
Jan 13 06:59:35 Xcraft-hello-world-on-amd64-for-amd64-38803601 systemd[1]: snapd.service: Main process exited, code=exited, status=1/FAILURE
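
For reference, the same failure can be triggered by hand from inside the L2 container by asking apparmor_parser to replace the snap-confine profile directly. A sketch, assuming snapd was installed from the Ubuntu archive so the profile sits at /etc/apparmor.d/usr.lib.snapd.snap-confine.real (the path differs when snapd comes from the snapd snap):

L2 LXD ~/> apparmor_parser -r /etc/apparmor.d/usr.lib.snapd.snap-confine.real

This is expected to fail with the same "Access denied. You need policy admin privileges to manage profiles." error seen in the journal above, whereas snapd reloading the same profile one level down, in the L1 container, works.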

Information to attach

  • Any relevant kernel output (dmesg)
  • Container log (lxc info NAME --show-log)
  • Container configuration (lxc config show NAME --expanded)

security.nesting=true is set on both container levels (see the sketch after this list).

  • Main daemon log (at /var/log/lxd/lxd.log or /var/snap/lxd/common/lxd/logs/lxd.log)
  • Output of the client with --debug
  • Output of the daemon with --debug (alternatively output of lxc monitor while reproducing the issue)
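
For completeness, a sketch of how security.nesting can be set and verified on both levels; the L1 instance name (l1) is hypothetical, while the L2 instance is the Xcraft build container shown above:

HOST   ~/> lxc config set l1 security.nesting true
L1 LXD ~/> lxc config set --project=Xcraft base-instance-Xcraft-buildd-base-v7-c-a839ea97c42df2065713 security.nesting true
L1 LXD ~/> lxc config show --project=Xcraft --expanded base-instance-Xcraft-buildd-base-v7-c-a839ea97c42df2065713 | grep security.nesting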
flotter commented Jan 13, 2025

@mihalicyn

Hi Frederik!

I can reproduce your problem quite easily with just 2 levels of LXD containers, and yes, snapd refuses to work inside the L2 container. I can say that it's not a degradation, at least not from the LXD side.

I'll play with snapd more to figure out what we can do (if we can) from the LXD side to make it work (maybe interception can help, or some changes to the AppArmor profile for LXD instances).

flotter commented Jan 13, 2025

@mihalicyn

Oh, so 3 levels deep? That's not something I've tested. Why is this needed in this case?

The containers themselves are 2 levels nested, i.e. host -> L1 -> L2, but yes, it's snapd that we run at the bottom of that chain, host -> L1 -> L2 -> snapd.

And as I can see, we even have this skipped in tests:
https://github.com/canonical/lxd-ci/blob/19fab31a94862a6eb24f33994839d26c8e778a19/tests/container#L41

flotter commented Jan 13, 2025

@tomponline

This is most likely a limitation of AppArmor stacking, but @mihalicyn is investigating whether there is a workaround.

flotter commented Jan 13, 2025

@jrjohansen

AppArmor's stacking itself doesn't have a limit, but if we are talking about nested user namespaces, AppArmor currently can only properly distinguish between the root namespace and its first child. This is an LSM limitation that we are working on fixing.
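
A quick way to observe the stacking described above is to read the AppArmor confinement label at each level; a sketch (the exact labels depend on instance names and LXD version):

HOST   ~/> cat /proc/self/attr/current
L1 LXD ~/> cat /proc/self/attr/current
L2 LXD ~/> cat /proc/self/attr/current

On the host a root shell is typically unconfined, inside the L1 container the label shows the lxd-<instance> profile, and inside the L2 container it shows a stacked label with the two profiles joined by //&, which is where the current LSM support stops being able to distinguish deeper levels.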
