Commit 245f7772bd added code to abort the build if a snap wants to
install "core" (the 16.04 runtime). That's great but there are still
some CPC maintained image builds that use snaps based on "core". So
make it possible to continue the build if the "ALLOW_CORE_SNAP" env
variable is set.
This fixes GCE shielded VM instances integrity monitoring failures on
focal and later. Our images are built with an empty /boot/grub/grubenv
file, however after the first boot `initrdless_boot_fallback_triggered`
is set to 0. This change in `grubenv` results in integrity monitoring
`lateBootReportEvent` error.
It seems that the only thing that's checking for this `grubenv` variable
is `grub-common.service`, and it is looking specifically for a `1`
value:
if grub-editenv /boot/grub/grubenv list | grep -q
initrdless_boot_fallback_triggered=1; then echo "grub:
GRUB_FORCE_PARTUUID set, initrdless boot paniced, fallback triggered.";
fi
Unsetting this variable instead of setting it to 0 would prevent issues
with integrity monitoring.
LP: 1960537 illustrates an issue where the calls to e2fsck in the
umount_partition call are failing due to an open file handle. At this
time, we are unable to find a root cause, and it's causing many builds
to fail for CPC. Adding a sleep 30 as a workaround as the file handle
releases within that timeframe. This does not address root cause.
livecd-rootfs creates non-private mounts. When building locally using
the auto/build script unmounting fails.
To unmount dev/pts it is insufficient to make the mount private. Its
parents must be private too. Change teardown_mountpoint() accordingly.
Signed-off-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com>
LP: 1944004 described an issue where a libc transition caused snapd
seccomp profiles to reference a path that no longer existed, leading to
permission denied errors. The committed fix for snapd then raised an
issue where running `snapd debug seeding` would present a
preseed-system-key and seed-restart-system-key due to a mismatch
between the running kernel capabilities and the profiles being loaded by
snapd. By mounting a cgroup2 type to /sys/fs/cgroup, the capabilities
match for snapd as mounted in the chroot. This is done similarly to
live-build/functions:138-140 where apparmour and seccomp actions are
mounted after updating the buildd.
With that, the Dockerfile modifications[0] currently done externally
are done now here. That means that the created rootfs tarball can be
directly used within a Dockerfile to create a container from scratch:
FROM scratch
ADD livecd.ubuntu-oci.rootfs.tar.gz /
CMD ["/bin/bash"]
[0]
https://github.com/tianon/docker-brew-ubuntu-core/blob/master/update.sh
This is a copy of the ubuntu-base project.
Currently ubuntu-base is used as a base for the docker/OCI container
images. The rootfs tarball that is created with ubuntu-base is
published under [0]. That tarball is used in the FROM statement of the
Dockerfile as base and then a couple of modifications are done inside
of the Dockerfile[1].
The ubuntu-oci project will include the changes that are currently
done in the Dockerfile. With that:
1) a Dockerfile using that tarball will be just a 2 line thing:
FROM scratch
ADD ubuntu-hirsute-core-cloudimg-amd64-root.tar.gz /
CMD ["/bin/bash"]
2) Ubuntu has the full control about the build process of the
docker/OCI container. No external sources (like [1]) need to be
modified anymore.
3) Ubuntu can publish containers without depending on the official
dockerhub containers[2]. Currently the containers for the AWS ECR
registry[3] use as a base[4] the official dockerhub containers. That's
no longer needed because a container just needs a Dockerfile described
in 1)
When the ubuntu-oci project has the modifications from [1] included,
we'll also update [1] to use the ubuntu-oci rootfs tarball as a base
and drop the modifications done at [1].
Note: Creating a new ubuntu-oci project instead of using ubuntu-base
will make sure that we don't break users who are currently using
ubuntu-base rootfs tarballs for doing their own thing.
[0] https://partner-images.canonical.com/core/
[1]
https://github.com/tianon/docker-brew-ubuntu-core/blob/master/update.sh
[2] https://hub.docker.com/_/ubuntu
[3] https://gallery.ecr.aws/ubuntu/ubuntu
[4]
https://launchpad.net/~ubuntu-docker-images/ubuntu-docker-images/+oci/ubuntu/+recipe/ubuntu-20.04
One can call divert_grub; replace_kernel; undivert_grub. And
replace_kernel will call into force_boot_without_initramfs, which
under certain conditions can call divert_grub &
undivert_grub. Resulting in undivert_grub called twice in a row.
When undivert_grub is called twice in a row it wipes
systemd-detect-virt binary from disk, as the rm call is unguarded to
check that there is something to divert if systemd package is
installed. And if the systemd package is not installed, it does not
check that systemd-detect-virt file is in-fact what divert_grub has
created.
Add a guard to check that systemd-detect-virt is the placeholder one,
before removing it.
LP: #1902260
There was a question on if the comment removals in the `sed` were
required. The comments (`#`) are created by vmdk-stream-converter and
seem to cause no issues. `ddb.comment` is no longer being written by the
tool anymore. Moved the check earlier to ensure the new header isn't too
large before running truncate (otherwise it may be too long, and we
remove bits we want)
LP: #1893898 describes missing vmtools version from the vmdk headers.
The version should be added as ddb.toolsVersion = "2147483647" however
the sed was no longer replacing a ddb.comment field with the tools
version. Rather than subbing ddb.comment with toolsVersion, this commit
deletes ddb.comment (which the comment mentions could cause errors),
and adds the correct value. There was no visibility into the descriptor
during hook creation, so debug statements were added. This allows us to
quickly verify in the logs that bad statements are removed (the possibly
offending commetns), as well as ensuring that the toolsVersion is added
MOUNTPOINT_BACKUP_SOURCE_LIST is exposed when you call
setup_mountpoint. Consumers can use this variable if they need to
explicitly change something in sources.list wihout relying on the name
livecd-rootfs chooses.
In v2.672 the default boot behavior of cloud images changed:
- Prior to v2.672, cloud images with the linux-generic kernel attempt
to boot without an initramfs, would fail, and then retry with an
initramfs.
- After v2.672, cloud images with the linux-generic kernel boot with
an initramfs on the first try.
While the behavior is different between the two, they both result in
an instance that has booted with an initramfs. To ensure the changes
in v2.672 do not regress, we need an automated way to check if we are
attempting to boot without an initramfs and failing.
With this change, when we attempt to boot with an initramfs and fail,
initrdless_boot_fallback_triggered is set to non-zero in the grubenv.
This value can be checked after boot by looking in /boot/grub/grubenv
or by using the grub-editenv list command.
The snap-preseed command can do a number of things during the build
that are currently performed at first boot (apparmor profiles, systemd
unit generation, etc). This patch adds a call to reset the seeding and
apply these optimizations when adding a seeded snap. As a prerequisite
to calling snap-preseed we need to make /dev/mem available as well as
mounts from the host to perform this work, so those are also added here.
I recently pulled initramfs logic out of the base build hook, and
dropped that into the `replace_kernel` function. Any cloud image that
does not leverage the generic virtual kernel was expected to call
`replace_kernel` to pull in a custom kernel. That function will
disable initramfs boot for images that use a custom kernel.
Minimal cloud images on amd64 use the linux-kvm kernel, but the build
hook does not utilize the `replace_kernel` function. Instead, the
kernel flavor is set in `auto/config`. I pulled that logic out of
`auto/config` and am now calling `replace_kernel` in the build hook.
I also moved a call to generate the package list so that it will pick
up the change to the linux-kvm kernel.
snap_name[/classic]=track/risk/branch is now the supported snap name
specification, which allows to specify the full default track and
optional classic confinemnt.
Supporting such specification in the seedtext allows one to specify a
better default channel. For example, this will allow lxd to switch
from latest/stable/ubuntu-20.04 to 4.0/stable/ubuntu-20.04 as 4.0 is
the LTS track matching 20.04 support timeframe.
LP: #1882374
Initramfs-less boot, which is a boot optimization, should only be
applied where we know it could work for users and provide an improved
boot boot experience; images with custom kernels are candidates for
that.
Seeing any snap via snap_preseed will evaluate the base for each snap
and seed the appropriate base. There should be no reason to explicitly
seed the 'core' snap and with snaps moving to 'core18' this will add
'core' without need.
The _snap_post_process function is meant to install snapd if core18 is the
only core snap installed or removed snapd if core is installed and snapd
was not explicitly installed. But the current logic in _snap_preseed
will never call _snap_post_process. $core_name will never be empty
with the existing logic, but even if it were that would only be for the
'core' snap and we'd miss using the 'core18' logic that pulls in snapd.
Given the case statement in _snap_post_process can handle doing the
right thing given any snap we can just call it unconditionally.
Back in 2017 some code was added to ignore failures tearing down loop
devices. But debugging that growpart race on cloud images made me (very)
aware of a potential cause of the race: doing something like zerofree on
a device will cause udev scripts to run, and if they are still running
by the time kpartx is called, you would expect the kpartx -d to fail. So
lets see if a udevadm settle helps, and get rid of one of the "sometimes
this fails but we don't know why" comments...