In v2.672 the default boot behavior of cloud images changed:
- Prior to v2.672, cloud images with the linux-generic kernel attempt
to boot without an initramfs, would fail, and then retry with an
initramfs.
- After v2.672, cloud images with the linux-generic kernel boot with
an initramfs on the first try.
While the behavior is different between the two, they both result in
an instance that has booted with an initramfs. To ensure the changes
in v2.672 do not regress, we need an automated way to check if we are
attempting to boot without an initramfs and failing.
With this change, when we attempt to boot with an initramfs and fail,
initrdless_boot_fallback_triggered is set to non-zero in the grubenv.
This value can be checked after boot by looking in /boot/grub/grubenv
or by using the grub-editenv list command.
Builds in LP with the Xenial kernel were happy with the recursive mount of
/sys inside the chroot while performing snap-preseeding but autopkgtests
with the groovy kernel failed. With the groovy kernel the build was
unable to unmount sys/kernel/slab/*/cgroup/* (Operation not permitted).
This patch mounts /sys and /sys/kernel/security in the chroot in the
same way we've added for binary hooks. This provides the paths under
/sys needed for snap-preseed while avoiding issues unmounting other
paths.
The snap-preseed command can do a number of things during the build
that are currently performed at first boot (apparmor profiles, systemd
unit generation, etc). This patch adds a call to reset the seeding and
apply these optimizations when adding a seeded snap. As a prerequisite
to calling snap-preseed we need to make /dev/mem available as well as
mounts from the host to perform this work, so those are also added here.
I recently pulled initramfs logic out of the base build hook, and
dropped that into the `replace_kernel` function. Any cloud image that
does not leverage the generic virtual kernel was expected to call
`replace_kernel` to pull in a custom kernel. That function will
disable initramfs boot for images that use a custom kernel.
Minimal cloud images on amd64 use the linux-kvm kernel, but the build
hook does not utilize the `replace_kernel` function. Instead, the
kernel flavor is set in `auto/config`. I pulled that logic out of
`auto/config` and am now calling `replace_kernel` in the build hook.
I also moved a call to generate the package list so that it will pick
up the change to the linux-kvm kernel.
snap_name[/classic]=track/risk/branch is now the supported snap name
specification, which allows to specify the full default track and
optional classic confinemnt.
Supporting such specification in the seedtext allows one to specify a
better default channel. For example, this will allow lxd to switch
from latest/stable/ubuntu-20.04 to 4.0/stable/ubuntu-20.04 as 4.0 is
the LTS track matching 20.04 support timeframe.
LP: #1882374
Initramfs-less boot, which is a boot optimization, should only be
applied where we know it could work for users and provide an improved
boot boot experience; images with custom kernels are candidates for
that.
Generic cloud images with the linux-generic kernel are not able to
boot without an initramfs. Previously, these images attempted to boot
without an initramfs, would fail, and then retry with an initramfs.
This slows the boot and is confusing behavior.
It was reported and confirmed in LP bug #1875400
(https://bugs.launchpad.net/cloud-images/+bug/1875400) that on the public
KVM cloud image there exists a large list of packages marked for auto-removal.
This should never be the case on a released cloud image.
These packages are marked for auto-removal because in the KVM image binary hook
we removed both initramfs-tools and busybox-initramfs packages. Due to package
dependencies this also removed:
busybox-initramfs* cloud-initramfs-copymods* cloud-initramfs-dyn-netconf*
cryptsetup-initramfs* initramfs-tools* initramfs-tools-core* multipath-tools*
overlayroot* sg3-utils-udev* ubuntu-server*
But it did not remove all the packages that the above list depended on.
This resulted in all those packages being marked for auto-removal because they
were not manually installed nor did they have any manually installed packages
that depended on them.
The removal of initramfs-tools and busybox-initramfs was to avoid the
generation of initramfs in images that should boot initramfsless.
This requirement is obsolete now because the initramfsless boot handling
is now handled via setting GRUB_FORCE_PARTUUID in /etc/default/grub.d/40-force-partuuid.cfg.
In test images I have verified that GRUB_FORCE_PARTUUID is set and that
boot speeds have not regressed.
LP: #1875400
Vagrant images were previously put at 10G, but this was a regression
from Trusty, in which they were 40G. This made it a tough sell for
users to upgrade if they were using a Ubuntu desktop experience.
This change does not impact disk usage as Vagrant with the virtualbox
provider dynamically allocates space with the VMDK. On a test system,
the VMDK took up 1.1G of disk space according to df, and after
creating a 2G file in Vagrant, the VMDK grew to 3.1G.
Therefore, users who are running on a system with little free space will
not see adverse effects if they upgrade to a new vagrant image
Seeing any snap via snap_preseed will evaluate the base for each snap
and seed the appropriate base. There should be no reason to explicitly
seed the 'core' snap and with snaps moving to 'core18' this will add
'core' without need.
The _snap_post_process function is meant to install snapd if core18 is the
only core snap installed or removed snapd if core is installed and snapd
was not explicitly installed. But the current logic in _snap_preseed
will never call _snap_post_process. $core_name will never be empty
with the existing logic, but even if it were that would only be for the
'core' snap and we'd miss using the 'core18' logic that pulls in snapd.
Given the case statement in _snap_post_process can handle doing the
right thing given any snap we can just call it unconditionally.
In the buildd image chroot, /etc/resolv.conf is a symbolic link to
a configuration file in the /run directory. A call to truncate will
modify that file, which we should not do. Instead, we want to remove
the symbolic link and replace it with an empty file.
Back in 2017 some code was added to ignore failures tearing down loop
devices. But debugging that growpart race on cloud images made me (very)
aware of a potential cause of the race: doing something like zerofree on
a device will cause udev scripts to run, and if they are still running
by the time kpartx is called, you would expect the kpartx -d to fail. So
lets see if a udevadm settle helps, and get rid of one of the "sometimes
this fails but we don't know why" comments...