The cloud test framework can run into errors (such as SSH connection
drops) that cause it to exit without reporting test results. To handle
these errors, the policy now checks the return code and stores
std_err.
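
A minimal sketch of the return-code check, assuming the framework is
invoked as a subprocess. The helper name run_cloud_tests and the
stand-in command are hypothetical and not taken from the policy code:

```python
import subprocess
import sys


def run_cloud_tests(cmd):
    """Run the test command and return (returncode, stderr text)."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode, proc.stderr


# Hypothetical stand-in for a framework run that dies before reporting results.
fake_framework = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('ssh: connection dropped'); sys.exit(3)",
]

returncode, std_err = run_cloud_tests(fake_framework)

# A non-zero return code means the framework itself failed, so std_err is
# kept for the report instead of assuming that results exist.
framework_failed = returncode != 0
```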
Also updates the "test already run" check so that a state of zero
failures and non-zero errors is not counted, since this means the test
framework encountered an error external to the tests.
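
The rule above can be sketched as a small predicate. The function name
and the idea of passing the two counts directly are assumptions made
for illustration:

```python
def counts_as_already_run(failures, errors):
    """Return True if a stored result should prevent the test re-running.

    Zero failures with non-zero errors means the framework hit a problem
    external to the tests, so that state is not treated as a completed run.
    """
    if failures == 0 and errors > 0:
        return False
    return True
```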
The cloud policy now uses the ADT_PPAS field as the source but will only
take private PPAs, because a fingerprint is required. If ADT_PPAS is
empty then the 'proposed' archive is used.
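
A rough sketch of the fallback only, assuming ADT_PPAS is a
whitespace-separated string. The function name and the returned
(type, sources) pair are hypothetical, and the private-PPA filtering is
deliberately left out because its format is not described here:

```python
def select_source(adt_ppas):
    """Return (source_type, sources) for the cloud tests.

    Assumption: adt_ppas is a whitespace-separated string of PPA entries.
    """
    ppas = adt_ppas.split()
    if ppas:
        return "ppa", ppas
    # Nothing configured: fall back to the 'proposed' archive.
    return "archive", ["proposed"]
```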
Adds logic to try to retrieve the install source of a package from the
logs. This is useful if multiple PPAs are defined, since the policy won't
explicitly know which one contains the package under test.
Changes made
- Multiple hardcoded fields moved to config
- Series is now retrieved from options
- Pocket is now called source and retrieved from config
- Adds source type config which can be either archive or ppa
- Returns REJECTED_PERMANENTLY policy verdict when test failures or
errors occur. Adds verdict info to the excuse.
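
The last item can be sketched as follows. PolicyVerdict and Excuse here
are simplified stand-ins defined only for this example, not the real
britney classes, and the message text is made up:

```python
from enum import Enum


class PolicyVerdict(Enum):
    """Stand-in with only the two verdicts this sketch needs."""
    PASS = 1
    REJECTED_PERMANENTLY = 2


class Excuse:
    """Stand-in that just records verdict info."""

    def __init__(self):
        self.verdict_info = []

    def add_verdict_info(self, verdict, info):
        self.verdict_info.append((verdict, info))


def apply_cloud_results(excuse, failures, errors):
    """Reject permanently, and say why, when any failure or error occurred."""
    if failures or errors:
        verdict = PolicyVerdict.REJECTED_PERMANENTLY
        excuse.add_verdict_info(
            verdict,
            "Cloud testing: %d failures, %d errors" % (failures, errors),
        )
        return verdict
    return PolicyVerdict.PASS
```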
The cloud policy is currently used to test whether the proposed
migration will break networking for an image in the Azure cloud. Cloud
testing will likely increase in scope in the future.
britney currently spends a majority of its runtime querying for baseline
test results that it won't find, and that it doesn't need. Refactor to
eliminate many of these excess queries.
The initial db population for the series takes quite a while, so to not block
on this for the release opening process we can let britney talk directly to
swift in the short term.
@canonical.com is now DKIM signed and SPF published, which means emails
sent directly from proposed-migration running on snakefruit with that
sender domain would likely fail those checks. Since we're here, the
project is Ubuntu related, so switch to using an @ubuntu.com address
instead.
When querying swift there is no way to take only results newer than a
specified point; you can only query newer than or equal to. But for sqlite
we can use > instead of >= and avoid re-processing results we've
already seen.
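
A small illustration of the difference, using an in-memory database.
The table, column names and run ids are invented for this example and
do not reflect the real schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per test result, keyed by a sortable run id.
conn.execute("CREATE TABLE result (run_id TEXT, package TEXT)")
conn.executemany(
    "INSERT INTO result VALUES (?, ?)",
    [
        ("20240101_000000", "a"),
        ("20240102_000000", "b"),
        ("20240103_000000", "c"),
    ],
)

latest_seen = "20240102_000000"

# '>=' returns the result we already processed as well as the new one.
inclusive = conn.execute(
    "SELECT run_id FROM result WHERE run_id >= ? ORDER BY run_id",
    (latest_seen,),
).fetchall()

# '>' returns only results strictly newer than the last one seen.
exclusive = conn.execute(
    "SELECT run_id FROM result WHERE run_id > ? ORDER BY run_id",
    (latest_seen,),
).fetchall()
```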
Logging all force-reset-test hints for every package accounted for
about 850 MB of the 880 MB of logs in the last run. Let's only log the
ones matching the package instead, as we do for
force-badtest.
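
Roughly, the change amounts to filtering before logging. The Hint tuple
and the function name below are hypothetical and only illustrate the
idea:

```python
import logging
from collections import namedtuple

# Hypothetical, simplified hint record.
Hint = namedtuple("Hint", "type package version")


def log_matching_hints(hints, package, logger):
    """Log only the hints that apply to `package` and return them."""
    matching = [h for h in hints if h.package == package]
    for h in matching:
        logger.info("Checking %s hint for %s/%s", h.type, h.package, h.version)
    return matching
```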
In Ubuntu, we only fetch results on demand, so we might not
have seen the results yet.
Debian always fetches results at the beginning so has all the
data ready.
This check has been present for a long time but there is no reason for it:
there is code elsewhere that explicitly checks for both options being set
together and does the right thing. Removing it saves a minute on each
britney run by not regenerating uninstallability information that was
just generated.
Due to the number of hints in standing use in Ubuntu, hints.search() is an
expensive operation, and we call it once for *every single test* referenced
from -proposed. Since force-reset-test hints are a small proportion of the
hints in use, searching once for all the hints of this type and then
searching only this subset for each autopkgtest improves performance. With
23000 autopkgtests referenced in -proposed, this saves roughly 1 minute of
runtime, or 11% on a 9-minute britney run. The number of packages in
-proposed is typically much higher at other points in the release cycle,
so the absolute improvement in performance is expected to be greater.
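
The shape of the optimisation, sketched with a simplified Hint record
and made-up data rather than the real hints API:

```python
from collections import namedtuple

# Hypothetical, simplified hint record.
Hint = namedtuple("Hint", "type package version")

all_hints = [
    Hint("force-badtest", "foo", "1.0"),
    Hint("force-reset-test", "foo", "1.1"),
    Hint("unblock", "bar", "2.0"),
    Hint("force-reset-test", "baz", "3.0"),
]

# One pass over the full collection, done once per run.
reset_hints = [h for h in all_hints if h.type == "force-reset-test"]


def reset_hints_for(package):
    """Per-test lookup that scans only the small pre-filtered subset."""
    return [h for h in reset_hints if h.package == package]
```

Each per-test lookup then costs time proportional to the number of
force-reset-test hints rather than to the size of the whole collection.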
The force-reset-test hints are an Ubuntu delta, so this is not expected to
be upstreamed, and it could eventually be dropped if and when baseline
retesting is implemented in Ubuntu and the number of hints required drops.
This could be implemented with a more generic, elegant solution in
HintsCollection, but again, the scalability problem of hints is hopefully
short-lived so I didn't consider it worth the investment here.