This avoids endlessly requeuing the test if the test produces
an older result.
This will make tests "disappear" if the infrastructure returns
old results for newer triggers, but it avoids the current problem
where we end up queuing the same tests on every run.
Tim assures me the auth token in the name is read-only, and is also exposed
via http redirects whenever one accesses the bucket via the autopkgtest.u.c
frontend, so there is no security issue here; and accessing it directly gets
us results even when the autopkgtest.db is out of date (which is the problem
we have right now that we want to route around).
The autohinter is currently hitting the default Python stack limit, so we
should try raising it. This is the intended britney behavior, and the system
is here primarily to run proposed-migration, so we should not be constrained
by the default.
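A minimal sketch of raising the limit, assuming the limit in question is the
interpreter's recursion limit (the text above only says "stack limit"). The
helper name and the value 20000 are hypothetical and not taken from britney.

```python
import sys

# Hypothetical value; the commit message does not say what limit is needed.
ASSUMED_RECURSION_LIMIT = 20000


def raise_recursion_limit(new_limit=ASSUMED_RECURSION_LIMIT):
    """Raise the interpreter's recursion limit, never lowering it.

    Returns the limit in effect after the call.
    """
    current = sys.getrecursionlimit()
    if new_limit > current:
        sys.setrecursionlimit(new_limit)
    return sys.getrecursionlimit()
```

Note that a limit higher than the platform's C stack can support may crash
the interpreter instead of raising RecursionError, so the value would need to
be chosen with the host in mind.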
For rolling out britney on a new machine, we want to generate update_excuses
and update_output to confirm it's working correctly all the way through, so
we don't want to use the global --dry-run option; but we *do* want to
disable queuing tests and instead let the production instance of britney
queue the tests while we simply query the results. Add support for
ADT_ENABLE=dry-run in britney.conf, parallelling the behavior of other
policies.
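As a rough illustration of the three-way setting described above, the sketch
below separates "queue tests" from "query results". The accepted values "yes"
and "no", the enum, and the helper names are assumptions for illustration;
only "dry-run" is named in the text above.

```python
from enum import Enum


class AdtMode(Enum):
    # "yes" and "no" are assumed spellings; "dry-run" comes from the text.
    OFF = "no"
    ON = "yes"
    DRY_RUN = "dry-run"


def parse_adt_enable(value):
    """Map a raw ADT_ENABLE string from the config to an AdtMode."""
    normalized = value.strip().lower()
    for mode in AdtMode:
        if mode.value == normalized:
            return mode
    raise ValueError("unsupported ADT_ENABLE value: %r" % value)


def should_queue_tests(mode):
    # Only a fully enabled instance submits new test requests.
    return mode is AdtMode.ON


def should_fetch_results(mode):
    # A dry-run instance still reads results queued by production.
    return mode in (AdtMode.ON, AdtMode.DRY_RUN)
```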
britney currently spends a majority of its runtime querying for baseline
test results that it won't find, and that it doesn't need. Refactor to
eliminate many of these excess queries.
The initial db population for the series takes quite a while, so to not block
on this for the release opening process we can let britney talk directly to
swift in the short term.
@canonical.com is now DKIM signed and SPF published, which means emails
sent directly from proposed-migration running on snakefruit would
likely be caught out. While we're here: the project is Ubuntu related,
so switch to using an @ubuntu.com address instead.
When querying swift there is no way to take only results newer than a
specified point; you can only query for newer than or equal to. But for
sqlite we can absolutely use > instead of >= and avoid re-processing results
we've already seen.
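To illustrate the difference, here is a small self-contained sketch using an
in-memory database. The table name, column names, and run id format are
hypothetical and do not reflect the real autopkgtest.db schema.

```python
import sqlite3


def results_newer_than(conn, last_seen_run_id):
    """Return only rows strictly newer than the last run id already seen."""
    cur = conn.execute(
        "SELECT run_id, package, exitcode FROM result "
        "WHERE run_id > ? ORDER BY run_id",
        (last_seen_run_id,),
    )
    return cur.fetchall()


# Hypothetical schema and data, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE result (run_id TEXT, package TEXT, exitcode INTEGER)"
)
conn.executemany(
    "INSERT INTO result VALUES (?, ?, ?)",
    [
        ("20240101_100000", "foo", 0),
        ("20240101_110000", "bar", 4),
        ("20240101_120000", "baz", 0),
    ],
)

# With ">" the row for "bar", which was already processed, is not returned.
new_results = results_newer_than(conn, "20240101_110000")
```

With >= the same call would return the "bar" row again on every run.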
Logging all force-reset-test hints for every package accounted for
about 850 MB of the 880 MB of logs produced by the last run.
Let's only log the ones matching the package instead, as we do for
force-badtest.
In Ubuntu, we only fetch results on demand, so we might not
have seen the results yet.
Debian always fetches results at the beginning so has all the
data ready.
This check has been present for a long time but there is no reason for it -
there is code elsewhere that explicitly checks for both options being set
together and does the right thing. Not regenerating uninstallability
information that was just generated also saves a minute on each britney run.
Due to the number of hints in standing use in Ubuntu, hints.search() is an
expensive operation, and we call it once for *every single test* referenced
from -proposed. Since force-reset-test hints are a small proportion of the
hints in use, searching once for all the hints of this type and only
searching this subset for each autopkgtest improves performance. With 23000
autopkgtests referenced in -proposed, this saves roughly 1 minute of
runtime, or 11% on a 9-minute britney run; the number of packages in
-proposed is typically much higher at other points in the release cycle,
so the absolute improvement in performance is expected to be greater.
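The idea can be sketched as follows. The Hint tuple and both helper names
are hypothetical stand-ins; this is not britney's HintsCollection API, only
an illustration of filtering by type once and then searching the subset.

```python
from collections import namedtuple

# Hypothetical stand-in for a hint object.
Hint = namedtuple("Hint", ["type", "package", "version"])


def select_hints_of_type(all_hints, hint_type):
    """One pass over the full hint set, done once per run."""
    return [h for h in all_hints if h.type == hint_type]


def hints_for_package(subset, package):
    """Per-test lookup that only scans the pre-filtered subset."""
    return [h for h in subset if h.package == package]


all_hints = [
    Hint("force-badtest", "foo", "1.0"),
    Hint("force-reset-test", "foo", "1.1"),
    Hint("unblock", "bar", "2.0"),
    Hint("force-reset-test", "baz", "3.0"),
]

# Filter once up front...
reset_hints = select_hints_of_type(all_hints, "force-reset-test")

# ...then each test only searches the small subset.
foo_resets = hints_for_package(reset_hints, "foo")
```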
The force-reset-test hints are an Ubuntu delta so this is not expected to be
upstreamed; and it could eventually be dropped if and when baseline
retesting is implemented in Ubuntu and the number of hints required drops.
This could be implemented with a more generic, elegant solution in
HintsCollection, but again, the scalability problem of hints is hopefully
short-lived so I didn't consider it worth the investment here.