Add new state "IGNORE-FAIL" for regressions which have a 'force' or
'force-badtest' hint. In the HTML, show them as yellow "Ignored failure"
(without a retry link) instead of "Regression", and drop the separate
"Should wait for ..." reason, as that is hard to read for packages with a long
list of tests.
This also makes retry-autopkgtest-regressions more useful as this will now only
run the "real" regressions.
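
The state selection described above can be sketched as follows. This is a minimal illustration, not the actual britney code: the helper name classify_result, the representation of hints as a set of hint type names, and the state names other than "IGNORE-FAIL" are assumptions.

```python
def classify_result(passed, ever_passed, hint_types):
    """Return a display state for one test result.

    hint_types is a set of hint type names that apply to the tested
    package (hypothetical representation, for illustration only).
    """
    if passed:
        return "PASS"
    if not ever_passed:
        # The test never succeeded, so a failure is not a regression.
        return "ALWAYSFAIL"
    # A regression: downgrade it if a force or force-badtest hint applies.
    if hint_types & {"force", "force-badtest"}:
        return "IGNORE-FAIL"
    return "REGRESSION"
```

With this split, a tool that retries regressions only needs to select results in the "REGRESSION" state.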
This was a giant copy&paste, was disabled four months ago, and the
infrastructure for this ceased to exist.
If this comes back, the AutoPackageTest class should be generalized to also
issue phone boot tests (exposed as new architectures, which should then be
called "platforms"), to avoid all this duplicated code.
Generate https://autopkgtest.ubuntu.com/retry.cgi links for re-running tests
that regressed.
Change Excuse.html() back to usual % string formatting to be consistent with
the rest of the code.
If we have a result, directly link to the log file on swift in excuses.html.
The architecture name still leads to the package history as before.
If result is still pending, link to the "running tests" page instead.
Traceback (most recent call last):
  File "/home/ubuntu-archive/proposed-migration/code/b2/britney.py", line 3380, in <module>
    Britney().main()
  File "/home/ubuntu-archive/proposed-migration/code/b2/britney.py", line 3329, in main
    self.write_excuses()
  File "/home/ubuntu-archive/proposed-migration/code/b2/britney.py", line 1992, in write_excuses
    upgrade_me.remove(e.name)
ValueError: list.remove(x): x not in list
Splitting up the processes of request(), submit(), and collect() makes our data
structures, housekeeping, and code unnecessarily complicated. Drop the latter
two and now do all of it in just request(). This avoids having to have a
separate requested_test map, having to fetch test results twice, and gets rid
of some state keeping.
For using britney on PPAs we need to add the "ppas" test parameter to AMQP
autopkgtest requests. Add ADT_PPAS britney.conf option which gets passed
through to test requests.
We want to treat linux-$flavor and linux-meta-$flavor as one set in britney
which goes in together or not at all. We never want to promote linux-$flavor
without the accompanying linux-meta-$flavor.
Introduce a synthetic linux* → linux-meta* dependency to enforce this grouping.
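
The name mapping behind such a synthetic dependency might look like the sketch below. The function name, the known_sources parameter, and the rule of only returning a meta package that actually exists are assumptions made for illustration; the real implementation may differ.

```python
def synthetic_meta_dependency(source, known_sources):
    """Return the linux-meta* source that a linux* source should migrate
    with, or None if no such grouping applies.

    known_sources is a set of source package names (hypothetical input).
    """
    if source.startswith("linux-meta"):
        # The meta package itself gets no synthetic dependency here.
        return None
    if source == "linux":
        candidate = "linux-meta"
    elif source.startswith("linux-"):
        # linux-$flavor -> linux-meta-$flavor
        candidate = "linux-meta-" + source[len("linux-"):]
    else:
        return None
    # Only group with a meta package that actually exists.
    return candidate if candidate in known_sources else None
```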
When we need to blow away and rebuild results.cache we want to avoid
re-triggering all tests. Thus collect already existing results for requested
tests before submitting new requests.
This is rather hackish now, as fetch_one_result() now has to deal with both
self.requested_tests and self.pending_tests. The code should be refactored to
eliminate one of these maps.
Add Excuse.addtest() for adding a test type/package/arch/result, so that the
excuses YAML will get structured test results instead of pre-formatted HTML.
Move the HTML rendering into Excuse.html() instead.
This supports a "test type" whose only value is "autopkgtest" right now, but
we will have "boottest", perhaps "piuparts" and other tests in the future.
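
A rough sketch of the split between structured test results and HTML rendering described above. The class name ExcuseSketch, the nested-dict layout, and the exact HTML output are illustrative assumptions, not the real Excuse class.

```python
from collections import defaultdict
from html import escape


class ExcuseSketch:
    """Stores test results as data; HTML is produced only on demand."""

    def __init__(self):
        # test type -> package -> architecture -> result
        self.tests = defaultdict(lambda: defaultdict(dict))

    def addtest(self, testtype, package, arch, result):
        """Record one result for a test type/package/architecture."""
        self.tests[testtype][package][arch] = result

    def html(self):
        """Render one list item per test type and package."""
        lines = []
        for testtype in sorted(self.tests):
            for package in sorted(self.tests[testtype]):
                archs = self.tests[testtype][package]
                results = ", ".join(
                    "%s: %s" % (escape(a), escape(archs[a]))
                    for a in sorted(archs))
                lines.append("<li>%s for %s: %s"
                             % (escape(testtype), escape(package), results))
        return "\n".join(lines)
```

Because the results stay in a plain nested mapping, a YAML writer can dump them directly, while html() remains the only place that knows about markup.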
Drop the "(<ver> is unbuilt/uninstallable)" note from excuses.html as this is
really a per-architecture property, not a per-tested-source one. This needs to
be re-thought and generalized.
Don't close underlying fd right after opening an apt_pkg.TagFile, as that will
prematurely end the iteration. This seems to work with more recent python3-apt,
but not with Ubuntu 12.04 LTS.
Commit 463 ("Don't promote packages with unbuilt reverse dependencies") turned
out to be too strict: it holds up too many innocent packages in -proposed.
If unstable has an unbuilt/uninstallable reverse dependency D of a package P,
trigger a test anyway (which will then most likely run against the testing
version of D). If that succeeds, the unstable P did not break D and can be
accepted. If it fails, D needs to be fixed.
Ideally we would set up some clever apt pinning to force installation of
testing-D, to avoid running into the uninstallability of unstable-D, but this
is tricky and error prone.
Drop the temporary "UNINST" state from commit 466 again. Instead, excuses.html
will now show a test against the testing version of D together with a note that
the unstable version is unbuilt/uninstallable.
This should ideally clear up all cases where a requested result is neither
present nor pending. Log an error if that still happens (will be checked in the
next couple of runs), and ensure in the tests that we don't trigger any
outstanding "FIXME" log messages.
Change AutoPackageTest.results() to evaluate the Swift results instead of the
adt-britney ones.
TODO:
- Add more tests (like for adt-britney)
- Drop triggering of adt-britney tests
- Drop adt-britney tests (which fail now)
- Adjust TestBoottestEnd2End.test_with_adt
When collecting results, not only check pending tests, but also new results for
failed tests. This picks up new test results from manual retries which might
now have succeeded.
Until now, autopkgtest results were triggered via an external "adt-britney"
command from lp:auto-package-testing. This required a lot of state files and
duplicated effort, uses hardcoded absolute paths to these external tools, and
is quite hard to understand and maintain. We also want to move away from
Jenkins and rsyncing state files.
Directly retrieve autopkgtest results from a publicly readable and browsable
Swift container, with a debci-compatible layout
(https://wiki.debian.org/debci/DistributedSpec). This now tracks both requests
and results on a per-architecture granularity, so that we can track
per-architecture regressions/always-failed.
Introduce a new ADT_SWIFT_URL config option that sets the swift base URL. If
this key is not set, the behaviour does not change compared to previous
versions, and no results will be retrieved from the cloud.
This still keeps the old adt-britney requests/results as the authoritative
data and for now merely shows the swift results in addition. With that we can
compare the results and run the cloud testing in parallel to find/fix problems
until we switch over. Due to that, the code added to britney.py is temporary, does
*not* use AutoPackageTest.results(), and instead just reads the internal
results map.
Extend read_sources to store a new AUTOPKGTEST boolean flag, which is true if
the Testsuite: field exists and starts with "autopkgtest" (this covers autodep8
cases like autopkgtest-pkg-perl).
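
The flag computation described above amounts to a prefix check. A minimal sketch, assuming the source stanza is available as a plain dict of field names to values (the helper name has_autopkgtest is hypothetical):

```python
def has_autopkgtest(fields):
    """Return True if the Testsuite: field exists and starts with
    "autopkgtest" (which also matches e.g. autopkgtest-pkg-perl)."""
    testsuite = fields.get("Testsuite", "")
    return testsuite.startswith("autopkgtest")
```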
Extend TestData.add() to take a new testsuite argument which specifies the
source's Testsuite: field.
Move nuninst cloning out of the check loop and always populate the new
nuninst entirely.
This will allow some simplifications in other places.
Signed-off-by: Niels Thykier <niels@thykier.net>
Use a set to filter out seen items to avoid doing O(n^2)
de-duplication. For very large hints, this can take considerable
time.
Use "seen_items" to build the actual hints, on the (unverified)
assumption that Python can do something "smart" to turn a set into a
frozenset faster than it can with a list.
Signed-off-by: Niels Thykier <niels@thykier.net>
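
For illustration, the set-based filtering from the commit above can be sketched like this. The function name and the return shape are assumptions; only the "seen_items" name comes from the commit message.

```python
def dedup_items(items):
    """Drop duplicate items in a single pass, keeping first-seen order.

    Membership tests against a set are O(1) on average, so the whole
    pass is O(n) instead of the O(n^2) of repeated list scans.
    """
    seen_items = set()
    unique = []
    for item in items:
        if item in seen_items:
            continue
        seen_items.add(item)
        unique.append(item)
    # The set can be frozen directly to serve as the hint's item collection.
    return unique, frozenset(seen_items)
```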
Britney is now smart enough to produce the same result from hints
regardless of the order of the items in the hint. With this in mind,
we can have the original auto-hinter produce hints as sets and filter
out duplicates as we produce them.
Note that the hints are sorted to produce deterministic output (to
make it easier to compare the hints between runs and changes).
Signed-off-by: Niels Thykier <niels@thykier.net>
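
The idea from the commit above, hints as order-independent sets with sorted output, might be sketched as follows. The function name and the use of plain strings as hint items are assumptions for illustration.

```python
def normalize_hints(candidate_hints):
    """Collapse hints that contain the same items in a different order,
    then sort both the items and the hints for deterministic output."""
    # frozenset makes each hint hashable and order-independent.
    unique = {frozenset(hint) for hint in candidate_hints}
    return sorted(sorted(hint) for hint in unique)
```

Sorting costs a little extra time but keeps the output stable, which makes it easier to diff hints between runs.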
Avoid some cases of O(n^2) behaviour in sort_actions and reduce the
size of n for the remaining O(n^2)-ish behaviour by filtering out
removals early on.
Signed-off-by: Niels Thykier <niels@thykier.net>