Late last year, I started to research Outreachy and read a plethora of blogs by alumni to see if it was something I could do. It can be intimidating reading about all of these technologies you don’t understand yet. But have no fear: I am going to break down my project, and my tiny corner of the Mozilla codebase, for anyone new to programming. My project is titled “Add support for multiple allowed statuses per test to mozlog & Gecko CI”. In a nutshell, I am working on improving the way we handle web-platform-tests that fail only some of the time, but regularly, even when no bug has been introduced.
Mozilla runs a suite of roughly a million tests for every change pushed to its code base, the mozilla-central repository. This extensive testing is done to ensure that when a change is merged, no regressions are introduced. A regression could be anything that breaks the existing code in the repository. In a perfect world, tests are designed to be deterministic, or without randomness; if everything in the code being tested worked, then all of these tests would pass. However, the modern web browser is complex, with lots of moving pieces. It is a much bigger task to test an entire web browser, end-to-end, as there are many external factors and pieces that need to line up at just the right time. Combine this with the sheer enormity of the test suite and some tests are likely to fail even when no bug has been introduced. They can fail for any number of reasons: they were run in a different order, the environment changed, the test timed out, the many moving pieces did not align at the right time, and so on. Tests that fail like this, randomly but regularly, are known as intermittent tests.
Currently, because randomly passing or failing tests provide no valuable information and can make the view of the test results messy, intermittent tests are disabled or skipped if they cannot be fixed. This means that we lose test coverage over that portion of the code base, risking the introduction of regressions. Right now, the only solution is to disable these tests, as there is no way to record whether a status is intermittent or not. My project aims to add support for more than one allowed test status to Mozlog and Gecko CI by recording whether a test status is a “known intermittent” in the test metadata. Metadata is a record of the result each test is expected to produce. This would mean a test could have any number of allowed statuses. For example, a test could be expected to PASS or intermittently FAIL, and either result would be seen as expected. With this support, a test that is intermittent can still be run and we maintain test coverage.
Mozlog is a Python library Mozilla maintains and uses to record test results. It is available on PyPI and you can find more detailed information in the documentation here: https://firefox-source-docs.mozilla.org/mozbase/mozlog.html. For every test, information is logged about that test. Each log entry, or “message”, is a JSON-compatible object that follows a specific format. Mozlog also has options to convert those logs and messages into different formats such as HTML, TBPL, and Mach. As part of my project, my first step was to add a new field to the “test_status” (subtest results) and “test_end” (test results) messages. There was already a “status” field, representing the actual test result (e.g. PASS or FAIL), and an “expected” field, representing the expected result. My aim was to add a “known_intermittent” field that lists the expected intermittent results, if any. This was added to the log itself, as well as to each of the optional formatters that the user may use to display the log.
# Mozlog before
# Mozlog after
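As a rough illustration of the change, here is a sketch of a “test_status” message before and after, built as plain Python dictionaries rather than through Mozlog itself. The test name, subtest name, and values are made up; only the “status”, “expected”, and “known_intermittent” field names come from the description above, and the “action” key reflects my understanding of how Mozlog labels the message type, so treat it as an assumption.

```python
import json

# Hypothetical subtest result as it might be logged before the change.
# Only "status" and "expected" describe the outcome.
message_before = {
    "action": "test_status",        # assumed name of the message-type field
    "test": "/example/test1.html",  # made-up test id
    "subtest": "first subtest",     # made-up subtest name
    "status": "FAIL",
    "expected": "PASS",
}

# The same result after the change: "known_intermittent" lists the
# statuses that are allowed to show up occasionally.
message_after = dict(message_before, known_intermittent=["FAIL"])

print(json.dumps(message_before, indent=2))
print(json.dumps(message_after, indent=2))
```

Because the new field is just a list of status strings, the message stays JSON-compatible and a formatter can simply check whether the list is present.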
My next step is to integrate this “known_intermittent” capability with the web-platform-tests harness. This is what I am currently working on.
A test harness is a tool that is used to run tests. Mozilla has a “wptrunner” test harness in its central repository that automatically runs these web-platform-tests for every change that is merged. I am adding the ability to recognise and record “known_intermittent” test statuses to this test harness, ensuring that we can continue to run these very important tests even when they might be intermittent. Expected test results are already stored in test metadata files, and my addition will allow known intermittent statuses to be stored and accessed alongside expected statuses. In the example below, the expected status would be the first listed (PASS) and the expected intermittent statuses would follow (FAIL).
# Metadata before
[test1]
expected: PASS

# Metadata after
[test1]
expected: [PASS, FAIL]
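To make the “first status is expected, the rest are known intermittents” rule concrete, here is a small sketch in Python. The function name split_expected is my own invention for illustration and is not part of wptrunner; it assumes the value of an “expected” entry has already been parsed into either a string or a list.

```python
def split_expected(value):
    """Split an "expected" metadata value into (expected, known_intermittent).

    Hypothetical helper, not wptrunner code. `value` is either a single
    status string such as "PASS" or a non-empty list such as ["PASS", "FAIL"].
    """
    if isinstance(value, str):
        # Old-style metadata: one expected status, no intermittents.
        return value, []
    if not value:
        raise ValueError("expected list must contain at least one status")
    # New-style metadata: first entry is expected, the rest are intermittent.
    return value[0], list(value[1:])


print(split_expected("PASS"))            # ('PASS', [])
print(split_expected(["PASS", "FAIL"]))  # ('PASS', ['FAIL'])
```

Handling the plain string case as well means existing metadata files, which list a single status, would keep working unchanged.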
What would that look like? A developer runs a suite of tests and notices one is failing. After looking deeper into the test log, they realize it is an intermittent. The developer could then use this log to update the metadata, recording this intermittent as another expected status in the above “expected” list. The next time this test is run and fails, it will be shown as an expected intermittent status rather than a failure.
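The check a harness would need in that scenario could look something like this sketch. The function is_expected_result is a made-up name, not something from Mozlog or wptrunner, and it only captures the idea that a result counts as expected if it matches either the expected status or one of the known intermittents.

```python
def is_expected_result(status, expected, known_intermittent=()):
    """Return True if `status` is the expected status or a known intermittent.

    Hypothetical helper for illustration only.
    """
    return status == expected or status in known_intermittent


# Before the metadata update, an intermittent FAIL is unexpected.
print(is_expected_result("FAIL", "PASS"))               # False
# After recording FAIL as a known intermittent, it is treated as expected.
print(is_expected_result("FAIL", "PASS", ["FAIL"]))     # True
# Any other status, such as TIMEOUT, is still flagged.
print(is_expected_result("TIMEOUT", "PASS", ["FAIL"]))  # False
```

The important part is the last case: allowing FAIL as an intermittent does not hide other, genuinely unexpected results.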
If successful, this may be an ability that the shared Web Platform Tests project could benefit from, as well as other Mozilla test suites. If time permits, I hope to add a further method to track intermittent statuses over time, by accessing the Active Data warehouse of test information. With this data, it could be possible to track and determine if a test is no longer intermittent.
Hopefully, in not-so-complicated terminology, this has shed a little light on what an Outreachy project with Mozilla could look like. Mine is one of many, and each has different skills and challenges to explore. Reading intern blogs is a great way to discover what sort of project might interest you! If you have any questions, please don’t hesitate to reach out!