https://www.stefan-winter.net/presentations/flaky_tests_coping.html
Flaky tests: Non-deterministic failures of automated regression tests in continuous integration pipelines that are not caused by software regressions.
Comprehensive root cause analyses (Luo et al. 2014; Gruber et al. 2021; Hashemi, Tahir, and Rasheed 2022)
libkdumpfile, excerpt from:
https://github.com/ptesarik/libkdumpfile/blob/c54a90c2756e0ca7f9b45662ad3c987403ee7360/tests/xlatmap-check
libkdumpfile, excerpt from accepted fix:
https://github.com/ptesarik/libkdumpfile/blob/e6c5fde6ac7201185292539bef7203c9618ac773/tests/xlatmap-check
If a flaky test is detected: proceed with integration (no regression) and skip execution of the test in the future
Result: 5 flaky tests in the study reveal regressions in the commit history
→ Quarantining tests diminishes test suite power
→ Better approaches than “flag + skip” desirable
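The "flag + skip" policy above can be illustrated with a minimal sketch. It assumes flakiness is detected by rerunning a failed test and seeing it pass at least once; the slides do not specify the detection method, and the names `classify_failure`, `gate`, and `quarantine` are hypothetical.

```python
from typing import Callable, Set


def classify_failure(test: Callable[[], bool], reruns: int = 3) -> str:
    """Rerun a failed test; a pass on any rerun is taken as a sign of flakiness.

    Assumption: rerun-based detection. The source does not name a method.
    """
    for _ in range(reruns):
        if test():
            return "flaky"
    return "regression"


def gate(name: str, test: Callable[[], bool], quarantine: Set[str],
         reruns: int = 3) -> bool:
    """Return True if integration may proceed for this test."""
    if name in quarantine:
        return True   # quarantined: the test is skipped, not executed
    if test():
        return True   # passed on the first run
    if classify_failure(test, reruns) == "flaky":
        quarantine.add(name)  # flag + skip in all future runs
        return True           # treated as "no regression"
    return False              # consistent failure blocks integration


# A test that fails twice and then passes is flagged and quarantined.
outcomes = iter([False, False, True])
quarantine: Set[str] = set()
proceed = gate("xlatmap", lambda: next(outcomes), quarantine)
```

Once a test is in `quarantine`, `gate` returns `True` without running it, so a later genuine regression in that test goes unnoticed. This is the loss of test suite power noted above.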
Source: (Luo et al. 2014)
Source: (Gruber et al. 2021)
libkdumpfile has 184 tests
Insight: No shared resource access → no order dependency
Idea: Run every test once and record access rights on files, sockets, …
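A minimal sketch of this idea, assuming each test's recorded accesses are available as a mapping from resource name to access modes. The record format, the names `may_interfere` and `candidate_pairs`, and the rule that two tests can only interfere if at least one of them writes a shared resource are assumptions for illustration, not taken from the slides.

```python
from itertools import permutations
from typing import Dict, List, Set, Tuple

# Hypothetical record format: resource name -> set of access modes ("r", "w").
Access = Dict[str, Set[str]]


def may_interfere(a: Access, b: Access) -> bool:
    """True if both tests touch a common resource and at least one writes it."""
    for resource in a.keys() & b.keys():
        if "w" in a[resource] or "w" in b[resource]:
            return True
    return False


def candidate_pairs(records: Dict[str, Access]) -> List[Tuple[str, str]]:
    """Ordered test pairs that still need to be run to check order dependency."""
    return [(x, y) for x, y in permutations(sorted(records), 2)
            if may_interfere(records[x], records[y])]


# Illustrative records from a single run of each test (made-up data).
records = {
    "t1": {"/tmp/a": {"w"}},
    "t2": {"/tmp/a": {"r"}},
    "t3": {"/tmp/b": {"r"}},
    "t4": {"/tmp/b": {"r"}},
}
pairs = candidate_pairs(records)
```

In this made-up example only `t1` and `t2` share a resource that one of them writes, so 2 of the 12 ordered pairs remain to be executed.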
libkdumpfile: 33,672 test pair runs and > 2h → 4 test pair runs and < 1s
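The 33,672 figure is consistent with running every ordered pair of distinct tests once. The slides do not state how the baseline was counted, so the following check treats that as an assumption.

```python
# Assumption: the exhaustive baseline runs each ordered pair (a, b), a != b, once.
n_tests = 184                            # libkdumpfile test count from the slides
ordered_pairs = n_tests * (n_tests - 1)  # 184 * 183
print(ordered_pairs)                     # 33672
```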
Flaky tests threaten regression testing.
Coping strategies:
Research focus: Software Dependability, Software Testing, Reproducibility