
Flaky Tests: How to Get Rid of Them!

By Danny Paradis, Solution Architect specializing in UI Automation

May 16, 2024


Recently, while on assignment with one of my clients, I had to deal with unstable tests in the continuous integration and continuous delivery (CI/CD) process. "Flaky tests" are tests that sometimes pass and sometimes fail without any change to the code, which makes the test pipeline unreliable. I invite you to take the time to consider three questions: why it is important to manage flaky tests, which strategies help manage them, and what their most common root causes are.

 

Why is it important to manage flaky tests?

  • Wasted time: Flaky tests often require unnecessary re-runs, wasting resources and the valuable time of technical analysts.

  • Increased delivery stress: The unpredictability of flaky tests causes stress and leads to a loss of interest among team members, affecting morale and productivity.

  • Loss of confidence in your test pipeline: Unreliable results, whether false positives or false negatives, erode confidence in the testing infrastructure. Developers can no longer rely on the results and may start ignoring failures, which can let real defects reach production.


What are the strategies for managing flaky tests?

  • Implement monitoring: Set up monitoring that can produce a report on flaky tests and their reliability percentage. This report will help you quickly track the improvement of your test suite execution and prioritize fixing problematic tests.

  • Run your tests as often as possible: The more runs you collect, the larger your sample, and the easier it becomes to tell an intermittent failure from a consistent one.

  • Implement temporary retries: Re-running a test when it fails can be an effective way to handle intermittent errors. In the short term, it keeps misleading failures out of your test reports. Beware, however: a retry is only a band-aid that hides the problem. You have circumvented the issue, not fixed it, so keep working to eliminate the flaky tests.

  • Disable parallel test execution: Running the suite serially, at least temporarily, helps reveal whether failures come from tests interfering with one another or competing for the same environment resources.

  • Limit the impact of external components on the tested code: Avoid depending on live external components, such as a third-party credit card system that may be undergoing maintenance while your tests run. The goal is to ensure that tests focus solely on the behavior of the code being tested, without interference from external factors.

  • Identify root causes: To resolve the issue, it is crucial to understand why the tests fail. Common reasons include unstable environments, test infrastructure problems, race conditions, external dependencies, and actual bugs!

  • Refactor test code: Review and refactor your test code to improve stability, reduce false positives, and make your tests more robust and efficient.

  • Regularly update your test infrastructure: Regular maintenance and updates can ensure that your test infrastructure remains stable, reliable, and free of known issues that can contribute to instability.
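
To make the monitoring idea above more concrete, here is a minimal sketch of a reliability report in Python. It assumes your CI can export its history as simple (test name, passed) records. The function name reliability_report, the sample history, and the rule "flaky means it has both passed and failed in the sample" are my own assumptions for illustration, not a standard.

```python
from collections import defaultdict


def reliability_report(runs):
    """Summarize pass rates from (test_name, passed) records.

    Returns (test_name, pass_percentage, is_flaky) tuples, least
    reliable first. A test is counted as flaky here when it has both
    passed and failed at least once in the sample (an assumption).
    """
    totals = defaultdict(lambda: [0, 0])  # name -> [passes, total runs]
    for name, passed in runs:
        totals[name][1] += 1
        if passed:
            totals[name][0] += 1

    report = []
    for name, (passes, count) in totals.items():
        pct = round(100.0 * passes / count, 1)
        report.append((name, pct, 0 < passes < count))
    # Sort by pass percentage, then by name for a stable order.
    return sorted(report, key=lambda row: (row[1], row[0]))


# Hypothetical history; in practice this would come from your CI results.
history = [
    ("test_login", True), ("test_login", True), ("test_login", True),
    ("test_checkout", True), ("test_checkout", False),
    ("test_checkout", True), ("test_checkout", False),
    ("test_export", False), ("test_export", False),
]

for name, pct, flaky in reliability_report(history):
    print(f"{name}: {pct}% passed{' (flaky)' if flaky else ''}")
```

In this sample, test_export always fails, so it is a consistent failure rather than a flaky test, while test_checkout is the one to prioritize as flaky.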


Can we identify the most common root causes of flaky tests?

  • Random tests: Using random data for your tests guarantees only one thing: flaky tests. A test should always produce the same result to validate expected behavior.

  • Page synchronization: A common issue arises when the test tries to click a button that has not yet appeared on the page. This problem is often encountered with Selenium when explicit waits are missing. Newer tools like Playwright reduce it by automatically waiting for elements to be ready before acting on them.

  • Time validation: Tests that depend on dates or times need to be clear about what they are truly validating. For example, the agents running the tests might not be in the same time zone as the application, or your infrastructure might be spread across several locations.

  • Parallelism: Running tests in parallel is very useful for speeding up the execution of your test suite but can easily introduce flaky test results. For example, a test expecting to find 10 items in a list might be affected by another test that removes one. Ensure that your tests are all independent of each other concerning data.

  • Data sets: A data set used by multiple tests requires careful planning of its use to avoid false positives introduced by the random order of test execution.

  • Environments: Stay in touch with your infrastructure team to be informed about activities that might affect your tests. It is not uncommon for nightly tests to be impacted by weekly backups.
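
On the random data point above: when varied data is still wanted, one option is to draw it from a generator with a fixed seed, so that every run sees exactly the same values. This is a minimal sketch. The make_order and order_total helpers and the seed value are hypothetical.

```python
import random


def make_order(rng):
    """Build a hypothetical order from the given random generator."""
    quantity = rng.randint(1, 5)
    unit_price = rng.choice([10, 25, 40])
    return {"quantity": quantity, "unit_price": unit_price}


def order_total(order):
    return order["quantity"] * order["unit_price"]


def test_order_total_is_reproducible():
    seed = 1234  # arbitrary fixed seed; log it so a failure can be replayed
    first = make_order(random.Random(seed))
    second = make_order(random.Random(seed))
    # The same seed yields the same data, so the test is deterministic.
    assert first == second
    assert order_total(first) == first["quantity"] * first["unit_price"]


test_order_total_is_reproducible()
```

Using a dedicated random.Random instance, rather than the module-level functions, keeps the test's data independent of whatever other code does with the global generator.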

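The page synchronization problem mentioned above usually comes down to acting before the page is ready. Tools handle this in their own way, but the underlying idea can be sketched as a simple polling helper. The wait_until function and the FakePage class below are hypothetical stand-ins, not part of Selenium or Playwright.

```python
import time


def wait_until(condition, timeout=5.0, interval=0.05):
    """Poll condition() until it returns a truthy value or timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)


# Hypothetical page whose button only "appears" after a short delay.
class FakePage:
    def __init__(self, ready_after):
        self._ready_at = time.monotonic() + ready_after

    def find_button(self):
        if time.monotonic() >= self._ready_at:
            return "submit-button"
        return None


page = FakePage(ready_after=0.2)
button = wait_until(page.find_button, timeout=2.0)
```

Waiting for a condition with a timeout is generally more stable than a fixed sleep, which is either too short on a slow agent or wastes time on a fast one.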

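For the time validation point above, one approach is to give the test a fixed, time-zone-aware "now" rather than reading the agent's wall clock. The sketch below assumes a hypothetical is_expired function. The date and the five-hour offset are arbitrary.

```python
from datetime import datetime, timedelta, timezone


def is_expired(expires_at, now):
    """Return True when the aware datetime expires_at is at or before now."""
    return expires_at <= now


def test_expiry_is_independent_of_agent_time_zone():
    # Fixed, time-zone-aware "now" instead of the agent's wall clock.
    now_utc = datetime(2024, 5, 16, 12, 0, tzinfo=timezone.utc)
    expires_at = now_utc + timedelta(hours=1)

    # The same instant as seen by an agent five hours behind UTC.
    agent_zone = timezone(timedelta(hours=-5))
    now_on_agent = now_utc.astimezone(agent_zone)

    assert now_on_agent == now_utc  # same instant, different offset
    assert not is_expired(expires_at, now_on_agent)
    assert is_expired(expires_at, now_on_agent + timedelta(hours=2))


test_expiry_is_independent_of_agent_time_zone()
```

Because both values carry their time zone, the comparison gives the same answer wherever the agent is located.
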
As you have probably understood, managing flaky tests is crucial for ensuring a stable and efficient test pipeline. By applying strategies such as identifying root causes, implementing retries, disabling parallelism, refactoring code, and regular monitoring, you can significantly reduce the occurrence of flaky tests, thereby improving the overall reliability of tests and the team's productivity.


Yes, it’s possible! Now, my client's 1000 tests run successfully every day. The flaky tests have disappeared, allowing us to confidently detect regressions. Feel free to share your thoughts and experiences in the comments below!


Ready to eliminate flaky tests from your CI/CD process? Find out how to optimize your test suite now to ensure flawless execution and gain reliability with the help of one of our experts!


 
