I once worked on a terrible project where every possible mistake was made. Still, I’m thankful it happened, because now I know what a project shouldn’t look like. One of the lessons I took from it is that you need proper test data management. Test data should be reliable and reusable; otherwise your tests will be flaky.
Independence from the test system
My view is that a test should be as independent of the test system as possible. A dependency on the test system is a dangerous one in a test automation project. We can’t get rid of it completely, but we can reduce the influence the test system has on the project. Ideally, every test in my project could be run against any system (test, staging or live) by changing only the environment URL.
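To make the "change only the environment URL" idea concrete, here is a minimal sketch in Python. The variable names `TEST_ENV` and `TEST_BASE_URL`, the environment names, and the example.com URLs are all assumptions made for illustration; your project will have its own.

```python
import os

# Hypothetical environments and URLs, for illustration only.
ENVIRONMENTS = {
    "test": "https://test.example.com",
    "staging": "https://staging.example.com",
    "live": "https://www.example.com",
}


def base_url(env=None):
    """Return the base URL of the system under test.

    TEST_BASE_URL (assumed name) wins if it is set; otherwise TEST_ENV
    (assumed name) selects a known environment, defaulting to "test".
    """
    env = os.environ if env is None else env
    explicit = env.get("TEST_BASE_URL")
    if explicit:
        return explicit.rstrip("/")
    name = env.get("TEST_ENV", "test")
    if name not in ENVIRONMENTS:
        raise ValueError(f"Unknown environment: {name!r}")
    return ENVIRONMENTS[name]


def url_for(path, env=None):
    """Build a full URL so that tests never hard-code a host."""
    return f"{base_url(env)}/{path.lstrip('/')}"
```

Tests would then call something like `url_for("/login")` everywhere, and switching from test to staging would only mean setting `TEST_ENV=staging` before the run.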
If your test system is isolated from every influence except your automation, you can use static data. What does that mean? Imagine that you have a test system that is fully under your control, and you are 100% sure that nobody will inject or remove data while your tests are executing. It would be great to also be able to reset the data to its initial state. In this setup you can add all the data your tests need once and keep using it. The advantage of this approach is simplicity: you add something once and reuse it an unlimited number of times, and you don’t have to think about generating data at run time. The disadvantage is that you can’t change the environment. If your test environment is broken, your automation is blocked. I was in such a situation, and it took us more than a month to prepare another test environment that was compatible with the data the tests expected.
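As a rough illustration of static data with a reset to the initial state, the sketch below uses an in-memory SQLite database from the Python standard library. The `products` table and the seed rows are hypothetical; a real system would have its own schema and most likely its own reset mechanism.

```python
import sqlite3

# Hypothetical static data set that the tests rely on.
SEED_PRODUCTS = [
    (1, "Keyboard", 49.0),
    (2, "Monitor", 199.0),
]


def reset_to_seed(conn):
    """Drop whatever is in the table and restore the known initial state."""
    with conn:  # commits on success, rolls back on error
        conn.execute("DROP TABLE IF EXISTS products")
        conn.execute(
            "CREATE TABLE products ("
            "id INTEGER PRIMARY KEY, name TEXT NOT NULL, price REAL NOT NULL)"
        )
        conn.executemany("INSERT INTO products VALUES (?, ?, ?)", SEED_PRODUCTS)


def product_names(conn):
    """Return product names ordered by id, as a test might read them."""
    rows = conn.execute("SELECT name FROM products ORDER BY id")
    return [row[0] for row in rows]
```

Calling `reset_to_seed(conn)` before a test run gives every run the same starting point, which is exactly what makes static data simple, and also what ties the tests to this one data set.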
Dynamic data is my favorite approach. It presumes that the data for your tests is created on the fly, while the tests are executing. You can create new data in the test system via API calls, database queries, or tasks that inject all the data you need. With this approach, tests do not depend on changes in the test system, and you can write tests the way you want without being afraid of breaking other tests. The advantages are flexibility and the wider variety of test ideas that become possible to automate. The disadvantages: you have to be able to execute API calls or DB queries; you have to implement all the methods that generate data at run time; and you have to figure out how to generate generic but valuable data.
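Here is a minimal sketch of what creating data on the fly could look like in Python. `FakeUserApi` is an in-memory stand-in I made up so the example runs on its own; in a real project it would be a client that calls your system's API or database. The field names and the `temporary_user` helper are likewise assumptions.

```python
import uuid
from contextlib import contextmanager


class FakeUserApi:
    """In-memory stand-in for a real API client (hypothetical)."""

    def __init__(self):
        self.users = {}

    def create_user(self, payload):
        user_id = str(uuid.uuid4())
        self.users[user_id] = dict(payload, id=user_id)
        return self.users[user_id]

    def delete_user(self, user_id):
        self.users.pop(user_id, None)


def new_user_payload(**overrides):
    """Generate generic but unique user data; overrides set what the test cares about."""
    suffix = uuid.uuid4().hex[:8]
    payload = {
        "name": f"user-{suffix}",
        "email": f"user-{suffix}@example.com",
        "role": "customer",
    }
    payload.update(overrides)
    return payload


@contextmanager
def temporary_user(api, **overrides):
    """Create a fresh user for one test and remove it afterwards."""
    user = api.create_user(new_user_payload(**overrides))
    try:
        yield user
    finally:
        api.delete_user(user["id"])
```

A test would wrap its steps in `with temporary_user(api, role="admin") as user:` and never rely on a user that somebody else created. The random suffix keeps parallel tests from colliding on names or emails.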
Another questionable aspect of the static data approach is the value of the tests. It seems to me that nothing can happen to data that has existed in your data set for years. I’ve seen such tests, and they never seem to break. By contrast, I would expect to discover bugs when a test creates something new every time and checks how that new data looks in the system.
Static or dynamic data
Which approach to pick is up to you. There are situations where only one of them is possible because of project, access, or people limitations. But if both are available, I would definitely choose dynamic data. Yes, making everything generate on the fly is a bigger technical task. But your tests will be more stable in the end, and the flexibility you gain is worth the investment.