Software systems today contain hundreds of thousands to millions of lines of code, written by anywhere from a few developers at a startup to thousands at today’s software giants. When many developers work on a large code base, their usage and modification of shared APIs inevitably overlap. With this comes the danger of a small change breaking large amounts of code. This raises the question of how we, as developers working on projects of this scale, can ensure that the code we write not only works within the context in which we are working, but also doesn’t cause bugs in other parts of the system. (Getting it to compile always seems to be the easy part!)
Here at Sumo Logic we currently have over 60 developers working on a project with over 700k lines of code [1] split up into over 150 different modules [2]. This means that we have to be mindful of the effects that our changes have on a module and of the effect that the changed module has on other modules. This gives us two options. First, we can try really, really hard to find and closely examine all of the places in the code base that our changes affect to make sure that we didn’t break anything. Second, we can write tests for every new piece of functionality in our code. We prefer the second option, not only because option one sounds painful and error-prone, but because option two uses developers’ time more efficiently. Why should we have to waste our time checking all the edge cases by hand every time we make a change to the project?
For this reason, at Sumo Logic we work by using the methods of test-driven development. (For more on this see test-driven development.) First we plan out what the end functionality of our changes should be for the system and write both unit and integration tests for our new functionality. Unit tests target edge cases and the core functionality of the code change, while integration tests exercise end-to-end functionality. Since we write the tests before we write the new code, the updated tests fail the first time we run them. But this is actually what we want! Now that we have intentionally written tests that our code will fail without the new functionality, we know that if we can manage to pass all of our new tests as well as all of the pre-existing tests written by other developers, we have succeeded with our code change. The benefits of test-driven development are twofold: we ensure that our new functionality is working correctly, and we maintain the previous functionality. Another instance in which test-driven development excels is in refactoring code. When it becomes necessary to refactor our code, we can refactor with ease, knowing that the large suites of tests that we wrote during the initial development can tell us whether we succeeded in our refactoring. Test-driven development calls this cycle red-green-refactor, where red means failing tests and green means passing tests. Rejoice! With well-written tests we can write new code and refactor our old code with confidence that our finished work will continue to function without introducing bugs into the system.
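To make the red and green steps concrete, here is a minimal sketch of one cycle. It is written in Python purely for illustration and is not taken from our code base; the function `parse_log_level`, its stub, and the helper names are all hypothetical. The same tests are run first against an empty stub (red) and then against the simplest implementation that satisfies them (green).

```python
import unittest


def parse_log_level_stub(line):
    """Red step: the function exists, but the behavior is not written yet."""
    raise NotImplementedError


def parse_log_level(line):
    """Green step: the simplest implementation that passes the tests.

    Hypothetical example: returns the bracketed level found in a log line,
    or None when the line carries no recognized level.
    """
    for level in ("ERROR", "WARN", "INFO", "DEBUG"):
        if f"[{level}]" in line:
            return level
    return None


def make_test_case(func):
    """Build the same test case against whichever implementation is given."""

    class ParseLogLevelTest(unittest.TestCase):
        def test_core_functionality(self):
            self.assertEqual(func("12:00:01 [ERROR] disk full"), "ERROR")

        def test_edge_case_no_level(self):
            self.assertIsNone(func("plain text line"))

        def test_edge_case_empty_line(self):
            self.assertIsNone(func(""))

    return ParseLogLevelTest


def run_suite(func):
    """Run the tests quietly and return True only if every test passes."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(make_test_case(func))
    result = unittest.TestResult()
    suite.run(result)
    return result.wasSuccessful()


if __name__ == "__main__":
    print("red:", run_suite(parse_log_level_stub))  # prints "red: False"
    print("green:", run_suite(parse_log_level))     # prints "green: True"
```

Once the suite is green, the body of `parse_log_level` can be refactored freely, and rerunning `run_suite` tells us whether the refactoring preserved the behavior.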
Despite all of this testing, it is still possible for bugs to slip through the cracks. To combat these bugs, here at Sumo Logic we run our product through multiple testing environments before it is deemed to be of a high enough standard to be released to our customers. We have four different deployments: three for testing and one for production. (For more on this see deployment infrastructure and practices.) Our QA team performs both manual and additional automated testing on these deployments, including web browser automation tests. Since we need a sizable amount of data to test at scale, we route log files from our production deployment into our pre-production environments. This makes us one of our own biggest customers! The idea of this process is that by the time a build passes all of the unit and integration tests and makes it through testing on our three non-production environments, all of the bugs will be squashed, allowing us to provide our customers with a stable, high-performing product.
[1] ( find ./ -name '*.scala' -print0 | xargs -0 cat ) | wc -l
[2] ls -l | wc -l