Testing Data-driven Microservices
Testing always raises a lot of questions: Which cases should we test? What are the edge cases? Which testing platform should we use? And so on.
There isn’t a single answer to any of those questions.
But when it comes to testing microservices, the complexity goes up a notch. Data services often process massive batches of highly varied data. On top of that, a microservices architecture passes the data between components (MQ, database…), and each hop can degrade it in ways that are hard to detect. Data integrity thus becomes a third challenge, alongside velocity and diversity.
What’s more, data services commonly perform many computations, e.g. Min, Max, Chi-Square, Log loss, Std. When these are calculated over massive batches, errors start to appear in granular parts of the data.
Moreover, in a microservices architecture we want to ensure the integrity of the application as a whole. Passing data between many components (MQ, database…) sometimes damages it without anyone noticing.
For example, a value streamed through the MQ between two microservices may get rounded, or a casting error may occur, and neither is caught as an error.
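To make the "silent" damage concrete, here is a minimal sketch in Python. Both functions are hypothetical stand-ins, not part of any real pipeline: one assumes a producer that serializes values as 32-bit floats before publishing them to the queue, the other assumes a consumer that casts a value to an integer column. Neither raises an error, yet both change the data.

```python
import struct


def through_float32_queue(value: float) -> float:
    """Hypothetical hop: the producer packs the value as a 32-bit float."""
    packed = struct.pack("<f", value)       # 64-bit Python float squeezed into 4 bytes
    return struct.unpack("<f", packed)[0]   # the consumer reads it back


def to_int_column(value: float) -> int:
    """Hypothetical hop: the value is cast to an integer column."""
    return int(value)                       # truncates toward zero, no warning


original = 1234567.891
received = through_float32_queue(original)

print(original == received)      # the value changed in transit
print(abs(original - received))  # small drift, easy to miss in a large batch
print(to_int_column(41.99))      # the fractional part is silently dropped
```

A unit test with "nice" inputs such as 1.5 or 42.0 would pass through both hops unchanged, which is why this kind of drift tends to surface only on real data.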
Altogether, these errors are so hard to find that selecting the right edge cases at the testing stage becomes a mission (almost) impossible. This is what I want to address, and attempt to simplify here.
Let’s talk about an amazing tool used by most data science teams today: Jupyter Notebook. It offers an efficient interface for data analysis and exploration through interactive visualization. After implementing all the computational functions inside a Jupyter Notebook, I found it much easier to select edge cases, because they become clear and almost effortlessly visible.
A concrete example of the use of Jupyter Notebook for testing
In the example below, we test data microservices with a Jupyter Notebook. We calculate all the computational functions inside the notebook and compare them with the microservices’ results stored in the database, which makes it easy to detect and fix any discrepancies.
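The comparison step could look roughly like the following notebook cell. This is only a sketch under stated assumptions: the `service_results` table, its schema, the sample batch, and the helper `find_discrepancies` are all hypothetical, and an in-memory SQLite database stands in for the real one.

```python
import math
import sqlite3
import statistics

# Hypothetical batch that the microservice has already processed
batch = [12.5, 7.25, 19.0, 3.75, 15.5]

# Stand-in for the service's results table (assumed schema)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE service_results (metric TEXT PRIMARY KEY, value REAL)")
conn.executemany(
    "INSERT INTO service_results VALUES (?, ?)",
    [("min", 3.75), ("max", 19.0), ("std", 6.1)],  # "std" got rounded upstream
)

# Reference values computed independently inside the notebook
expected = {
    "min": min(batch),
    "max": max(batch),
    "std": statistics.stdev(batch),
}


def find_discrepancies(conn, expected, rel_tol=1e-9):
    """Return {metric: (expected, stored)} for every metric that does not match."""
    stored = dict(conn.execute("SELECT metric, value FROM service_results"))
    return {
        name: (ref, stored.get(name))
        for name, ref in expected.items()
        if stored.get(name) is None
        or not math.isclose(ref, stored[name], rel_tol=rel_tol)
    }


discrepancies = find_discrepancies(conn, expected)
print(discrepancies)  # only "std" is reported
```

Comparing with a tolerance rather than `==` is a deliberate choice here: it ignores harmless floating-point noise while still flagging real rounding damage. The right tolerance depends on the metric.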
Once the testing notebooks are written, they should be integrated with the CI, so that nobody has to run the tests manually every day.
One cool framework for running notebooks as part of the CI is “Papermill”, a great tool for parameterizing and executing Jupyter Notebooks.
It lets you run a notebook from the CLI and saves the notebook’s outputs so they can be diagnosed later.
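As an illustration, a CI step might wrap the Papermill CLI like this. The notebook paths and the `batch_date` parameter are made up for the example; the only Papermill features assumed are the basic `papermill INPUT OUTPUT` invocation and the `-p name value` option for passing parameters.

```python
import subprocess
from typing import Dict, List


def build_papermill_command(
    input_nb: str, output_nb: str, params: Dict[str, str]
) -> List[str]:
    """Build the CLI call: papermill INPUT OUTPUT -p name value ..."""
    cmd = ["papermill", input_nb, output_nb]
    for name, value in params.items():
        cmd += ["-p", name, str(value)]
    return cmd


def run_notebook(input_nb: str, output_nb: str, params: Dict[str, str]) -> bool:
    """Execute the notebook; requires Papermill to be installed on the CI agent."""
    result = subprocess.run(build_papermill_command(input_nb, output_nb, params))
    return result.returncode == 0  # a non-zero exit code is treated as a failed run


# Hypothetical notebook paths and parameter
cmd = build_papermill_command(
    "tests/check_stats.ipynb",
    "output/check_stats.out.ipynb",
    {"batch_date": "2020-01-01"},
)
print(" ".join(cmd))
```

The executed copy written to the output path keeps every cell’s output, so a failed run can be opened and inspected like any other notebook.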
Finally, we can configure the testing notebooks to send a notification to our Slack channel whenever their results show discrepancies, which makes the testing pipeline fully automated.
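One possible way to send such a notification is a Slack incoming webhook, which accepts a JSON body with a `text` field. The sketch below assumes that setup; the message format, the function names, and the shape of the `discrepancies` dictionary (`{metric: (expected, stored)}`) are illustrative choices, and the webhook URL would come from your own Slack configuration.

```python
import json
import urllib.request


def build_slack_payload(notebook: str, discrepancies: dict) -> dict:
    """Turn a notebook's comparison result into a Slack message body."""
    if not discrepancies:
        return {"text": f"PASSED: {notebook}"}
    lines = [
        f"- {metric}: expected {expected}, got {stored}"
        for metric, (expected, stored) in sorted(discrepancies.items())
    ]
    return {"text": f"FAILED: {notebook}\n" + "\n".join(lines)}


def notify_slack(webhook_url: str, payload: dict) -> int:
    """POST the payload to an incoming-webhook URL and return the HTTP status."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),  # passing data makes this a POST
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


# Hypothetical result of one testing notebook
payload = build_slack_payload("check_stats.ipynb", {"std": (6.15, 6.1)})
print(payload["text"])
```

Keeping the payload builder separate from the network call means the message format can be checked without contacting Slack.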
Diagram of the whole pipeline:
Testing platform architecture
In the diagram, Jenkins triggers the CI job that starts Papermill, which runs all the testing Jupyter notebooks. Once the notebooks finish running, their outputs are stored in AWS S3, and each notebook’s result (passed or failed) is sent to Slack.
Now that the whole architecture is clear, let’s mark the benefits of testing with Jupyter Notebook:
✓ Runs over a massive batch of data.
✓ Runs as part of the CI and ensures application integrity.
✓ Supports prompt diagnosis and data exploration.
✓ Gives clearly visible insights into errors and edge cases.
✓ Easy to debug.
Now, let’s perform a simple “getting started” task for testing with Jupyter Notebook and Papermill.