Service Integration

Since the 90s we know that Service Integrations via a shared file storage should be avoided. The system cannot be tested very well, debugging is hard and data not trustworthy as it’s manipulated by many sources.

Surprisingly, what we QAs of ThoughtWorks have observed in different teams with different clients over the past few months is a very strong tendency to service integrations via S3 buckets in Amazon Web Services (AWS). Many people (e.g. while Inceptions at the beginning of a project) that were working for our clients for many years actually suggested to read from / write to existing buckets. The main reason for those buckets to be widely used and accepted throughout the organizations were the data science projects that were starting or even already established in the companies.

Hence, the justifications we heard for such an integration between services was along the lines of “the data is there anyways” and/or “the data scientist will also need this data”. But being a data-centric or even a data-driven company should not compromise the way the integrations are built on the technical side. Doing “Data Science” does not require anti-patterns in architecture.

Also, we believe that this setup makes it actually more difficult to do accurate data science. No one could point us to a resource that helped to get an overview of what data is stored where. It’s unclear which data is used for what. Even worse, it’s not always possible to connect any two data stores in the data analysis. If we think of more than a couple of services, the integrations via S3 buckets looks like in the following diagram:

All of those services integrate via some S3 buckets. Those buckets are distributed throughout the architecture. Most of them already have old data but no one is really sure if it’s old data or still used as they have multiple read & write accesses and there is no way to keep an overview. This results in the current big data swamps we are working with. Mind the plural: swamps. Also, we don’t see them as clean lakes. Those buckets are sometimes not connected at all and the data is a pile of mud. This is what we reference to as a “data swamp”:

As QAs, we like to have clean integrations that can easily be tested and a clean flow of data through the system that enables debugging. It does not necessarily need to be predictive for big data, but an understanding of what type of data is flowing from Service A to Service B would undoubtedly be helpful. It is very hard to write integration- or maintainable end-to-end tests for those kinds of architectures with integrations of that kind. That is obviously one of the main motivations for us QAs to look more into this issue.

We advocate for a clean data storage for the data engineers and in our role we are actually in a good spot to do this: QAs typically care more than other people about Cross Functional Requirements (CFR). Storing masses of data in a proper system is as much a valid CFR as looking out for security, performance or e.g. usability.

Hence, in the last weeks we started discussing and we are wondering where this (quite enthusiastic!) buy-in for old anti-patterns is coming from. We are trying to understand more of the root causes of this type of architecture (besides doing “data science”). Does it come down to the typical “it’s quick to develop”, “we all understand the technology” or “we did not know what else to use”?

A more sustainable approach would obviously be to enable services to directly communicate with each other and not via shared file storage. That would allow any client to have stateless services that each store just the information they need.

That would allow us to build a more scalable system. It would allow us to test the integration between services much better – or at all for that matter! It does not necessarily all be via well defined APIs in synchronous calls. When one providing service has many consumers which only need new information at some point of time or if you have synchronous calls are cascading all through your system you may want to use something like the AWS simple notification service (SNS) instead (or Kafka to not just talk amazon here or rabbitmq or…). But any separation that is more structured will enable a company to actually scale faster.

But implementing APIs alone won’t do the trick. This suggestion for improvement has the base assumption of well split domains. I explained this in detail in this blog post.

Once you build a more structured architecture and have clean integrations, not only the future development of this application will benefit, there will also be a huge improvement for the the data scientists. When each service stores the relevant data on its own one can “simply copy”¹ the data and “do” any science on it. In one single place.² That would allow the data people to work on their own without the need of product teams building reports and data pipelines.


Footnotes:

  1. “simply copying” the data may also involve some queues/messages. It should not be a local bash script that is literally copying. For static data analysis you won’t need real-time data anyways, in this case it’s fine if you need some minutes to aggregate, process and store the data.
  2. “in one single place” may actually be different data stores: one for public information (like flight dates), one for not public information (like usernames/user identifiers) and yet another one for information that even most people of a company should not have (like financial data or user passwords).

How To: Consumer Driven Contract Tests

In my last post I wrote about how to test micro services. When I wrote about the most difficult part of the testing, the integration, I said that it would blow the size of a post to go into the necessary details. I am sorry and I want to make up for it here.

When we design and create the interaction (and thus integration) of different services with each other we rely very much on the principles of “Consumer Driven Contracts“, in short CDCs. These Contracts can be tested. While the basic idea seems simple (“let other test the things of our service they think they need”) it implies a change in the way how we think of who-tests-what. So I want to spend some time explaining the concept step by step in many pictures:

Imagine you are part of a team (the blue team) and work with another team (the purple team). The purple team’s service will ask the service of the blue team about the email address of a specific user (ID). They agreed – for the sake of the argument – to communicate in json format via a RESTfull interface. In other words: the consumer (purple team) has a contract with the provider (blue team).

Now our friends in the purple team want to make sure, that the blue team will always give them an email as an answer. The first important thing is that the purple team needs to trust the blue team in figuring out the right email address for the given user. If the purple team starts to validate the blue team’s results for the actual content you have a lack of trust. This is a different, much more severe problem. Writing CDCs does not help you to solve this.

Hence, the purple team tests that they get any answer and that it contains an email address (at all). They will not test which one.

Screen Shot 2019-02-15 at 20.46.36.png

So assuming the blue team does it all right, the purple team would then just write a test that sends any ID and gets back any email in a valid json.

Now, whenever the purple team wants to know if the contract is still valid they let the test run. But then think about when the contract would potentially break: in this picture the contract will only be broken if the blue team introduces (unintended, not communicated) changes to their service. So isn’t it a much better idea to run this specific test whenever there is a change to the service of the blue team?

Screen Shot 2019-02-15 at 20.47.55.png

And this is what we do: the purple team – the consumer – drives the the contract to make sure it holds up to the agreed functionality of the blue team’s service. In this way – having the test of the purple team in the CI-System of the blue team – the blue team can make sure to not accidentally break the contract.

A few days later, when everything works smoothly and both teams feel comfortable, the blue team does not need the purple team any more to do a release. The blue team knows that the test they received from the purple team discovers all flaws. The purple team can concentrate on something else.

Screen Shot 2019-02-15 at 20.50.14.png

This is it! This is the basic idea of the CDC. The purple team writes and maintains the test. But the blue team executes it. If the request would go the other way around (the blue team sends an email and expects back an ID), then the test would be written by the blue team and executed by the purple team.

But this is only the start. Of course, the blue team has more than only one service. And there is not just the other purple team, but many teams doing requests to the different services of the blue team. Now the blue team members have to be quite disciplined and ask the other teams for all the CDCs. This sound a bit like an overhead and quite exhausting. But here comes the big benefit: the blue team is in all its work (and more important: deployments) independent of the other teams and independent of the availability of the other teams services in test environments.

Anyone who tried to get multiple services of various teams running in different (test) environments will smile now knowing of all the pain this usually inflicts. In this picture the blue team mitigated this risk/problem.

Screen Shot 2019-02-15 at 20.48.57.png

While the blue team is doing quite well there is more to it. Being independent of other teams and their services is nice. But still you have dependencies between your own single services. You should also resolve those. Cover the connections between the services of the team with CDC tests, too. It will give you:

  • fast feedback (surprise!)
  • independent deploys
  • reduced complexity

Each of those three points on its own leads to a quicker development of features and a reduce in cycle time. As I get the combined benefit in this situation there is nothing I would argue against, so lets do it!

If all tests are in place, each service of the blue team is totally independent. It can be deployed at any time and still make sure the other services work properly.

Screen Shot 2019-02-15 at 20.49.46.png

And did you realize what this means? We do not need a big, heavy, error prone and expensive set of selenium end-to-end tests. We mitigated so many risks already and can be sure of interactions that are working.

You cannot test anything here, but certainly a lot. In any case, improving a lower layer of the testing pyramid usually implies that you can reduce something at the top. That means: less end-2-end test => less time for maintenance => more time to grow out of the role as a classic test manager.

Let’s embrace this!

A Guide to Testing Microservices

Over the last couple of months I have often been asked about micro services, especially how to test them. Building a small (e.g. a micro) service usually requires a different approach towards the architecture and towards the services themselves. In many cases also the team structures changes (like Conway’s Law, just the other way around). The question that arises then is how we can “compensate” for it while testing. Or to be more precise: how to adopt our way of working to ensure a high quality software.

There are three major themes that need to be taken care of when we think about testing micro services:

  1. The many times referenced testing pyramid principle apply here as much as for a monolith. And not just for one single service. The principles are as important when it comes to testing different service’s interactions.
  2. Automation plays a key role in order to speed up your build and deployment pipelines. Maybe this applies even more for micro services than for other architectures.
  3. Its not easy to start with a good micro service landscape and there certainly is some overhead in setting things up (the first time). But the benefit of small and independent services is the fast feedback you can receive.

The second and third points are mentioned most of the time, when someone talks or writes about working with micro services. The testing pyramid is not really new and there is not too much detail about the testing of those little services. So we will look into this even more, as indeed, there are some pitfalls you should avoid in order to have a properly testable service landscape.

If we have a look at a monolith, things are straight forward: it is pretty clear where and how to apply the testing pyramid. The basic components of our services are unit-tested, their interaction is integration tested and then there is probably one end to end test around the entire service.

Screen Shot 2019-02-15 at 20.52.18.png
The testing pyramid for a micro service. The scheme of the service is by Toby Clemson from martinfowler.com

Now a “micro” service is (in this abstraction) nothing but a small monolith. In other words: treat the service itself as you always did. Be sure that the unit test coverage is really good. Find the places where you need to test the integration and then put as few as possible (in some cases: 0) end-to-end tests on top.

Up to here, there is nothing breaking new. But any project dealing with only 1 micro service has no big issues with testing. The real problems arise, once you have two, three or many services. The big difference to the monolith-world is that you have a lot more interaction via the HTTP clients. Lets have a look at an example.

Lets assume that the product we want to build is some kind of messaging system. There are three mayor features (see green boxes in the image):

  1. The dashboard page where the user is going after login and sees a greeting. On this dashboard, also unread messages are displayed.
  2. The inbox itself, all messages can be viewed here
  3. The entire messaging system has some super strong encryption.
Screen Shot 2019-02-15 at 20.53.08.png

Although those features are very clear, we have a look at the business domains. The first domain we can identify is the user (“Finn”, the black circles). The user has a dashboard and an inbox. The second domain are the messages (yellow circles). They are displayed on the dashboard and in the inbox. The messages are also encrypted. The third domain is the encryption itself (blue circle). The encryption only works with messages.

Those circles represent the domains. If we cut our system according to the circles, we will end up with a clear domain for every service.

Service A deals with the user (and probably owns the html). Service B takes care of the messages. And Service C is doing the de-/encryption. For each single one of those services we apply – as described the testing pyramid. Then we put them together and view the entire system. And we consult our pyramid once more:

Screen Shot 2019-02-15 at 20.53.48.png
(side note: the level of tests we are discussing here is completely independent of the tools you may use. The integration tests can be PACT tests or CDCs, they could be written in Selenium or you could use Appium. But we do not want to talk about single tools here, rather about the general concept)

The single service itself is the unit. We know its properly tested and the unit is working as expected. There are a lot of tests in place to make sure this is the case. The base of the pyramid is ok.

The integration becomes very tricky. This is new and was/is not needed in the world of monolithic applications. There is one thing you really – really! – need to be aware of: test the interaction itself – and not two services.
If you have to get two services up and running to make sure that one is working properly it does not sound too bad. But if you scale and you have to get 27 services up and running to test one it just will not work. Too much complexity for a test run.
The challenge is that you will want (need) to test the integration of “your” services without the other ones. An approach that works really well are the consumer driven contract tests. It will blow the scope of this post to explain the concept here again – so I wrote a different post only about those CDCs. Make sure to dig into this!

If you got the trick with the integration, then the rest is “easy”. You will run the standard set of end-to-end tests. And once you have a good feeling about the integration and contract tests, then you will probably also reduce the amount of costly end-to-end tests. If you got all of this in place it would be perfect.

So much for the theory.

While this seems to be a simple principle, the real world looks very different. In many cases I have seen micro services being carved out of an already existing monolith. Furthermore, this usually happens under time constraints and has to be done fast.

This is exactly where a major error occurs, which afterwards propagates through the entire software development cycle: When teams start with the first, small service, the business domain is often unclear. As a result, the system with the Services A, B and C is designed not as above but very different. Remember the three major feature of our examples? Giving just a quick thought about cutting the services, many teams will cut their architecture by those features:

Screen Shot 2019-02-15 at 20.54.43.png

One service handles the dashboard (and thus needs to know about user and encrypted messages, black circle). On service handles the inbox and displays messages (and thus need to know about the user, too, as well as the encrypted messages, yellow circle). The encryption holds/stores all the messages (blue circle).

That means, that the domains of “messages” and “users” is not contained to one service, but instead propagates through the system. And this is the critical point: if we now have a look at how to unit test those domains, the units spread across the services. We need two or three services to write the unit tests. Writing integration tests becomes incredibly complex – or more accurate: hell. Then its also no wonder that some people then tend to leave the integration tests aside and rather cover the entire system by end-to-end tests. The result will look similar to this:

Screen Shot 2019-02-15 at 20.55.31.png

In this situation, the tests are most likely very unstable: as already indicated in the picture there is no clear scheme of what to test where. The integration point of the domains are randomly somewhere in the services. It will be difficult to mock things. If all services are always needed to be available it usually leads to “flakiness” of tests in pre-production environments and it becomes very hard – if not impossible – to test on a local machine. For every fail of the end-to-end tests someone needs to check the error log, in order to find out if it is a “real” error or not. Let me repeat, this is gonna be incredibly complex – or more accurate: hell.

If we then start to automate the entire thing… (you remember: fast feedback and so on) we will end up with a workload that is way beyond what we experienced with the monolith. At this point of time it is perfectly reasonable to do a reality check, whether or not things became easier with the micro services. For all teams that chose this “approach” that I have worked with, we actually figured that we did not improve compared to a monolith. But realizing this is important and valuable:

If your finding is, that it got more complicated then better stop and think how to improve. Because otherwise, as soon as you start to build more and more services and scale your application, it will only get worse. Thus, do yourself a favor and spend a lot of time thinking about the domain split. It will be easier tot test – and scalable!

Screen Shot 2019-02-15 at 20.56.13.png

And at last, the biggest advice I can give to people – teams (!) – that start working with smaller services is to

“Test” in time.

What do I mean? The team needs to be involved early:
Engage with the people designing the services. Talk to the business and understand the domains your are about to form in your services. And make sure that all people in your team have a clear understanding what you are doing and why you are doing this. This is the time where our role stretches far out of the testing area. This allows all of the former testers to grow out of the test manager role. We can make a real difference on the flexibility that our service landscape will provide to our business.  Please make sure, that your team and your business benefits from the approach to micro service. Show them where the pitfalls are. And guide them.

To summarize:

  1. Apply the testing pyramid. Make testing cheap and reliable. (as usual)
  2. Automate everything (that you can). Each manual step is a real show blocker in a micro service environment.
  3. Make it fast. Value the fast feedback from quick running pipelines. Be flexible.
  4. Start in time. Get a clear picture about your business domains and how you think it is – should be – reflected it in your services.

The beauty is: once testing is easy and helpful its a real cool amplifier when you continue to build and improve your system. Have fun 🙂