Patterns of API Virtualization
When Christopher Alexander wrote A Pattern Language in 1977, he was looking for a more powerful way to describe how towns and buildings were laid out. These patterns would allow architects, builders and planners to work together, to use the same words, mean the same thing, and create systems that were beautiful and worked, instead of more urban sprawl.
Twenty years later, Gamma, Helms, Johnson and Vlissdes took the pattern idea and applied it to object-oriented software, which at the time was struggling to figure out how to create windows-based applications.
Today the struggle is figuring out how to break software into small components that can be tested independently, and then having those components interact, typically over internet protocols. Raw SQL commands are giving way to service oriented systems that interact through APIs, sometimes all within one company, sometimes outside with Microsoft, Google, Amazon, or other APIs like a manufacturing company or supplier. While I do not claim to be Christopher Alexander or the Gang of Four, I am seeing some patterns emerge - a set of solutions to a defined problem - and would like to share a few of those today.
What do you mean API?
Alistair Cockburn’s Hexagonal Architecture (below) presents a way to think about APIs. The application we want to develop is in the middle and has a set of adapters to the external world. Those adapters might be an API we expose, like a ‘search’ interface to an online catalog, or the API’s we call, including the database, an email gateway, or the ‘permissions’ service, to see what types of search results we should show to this user.
Cockburn’s Hexagonal Architecture gives us two ways to think about APIs: Our own, and the services we call. (Source: http://alistair.cockburn.us/Hexagonal+architecture)
That’s a lot of APIs. Let’s explore about some ways to virtualize these services - and why.
Automated Build and Continuous Integration
Say, for example, you are working on a piece of software to analyze trending terms on social media - such as a customer complaint that is being liked and tweeted. You want companies to find these problems when they start to trend up, then reach out to the customer and solve it, or, perhaps, reach out to say “thank you” and amplify it. Modern build systems, like Jenkins, TFS, and TeamCity can compile, deploy, and even run the system to check for known scenarios.
The trouble is those pesky adapters to external systems, like Twitter and Facebook. The software could do its job, but there is no way to know if the application is correct in its guesses about trends and importance. Getting the data from the providers can turn a quick build into a slow process that uses a lot of network traffic.
By recording and storing known answers to predictable requests, then simulating the service and playing back known (“canned”) data, API Virtualization allows build systems to do more, with faster, more predictable results. This does not remove the need for end-to-end testing, but it does allow the team to have more confidence with each build.
Performance Testing Your Application
Like build/deploy systems, performance testing the application (the inside of the hexagon) with live, external services can cause problems. All that extra traffic can cause problems with the actual company network infrastructure; it could cause bandwidth problems at the point of the ISP. Some 3rd Party APIs charge a micro-fee per transaction, or limit bandwidth. Many of them lack a ‘test’ sandbox to develop in, so performance testing could interact with real, production work.
Standing up a virtual server to return pre-planned data means you can performance test your application - not the third party - prevent bandwidth throttles, not step on production data, and avoid paying fees intended for real (production) use that is actually being used to test our environment.
Avoid Integration Environment Inconsistency
A few years ago I worked at a large organization that was wrapping old code in proxy services, so they could be consumed by other teams. Login, add-to-cart, search catalog, create custom catalog, permissions, all of it was possible to access through API calls, most of it as simple as a web URL that returned some text.
The problem was the “System Integration Test” environment, or SIT. Every team tested its services in SIT, which meant about a third of the time, something was broken. After finding a bug in the current build, we would track it back to the catalog service, walk over to that team, bring up the issue, and they would say “thanks, we are testing a new build of catalog.”
We expected catalog to work in SIT. Anything else meant a waste of someone’s time. Automated tools reporting false errors were even worse. When teams performance tested their services, everything calling the service got slow, if it worked at all.
By virtualizing services we could test our application end-to-end against known data, without the troubles of SIT, or having to build additional expensive test-lab-like copies of production. Best of all, creating the virtual services is a snap - just record the live service with a tool and instruct it to play back similar requests.
Flip Integration Tests from Virtual To Real for Final Checking
All this API virtualization creates a risk that the team will move from test to production and something will be different between the Virtual API and the live one. If the Virtual API server is just returning the same thing product did when we recorded it and we have automated checks in place, we can change our test server to point to the real service and re-run all the automated checks.
As long as the source data hasn’t changed and we are reading, not writing, from production, the checks should all pass. If the production API has changed, we will get failures, and they will be easy enough to fix and retest.
Simulate Slow or Unresponsive Service In The Middle Of A Long Running Transaction
Sometimes you want to test if a server is overloaded or down. Calling Facebook and asking them to turn off their servers is unlikely to work; even just coordinating with the team down the hall could create a lot of overhead. You also might want to test this often - every day or every hour - and manually pulling a plug or coordinating with the Login team every hour might not be realistic.
The trick is to bring the service down once and record the exact behavior of the system, then use a virtual server to simulate that behavior, over and over again, every day. That means you’ll get the exact behavior, not a guess, and know exactly how the application under test can deal with it.
Early Development of System against an Undeployed API
Sometimes the API you are testing against does not exist, even in test. It’s still possible to create a Virt (virtual API) which returns some roughly equivalent data, and makes it possible to move forward on the core application without introducing new risks.
Avoid Configuration and Copying Hassles
Many companies use a test system that is a copy of production, and then refresh the system periodically. Sometimes, you want test scenarios that do not exist in production, so you have to create them … and lose them during a refresh. The same problem happens with 3rd party APIs, when, for example, a part is discontinued, and you are testing ordering that part, or the sample person you check for insurance coverage leaves the company.
If the request for the part of the coverage goes through an API, you can record known good results that don’t change, even after a database refresh - then leave the real, end-to-end testing for an exploratory step that will be lighter, quicker, more accurate, and have more confidence.
A Fistful of Techniques
Today we discussed a half-dozen common patterns to API virtualization, mostly around testing systems in isolation that consume data through an API, like a 3rd party or an internal service. These ideas are new, and evolving. What are a few of your favorites?