Which API Metrics Even Matter?
Software measurement is a tricky business. When we all agree on what we are trying to count, like pieces of candy in a jar, it is simple. Sadly, software isn't candy in a jar.
We try to count things like bugs, lines of code, or the number of stories completed in a sprint, but we never agree on what any of those things are in the first place. I have been in trouble more than once in my role as a tester because my performance was judged on the number of bugs I reported in a sprint, or the number of test cases I wrote for a release.
Metrics can be dangerous, but we can also use them to build a more complete story about our software.
Let's take a look at how metrics and APIs can go together.
Learning about our customers -- how they work, how software will help them be better, and what they value -- is a good start for building a product. I like knowing that we are creating new features for real people instead of personas or descriptions on a story card. There are usually important things missing though, even when I have been able to go straight to the source. Getting feedback once that code is in production can be a whole other challenge.
Gathering data from API monitoring tools about what our customers are doing can help me collect information I might have never found.
There are a couple of data points I have found to be useful -- traffic source and user type. Most companies I have worked with in the past several years have had some sort of mobile app. Maybe it wasn't the main line of business, but it was there and it was important. Traffic source data was a way for us to see how many people were using which parts of the product, and at what points they went from mobile to desktop. When we combined that with user type, we could focus in a little more on who was spending the most time where.
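As a rough sketch of what combining traffic source with user type can look like, the snippet below tallies requests by those two fields. The record layout and field names (`endpoint`, `source`, `user_type`) are assumptions made up for illustration; a real monitoring tool will export its own schema.

```python
from collections import Counter

# Hypothetical request records. The field names ("endpoint", "source",
# "user_type") are assumptions for illustration, not a real tool's schema.
requests_log = [
    {"endpoint": "/orders", "source": "mobile", "user_type": "manager"},
    {"endpoint": "/orders", "source": "desktop", "user_type": "manager"},
    {"endpoint": "/reports", "source": "desktop", "user_type": "manager"},
    {"endpoint": "/orders", "source": "mobile", "user_type": "clerk"},
    {"endpoint": "/orders", "source": "mobile", "user_type": "clerk"},
]

def usage_by_source_and_type(records):
    """Count requests for each (traffic source, user type) pair."""
    return Counter((r["source"], r["user_type"]) for r in records)

def usage_by_endpoint_and_source(records):
    """Count requests for each (endpoint, traffic source) pair."""
    return Counter((r["endpoint"], r["source"]) for r in records)

usage = usage_by_source_and_type(requests_log)
for (source, user_type), count in usage.most_common():
    print(f"{source:<8} {user_type:<8} {count}")
```

Grouping by endpoint and source in the same way shows which parts of the product get used from which kind of device.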
This is one place where the magic of software monitoring tools comes into play. An API, or at least one that gets good usage, throws back a constant stream of data in the form of HTTP responses, and sometimes more than that if the developers design for deeper feedback.
There are a lot of different types of data we can collect related to performance -- server data like available memory and CPU cycles, load time data and latency in the browser, and database response time. All of these are part of the larger puzzle that is how fast or slow your product is responding. Each time I run a performance scenario and notice a difference, the big questions are how big the change is and whether it matters. Sometimes the answer is that the change doesn't matter at all. Collecting that data gives us the chance to make those decisions ourselves instead of waiting for the customer to tell us when something is wrong.
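One way to put a number on "how big is the change" is to compare the median response time of a baseline run against the current run. This is only a sketch: the sample timings are invented, and the 10 percent threshold is an arbitrary assumption that each team would need to pick for itself.

```python
from statistics import median

def percent_change(baseline, current):
    """Percent change in median response time from baseline to current."""
    base = median(baseline)
    return (median(current) - base) / base * 100

# Hypothetical response times in milliseconds from two performance runs.
baseline_ms = [120, 130, 125, 128, 122]
current_ms = [150, 155, 149, 160, 152]

# Assumed threshold: changes smaller than this are treated as noise.
THRESHOLD_PERCENT = 10

change = percent_change(baseline_ms, current_ms)
worth_a_look = abs(change) >= THRESHOLD_PERCENT
print(f"median changed by {change:.1f}%, worth a look: {worth_a_look}")
```

The median is used here because a single slow request skews it less than it would skew an average.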
Imagine you are running a Black Friday ad a few days before Thanksgiving Day to try to drive people into your stores instead of your competitors’. You want that page to be fast enough that your potential customers don’t give up, but checking the site every few minutes isn’t realistic.
Over the course of those few days, metrics like median throughput, API call volume, and throughput distribution by endpoint can help shape our understanding. This information might point to one Amazon EC2 instance running slow that needs load balancing, or maybe a virtual machine that needs better memory allocation. Either way, the data collection may have saved the ad from being inaccessible at a bad time.
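To make those three metrics a little more concrete, here is a minimal sketch that computes them from a list of logged calls. It assumes throughput means calls per minute, and the `(minute, endpoint)` sample format and endpoint names are hypothetical.

```python
from collections import Counter
from statistics import median

# Hypothetical samples: one (minute, endpoint) pair per API call.
calls = [
    (0, "/ad"), (0, "/ad"), (0, "/stores"),
    (1, "/ad"),
    (2, "/ad"), (2, "/ad"), (2, "/stores"),
]

def call_volume(samples):
    """Total number of API calls observed."""
    return len(samples)

def median_throughput(samples):
    """Median calls per minute, over minutes that had at least one call."""
    per_minute = Counter(minute for minute, _ in samples)
    return median(per_minute.values())

def share_by_endpoint(samples):
    """Fraction of all calls that went to each endpoint."""
    per_endpoint = Counter(endpoint for _, endpoint in samples)
    total = sum(per_endpoint.values())
    return {endpoint: count / total for endpoint, count in per_endpoint.items()}

print(call_volume(calls), median_throughput(calls), share_by_endpoint(calls))
```

Minutes with no calls at all do not appear in the samples, so this version overstates the median during quiet periods. A fuller version would fill in those empty minutes with zero.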
Software performance is a tricky thing. There are a lot of different aspects to think about, like the database, server resources, code optimization, and the technologies used in the user interface. With performance testing efforts, the feedback loop is usually at the release level at best. So every two or more weeks, new information about how performance has changed comes into view. We get that information in the middle of a probably already stacked sprint, so the earliest something gets done is the next sprint. That is a long lag between discovery and action.
I test APIs so that hopefully when we push to production everyone is happy and nothing unexpected happens. Software is tricky though. We can never imagine all the ways our customers will use our products and the ways things might go wrong. And so occasionally things do go wrong: pages don't submit when someone clicks the save button, and lists of data fail to load. One of the nice things about APIs is that stream of data we get back. Each time a page fails to submit, or a list of data doesn't load correctly, we get some data back.
Seeing an error once might be a fluke. Seeing an error several times, or at the same time every day, might be something interesting to look into.
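Spotting an error that shows up at the same time every day can be as simple as grouping errors by hour and counting the distinct days. The sketch below assumes a log of `(timestamp, HTTP status)` pairs and treats only 5xx statuses as errors; both the log format and that cutoff are assumptions.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical error log: (ISO timestamp, HTTP status code).
error_log = [
    ("2024-03-04T02:00:13", 500),
    ("2024-03-04T14:22:01", 503),
    ("2024-03-05T02:01:45", 500),
    ("2024-03-06T02:00:09", 500),
    ("2024-03-06T09:15:30", 404),
]

def recurring_error_hours(entries, min_days=2):
    """Hours of the day with server errors on at least min_days distinct days."""
    days_by_hour = defaultdict(set)
    for timestamp, status in entries:
        if status >= 500:  # assumption: only 5xx responses count as errors
            moment = datetime.fromisoformat(timestamp)
            days_by_hour[moment.hour].add(moment.date())
    return sorted(hour for hour, days in days_by_hour.items() if len(days) >= min_days)

print(recurring_error_hours(error_log))
```

With this sample data, the 2 a.m. hour stands out because errors landed there on three separate days, while the single afternoon error does not.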
Collecting data on what has gone wrong can make people feel defensive. I have had more than one development manager who was interested in collecting information on problems found in production. The idea was that this data would give them a better understanding of how well the testers were working out. If there weren't many bugs in production, the testers were doing alright. If there were a lot, someone was probably getting an unpleasant talk.
Collecting error data can be dangerous in the type of environment where it is wielded like a weapon. My personal preference is to use it as a tool for inquiry and investigation. An upward trend in error reporting can trigger a study of our process and our product to figure out how to make things better next time around.
Probably the easiest way to take metrics from complicated and dangerous to helpful is to use them as a trigger to start a deeper investigation. Seeing trends in user interface load times for a page getting slower? Set up a project to look at the page and see where it can be optimized. If error reporting gets really active at a certain time every day, maybe we should be making a call to the customer to learn about how they are using the product and see how we can better accommodate that.
Everything is information. We just need to make it useful.