How to Identify Performance Bottlenecks in Your Source Code
There was a time when developers could count on faster hardware to rescue slow code. If the software was too slow today, it wouldn't be in a year. And if it was still slow then, well, it would be fine once it had spent a year or two out in the wild. New releases would probably only get picked up by power users with high-end computers anyway.
That strategy worked ... right up until it didn't.
Bigger hard drives, then new versions of Windows, and then web browsers drove computer sales. The next shift after that wasn't toward more computers, but toward smaller, slightly less powerful mobile devices. Sales of traditional desktop computers have been on a downward trend since 2007. Meanwhile, the hardware itself is running into physical limits. Adding circuits to a CPU makes it larger, which means signals have farther to travel. Electrical signals cannot travel faster than the speed of light, so larger CPUs run into speed problems, and Moore's Law seems to be slowing down as well.
Today, developers face a new set of challenges when it comes to identifying memory issues and other bottlenecks that can slow down an application. What are some of those challenges, and how can developers identify and resolve performance issues?
Let’s start by taking a look at a real example.
In the late 1990s, I worked for a website company, RealtyCheck.com, which was the DotCom arm of a brick-and-mortar title insurance company. Title insurance is an ancient business designed to ensure that the person who sells you land has the full rights to that land. In Michigan, all title insurance is roughly the same by law, and the prices are set by law. If you think that's a prescription for a bit of a boring business, you're onto something. So the owner of the 44-office company created RealtyCheck to compete on service. RealtyCheck was going to generate online leads for the traditional business and enable customer self-service, so customers could manage the process on their own time and on their own computers.
The idea sounded good to me. I had a chance to work at a DotCom and possibly a story for my grandchildren one day. I took the job.
One of my early projects was designed to put the schedule of the insurance representatives online, and to make scheduling easier internally. To do that, we needed to know when rooms were available, and when people were available. The company didn't have a version of Outlook that would do scheduling internally, and, besides, I needed to get used to interfacing with Microsoft Exchange anyway. So, as the first step, I wrote the internal scheduler as Windows software.
Writing the Windows Application
Users would select the date from a date-picker, then see the main screen populate. Each employee and office in a location would be a row, and the times of day were columns. If someone was busy, their time would not appear. With this little application the receptionist could schedule meetings without having to talk to anyone or consult the actual, physical appointment book it was replacing.
Yes, I wrote a lightweight version of Google Calendar, for Windows. There are actually calendar tools that generate scheduling apps; we bought one, which used an Access database on the back end. I just had to yank out the Access connections and change them to the Microsoft Exchange API. We did it through an Outlook control, so we knew who the user was and had user credentialing for free. (I'll leave the security aspects of that to another post.) Once the application was ready for a real customer, my next task was to go across the street, install it, and train the headquarters staff. Everything went fine, except the software was "slow."
That began my first serious work on performance tuning.
I had no real performance profiling tool in my toolbox. The best I could do was use Task Manager to monitor memory and CPU usage while the app was running. Task Manager showed plenty of memory and CPU free. Based on what I learned in college from my old professor, Mary Lou Malone, I knew the problem was probably the system waiting for results from a query to the Exchange server, what she would have called an "I/O-bound process." Then I remembered something that Rob, my old mentor, had told me on my first job (see the article "Getting Real About Memory Leaks"): "When there is a performance problem, tune the inner loop."
The source code is long gone, but I remember the heart of the query looked like this. Feel free to find the performance problem or just skim.
iOutlook = get_outlook_object()
collectionUsers = get_Users(iOutlook)
for each user
    iOutlook = get_outlook_object()
    for each business_half_hour in (day)
        iOutlook = get_outlook_object()
        iUser = get_user(user)
Can you see it?
I fell for the oldest performance mistake in the book — I re-initialized the variables on every step of the loop!
That is, I only needed to call "iOutlook =" one time, before the looping started. Likewise, I only needed to get the user object once per user. Instead, I was getting that for every appointment.
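As a rough illustration of the fix, here is a minimal Python sketch with the repeated calls hoisted out of the loops. The original code was not Python and is long gone, so everything here is an assumption: the stub functions only count how often they are called and do not talk to any real Outlook or Exchange API, and the user names and the 18 half-hour slots per day are made up.

```python
# Hypothetical stand-ins for the Outlook/Exchange calls in the pseudocode.
# They only count invocations; no real Exchange API is involved.
call_counts = {"get_outlook_object": 0, "get_user": 0}


def get_outlook_object():
    call_counts["get_outlook_object"] += 1
    return object()


def get_users(outlook):
    return ["alice", "bob", "carol"]  # made-up user names


def get_user(outlook, name):
    call_counts["get_user"] += 1
    return {"name": name}


def build_schedule(half_hours_per_day=18):
    outlook = get_outlook_object()      # once, before any looping starts
    schedule = {}
    for name in get_users(outlook):
        user = get_user(outlook, name)  # once per user, not once per slot
        schedule[name] = [(user["name"], slot)
                          for slot in range(half_hours_per_day)]
    return schedule


schedule = build_schedule()
print(call_counts)  # {'get_outlook_object': 1, 'get_user': 3}
```

With the same three users and 18 slots, the original loop structure would have fetched the Outlook object 58 times (1 + 3 + 54) and the user object 54 times.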
When I yanked the repetitive code out, performance improved enough for use, which was a good thing, because I didn't know what else to do. I had no other tools. If I really needed to, I could record the time at each step of an algorithm and either display the results when a page finished loading or write them to a log. For now, the application was fast enough for both the Grand Rapids office and the Lansing office.
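The idea of recording the time at each step and writing it to a log might look something like the following Python sketch. It is not what I had at the time; the `timed` helper, the step names, and the `time.sleep` calls standing in for real work are all hypothetical.

```python
import logging
import time
from contextlib import contextmanager

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("timing")

timings = {}  # step name -> elapsed seconds, kept for later inspection


@contextmanager
def timed(step):
    """Measure the wall-clock time of the enclosed block and log it."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        timings[step] = elapsed
        log.info("%s took %.3f s", step, elapsed)


# Wrap each suspect step; time.sleep stands in for the real work.
with timed("fetch users"):
    time.sleep(0.01)
with timed("fetch appointments"):
    time.sleep(0.02)
```

Comparing the logged numbers before and after a change shows which step actually dominates, instead of leaving it to guesswork.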
Then the Three Rivers, Michigan, satellite office had a problem. The application was slow again. I had no idea what was wrong.
It turned out that Three Rivers didn't have its own Exchange server. Instead, the office was pulling data from the Grand Rapids office over an ancient data-link cable. That cable introduced both propagation delay (as the data went back and forth) and possible bandwidth restrictions. I had no fixes; the code was about as tight as I could get it. I suppose we could have pulled the data every ten minutes or so for every person, but that would have saturated the cable with demand, and the data would still have been generally out of date. The fix for the Three Rivers office was to put in their own local server, which cost a bit in licensing fees and installation, but meant their data was served over the local network, not a remote cable.
At that point we stopped tweaking the code.
I got three lessons from this experience.
1. Look for bottlenecks outside the local system
First, I learned that bottlenecks can lie outside the local system. In college we talked about intense algorithms, like calculating the value of Pi, where the CPU was the blocker, and also about situations where reading from or writing to disk or memory was blocking performance. In this case, the bottleneck was an external system.
2. There are only so many tweaks you can make
I also learned that when a hard, external system is involved, you can only tweak it so far. Even if I could make every other part of the software instantaneous, I could only squeeze out a few percentage points of improvement. To go further, I needed to either change the way I interfaced with the bottleneck, or change the bottleneck entirely.
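This limit is commonly formalized as Amdahl's law (the name is my addition; I did not think of it in those terms at the time). The sketch below, in Python, computes the overall speedup when only a fraction of the total run time can be tuned. The function names and the 5% figure in the example are illustrative assumptions, not measurements from the project.

```python
def overall_speedup(tunable_fraction, tunable_speedup):
    """Amdahl's law: overall speedup when only part of the run time gets faster.

    tunable_fraction: share of total time spent in code you can tune (0..1).
    tunable_speedup:  how many times faster that share becomes (> 0).
    """
    if not 0.0 <= tunable_fraction <= 1.0:
        raise ValueError("tunable_fraction must be between 0 and 1")
    if tunable_speedup <= 0:
        raise ValueError("tunable_speedup must be positive")
    new_time = (1.0 - tunable_fraction) + tunable_fraction / tunable_speedup
    return 1.0 / new_time


def max_speedup(tunable_fraction):
    """Upper bound: the tunable part becomes instantaneous."""
    if not 0.0 <= tunable_fraction <= 1.0:
        raise ValueError("tunable_fraction must be between 0 and 1")
    if tunable_fraction == 1.0:
        return float("inf")
    return 1.0 / (1.0 - tunable_fraction)


# Illustrative only: suppose 5% of the wait is our own code and
# 95% is the external server and network.
print(f"{max_speedup(0.05):.3f}x at best")
```

With those assumed numbers, making our own code infinitely fast yields a speedup of roughly 1.05x, which is the "few percentage points" described above.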
3. Measurement is key
Finally, I learned that measurement was key to performance tuning. For this project, I actually used a stopwatch, changed some code, and used the stopwatch again. That was good enough, enough of the time, for desktop software nearly two decades ago. If we had to tune further, I would have needed to understand where the bottleneck was in the external system: was it the server's CPU, disk, or memory? Was it the bandwidth of the network, or the propagation delay? Was the network overloaded? Was it the switch? That's not a performance testing issue; it is performance investigation, something I was just not equipped to do at the time.
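One way to start separating propagation delay from bandwidth is a back-of-the-envelope model: each request pays one round trip, and the total payload pays for its size over the link. The Python sketch below is a deliberate simplification (it ignores protocol overhead, congestion, and server time), and every number in it is made up for illustration; I never measured the actual link.

```python
def transfer_time(round_trips, rtt_s, payload_bytes, bandwidth_bps):
    """Return (latency_part, bandwidth_part) in seconds.

    round_trips:   number of request/response exchanges
    rtt_s:         round-trip time per exchange, in seconds
    payload_bytes: total bytes transferred
    bandwidth_bps: link capacity in bits per second
    """
    latency_part = round_trips * rtt_s
    bandwidth_part = payload_bytes * 8 / bandwidth_bps
    return latency_part, bandwidth_part


# Made-up numbers: 500 small queries, 80 ms round trip,
# 200 KB of data in total, over a 128 kbit/s link.
latency, bandwidth = transfer_time(500, 0.080, 200_000, 128_000)
print(f"round trips: {latency:.1f} s, data transfer: {bandwidth:.1f} s")
```

In this hypothetical case the round trips cost about 40 seconds and the data itself about 12.5 seconds, which would suggest reducing the number of queries before buying a faster link. Real measurements would be needed to know which term dominated in any actual office.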
The dream of the 90's may be alive in Portland, but most of the time, I'm happy to live in the 21st century. Thanks!
Fundamentals of Performance Profiling
Application performance is crucial to a software company’s success. When code executes quickly and efficiently, customers see an application as responsive and reliable; they view it as a time saver. However, when code goes into unnecessary loops, calls extraneous functions, or trips over itself in some other way, customers have a very different reaction.