Prosys provides production and process monitoring and management solutions. They also offer subcontracting services and OEM licensing for developers and manufacturers of industrial devices, machines and systems. Prosys values its customer relationships, so the performance of its software is a high priority. They found that AQtime is the right tool to help with their performance and memory management issues.
Prosys Sentrol is a Component Based Rapid OPC Application Development Framework for Borland® Developer Studio (Delphi™ and C++Builder™). It enables quick and flexible development of state-of-the-art applications for data acquisition, production monitoring, device diagnostics, etc. Prosys Sentrol has been Certified for OPC Compliance by the OPC Foundation, which guarantees that it can be relied on in real-world applications. AQtime has helped to ensure that Prosys Sentrol meets the compliance requirements. It is a key tool for reliability, the #1 requirement for all 24/7 applications.
"Prosys selected AQtime to guarantee that both the performance and memory allocation of Prosys Sentrol components is at the 24/7 industry level," said Jouni Aro, Software Manager, from Prosys. The key reasons they gave for their selection were:
- Thoroughness: "AQtime provides very thorough and clear analysis that go as deep as we wish."
- Flexibility: "The analysis and results can be filtered and browsed in many different ways, enabling easy focusing on just the problem areas."
- Completeness: "AQtime includes several tools, which enable a complete analysis on any software."
- Usability: "Profiling is not an easy task, so the tool that we chose should help in every possible way to enable all the above in a quick and understandable format and AQtime does this."
- Price: "AQtime provides a great value for the money."
"AQtime simply provides all of the above in a way that is very hard to beat. An ideal situation is to never have to use this kind of tool, however, that's not reality, so AQtime is the only way to go. It actually takes a minimum amount of time to let you know if you really are on a safe ground with your piece of software."
Jouni explained the steps that they take to ensure the reliability of their Prosys Sentrol product: "The key components in Prosys Sentrol continuously move data, even in large amounts, between various data sources, industrial devices, data bases, etc. Reliability in this context means that performance must be adequate and memory must be used carefully."
"In order to guarantee the components' reliability, several measures must be taken. In Prosys, the following methodologies have been used:"
- Refactoring: To actively keep the code clean and simple.
- Unit Testing: To verify correct functionality.
- Exception Logging: To locate and fix unexpected application behavior.
- Code Instrumentation (for enhanced logging): To verify data integrity.
- Profiling (with AQtime): To locate and analyze performance bottlenecks.
- Memory leak detection (with AQtime): To verify proper use of memory allocation.
"Of these, the last issues are the most demanding, and cannot be solved without special tools. On the other hand, they can be easily omitted, but may lead to problems which will appear when the application is already in production. For any software components, however, all of the above aspects must be guaranteed, in order to make their existence in the market legitimate."
"The saying goes: “premature optimization is the mother of all evil”. In fact when you optimize, you better know that it also matters. Too often developers end up optimizing code that actually has no significance to the actual performance of their software. Unless you run real tests and find out the real bottlenecks you can spend days and weeks optimizing with no effect. If you run your analysis with AQtime, you get a clear “top of the pops” list of the methods that consume most of your application’s performance. Tackle those and just move on to more important issues," said Jouni.
"Prosys Sentrol is designed for reliability, but performance must not become an issue. So, it is also frequently streamlined to move data forward efficiently enough. Whenever bottle-necks are suspected, AQtime helps us check those quickly."
Jouni described how AQtime helps them free memory using AQtime: "Memory usage is a pit we all fall into sooner or later. With native languages, such as Delphi and C++, AQtime simply checks your application, and again, provides a clear list of memory that you never freed, and then directs you immediately to the roots of the problem in your code. You can also analyze system resource usage, such as COM memory allocation, which is a completely different world."
"In 24/7 environments like ours, memory must be used especially carefully or the software can't stay up 24/7 for many weeks. Numerous leaks in Prosys Sentrol components have been located and patched to get it to where it is today, all thanks to AQtime."
"Even when you have garbage collection, like in .NET or Java, you can end up with similar, although harder to locate, problems where you end up consuming more and more memory, never releasing the references to it. Of course, the same problem can occur with any language. The symptom is that even when your allocation profiler says you have freed everything promptly at the application closedown, the running instance will fill up the system memory. Due to the nature of most applications, it is hard to tell whether this is a feature or a defect, the only thing that helps is live profiling using AQtime," said Jouni.
"Live profiling in AQtime let’s you examine a detailed, object based view of all the applications allocations in the currently running process, and easily track how that is changing over time. Should your objects pile too much where they should not, AQtime will reveal that to you."
Case: Clients Suspect Memory Leaks
Jouni explained how one issue started when a recent update to Prosys Sentrol resulted in a defect report from Mipro, who use Prosys Sentrol extensively in their Safety Related Systems. "They were suspecting that Sentrol OPC components were leaking memory, because their applications had a common symptom of slowly running out of memory after a couple of months' use."
"The memory issue turned out to be a difficult one to find. The symptom had not been seen elsewhere at this point, but the components and applications were examined closely with AQtime."
Troubleshooting Memory Leaks
"When fixing the memory leak, the first case was to run the AQtime Allocation Profiler with one of the suspect applications. The Allocation Profiler reported no leaks that would explain this, so we decided to do a Live Allocation Profile," said Jouni.
"The next symptom was high peaks of incoming data: the Sentrol OPC components buffer unlimited incoming data. Once the peak is over, the buffer is cleared, but not freed, in order to minimize the number of temporary objects. Eventually, however, the buffers are freed so they will not show up in the final Allocation profiler reports," said Jouni
"With Live Profiling, it is possible to examine the number of objects reserved during the lifetime of an application. So, the application was put to a test where a large amount of OPC data was fed into MisoRecorder, and the Live Profiler showed an increasing number of TPsOPCDataChangeTransaction objects being created (Figure 1). But the number of these objects never grew over a few hundred, and that could not explain a continuously increasing memory allocation at a rate of 100MB/week."
"The result at this point was that the case could be explained, after all, by memory fragmentation. So the applications were next compiled using the memory manager of the FastMM project, which addresses the fragmentation problems. However, the problem prevailed."
Figure 1 - Live Allocation Profiler
"Figure 1 displays the Live Allocation Profiler results. The suspect object class,
TPsOPCTransaction, is highlighted in the Live Monitor. Even though, the Live Monitor revealed that the number of object allocations were rising under system stress, that was not the real problem. It's actually exactly what should happen, because this is an object used for buffering incoming OPC data packets. The buffer will eventually be emptied once the temporary peak in data flow has passed and the application has time to process the packets. Each packet has time stamped data in it, so the original time of occurrence is also kept, despite the delays in data processing."
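The "cleared, but not freed" buffering behavior described in this section can be sketched in standard C++. This is only an analogy under stated assumptions: PacketBuffer and its methods are hypothetical, and the actual Sentrol buffers are Delphi objects whose implementation is not shown in this case study.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical packet buffer. After a peak it is emptied, but the
// storage is kept so the next peak does not have to reallocate.
struct PacketBuffer {
    std::vector<int> packets;

    void receive(int value) { packets.push_back(value); }

    // Empty the buffer once the peak is over; capacity is retained.
    void drain() { packets.clear(); }

    // Hand the storage back by swapping with an empty vector.
    void release() { std::vector<int>().swap(packets); }

    std::size_t used() const { return packets.size(); }
    std::size_t reserved() const { return packets.capacity(); }
};
```

After a burst of receive calls followed by drain, used() is zero while reserved() still reflects the peak. A live view would therefore show memory held by an empty buffer, and a shutdown report would show nothing once release (or the destructor) has run. Such growth is expected behavior, not a leak.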
Resource Profiler Fixes the Memory Leak
"By this time, the problem had been also located in another project. It became evident that it might only be related to OPC string and array data. Using simple data types, revealed no problems, which also explained why it had been difficult to reproduce in test environments. Which, of course, only used simple data," said Jouni.
"The focus was changed from object allocation to the actual OPC data, which is actually COM data. Here, AQtime helped with the Resource Profiler, which can be used to examine both Task memory and Sys string allocation, which are used to allocate memory needed for passing structures between COM applications."
"The Resource Profiler was used in the Live-mode as well. In the live mode, you can take snapshots of the current allocations while the application is running. By comparing subsequent snapshots, you can get a report of the resource allocations' differences. See Figure 2 and Figure 3. The results were quite clear, using array or string data in the communications lead to an increased number of task memory allocations, which were never freed."
Figure 2 - Live Resource Profiler difference between two samples, indicating a possible leak in Sys string usage.
Figure 3 - Live Resource Profiler difference indicating a clear suspect in Task Memory allocations.
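The snapshot comparison behind Figures 2 and 3 comes down to subtracting one set of live allocation counts from a later one. The following is a minimal sketch of that arithmetic in C++, not AQtime's implementation; the Snapshot type, the diffSnapshots function and the key names used with it are assumptions made for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// A snapshot maps a resource type (or allocation site) to its live count.
using Snapshot = std::map<std::string, long>;

// Return (later - earlier) for every key found in either snapshot,
// omitting entries whose count did not change.
inline Snapshot diffSnapshots(const Snapshot& earlier, const Snapshot& later) {
    Snapshot delta;
    for (const auto& kv : later) {
        auto it = earlier.find(kv.first);
        long before = (it == earlier.end()) ? 0 : it->second;
        if (kv.second != before) delta[kv.first] = kv.second - before;
    }
    // Keys that disappeared entirely show up as negative differences.
    for (const auto& kv : earlier) {
        if (later.find(kv.first) == later.end() && kv.second != 0)
            delta[kv.first] = -kv.second;
    }
    return delta;
}
```

A resource whose difference stays positive across several consecutive snapshot pairs while the workload is steady is a leak suspect. Counts that rise and then fall again point to buffering instead.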
"AQtime also provides a link to the code and the complete call stack, which helps us locate the line of source code where the allocations occur. In this case, the source of the problem was this piece of code, which clears the variant data array received from the OPC server:"
- FillChar(FValues^, Count*SizeOf(TVarData),0);
"It is a result of optimization to the original (slow looking) code:"
- for i := 0 to Count-1 do
"It seems that you can just clear the memory here, instead of using the for-loop, but the developer just forgot the side-effect of VarClear. In addition to clearing the memory used for the variant records, it also frees the externally allocated memory for string and array data."
Figure 4 - Resource Profiler Results