The Delphi TList class is a convenient way to hold a list of objects and is most often used when you need an array whose size can change at run time. This useful class also holds a hidden performance problem that's easy to miss without a line-level performance profiler like AQtime.
Preamble
TList defines a default array property Items that returns a Pointer value. The TList default array property lets you refer to a property without referring to the property name. It's very handy to acces a list item with the expression List[I] instead of the more cumbersome List.Items[I]. However, it also hides from you the real cost of the property implementation behind the TList.Get method.
Symptoms
You've seen these symptoms before: Your application runs lazily, time-critical features take lots and lots of time, and the application window responds slowly when being repainted. Your customer tears his eyes from this disaster and says the only significant word: Why? You tweak the code and the queries to no avail. It's still too slow. Finally, you take the AQtime profiling toolkit in your hands and analyze the runtime behavior of your application. On the top of Performance Profiler’s results you see the culprit, the key line is: "TList.Get". Oops… what’s wrong with it?
Sample
I’ve created a simple example to illustrate this situation. The main form obtains five thousand TCustomer objects and places all them in the TCustomerList list on startup. At the same time it processes every TCustomer object via the TCustomer.Process call, passing to it all remaining objects, including self, as argument (see TForm1.ProcessCustomers).
unit Unit1;
interface
uses
Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
Dialogs;
type
// The customer class definition
TCustomer = class
private
// The data fields of this new class
FName: string;
FNumber: Integer;
public
procedure Process(ACustomer: TCustomer);
// Properties to read these data values
property Name: string read FName write FName;
property Number: Integer read FNumber write FNumber;
// Constructor
constructor Create(const AName: string; const ANumber: Integer);
end;
// The typed list of customers
TCustomerList = class(TList)
private
function GetItem(Index: Integer): TCustomer;
public
property Items[Index: Integer]: TCustomer read GetItem; default;
end;
TForm1 = class(TForm)
procedure FormCreate(Sender: TObject);
procedure FormDestroy(Sender: TObject);
private
FList: TCustomerList;
procedure AddCustomers;
procedure ProcessCustomers;
end;
var
Form1: TForm1;
implementation
{$R *.dfm}
{ TCustomer }
constructor TCustomer.Create(const AName: string; const ANumber: Integer);
begin
FName := AName;
FNumber := ANumber;
end;
procedure TCustomer.Process(ACustomer: TCustomer);
begin
// This is a sample stub
end;
{ TCustomerList }
function TCustomerList.GetItem(Index: Integer): TCustomer;
begin
Result := TCustomer(inherited Items[Index]);
end;
{ TForm1 }
procedure TForm1.AddCustomers;
var
I: Integer;
begin
FList.Capacity := 5000;
for I := 1 to 5000 do
FList.Add(TCustomer.Create('Added new item', I));
end;
procedure TForm1.FormCreate(Sender: TObject);
begin
// Create the TList object to hold a set of customer objects
FList := TCustomerList.Create;
// Create 5000 customers and add them to our object list
AddCustomers;
// Now let's do some tests...
ProcessCustomers; // using TList.Items
end;
procedure TForm1.ProcessCustomers;
var
I, J: Integer;
Customer, Customer2: TCustomer;
begin
for I := 0 to FList.Count - 1 do
begin
Customer := FList[I];
for J := 0 to FList.Count - 1 do
begin
// Get another customer object
Customer2 := FList[J];
// Do something with it
Customer.Process(Customer2);
end;
end;
end;
procedure TForm1.FormDestroy(Sender: TObject);
var
I: Integer;
begin
// Finalization takes care of memory leaks
for I := 0 to FList.Count - 1 do
TObject(FList[I]).Free;
FreeAndNil(FList);
end;
end.
Well, we’ve got the TForm1.ProcessCustomers method that iterates 5000 customer objects 5000 times. Thus, it forces indirectly a large number of TList.Items property invocations. As for the TCustomerList.GetItem method, it's a cosmetic wrapper for the TList.Items base property. The question is: is it possible to tune up the execution speed of this code? Let's ask AQtime.
Getting Ready for Performance Profiling
We'll use Delphi 2005 to compile the sample. AQtime’s Assistant suggests adjusting compiler settings at first. The sophisticated AQtime built-in debug info analyzer requires the option “TD32 debug info” to be checked in Delphi, go to the menu: Project | Options | Linker | EXE and DLL options. We should ensure that the compiler optimization is turned on just like a regular release configuration. Finally, the option “Using Debug DCUs” is not necessary since we want to have release versions of Delphi RTL and VCL units compiled into the executable.
Drag the compiled EXE onto the AQtime window and define what and when to profile (see Fig. 1). The new line level area “Customers” is used to set up for TCustomerList, TForm1 and TList classes, while the default areas, Profile Entire .NET Code and Full Check, are not necessary here. Turn off the checkbox “Stop on TForm1.FormDestroy” to exclude the finalization code in TForm1.FormDestroy, so that its extra calls to TList.Items won’t be counted in the profiling results.
Figure 1. AQtime is loaded with the sample and ready to profile. If you’re feeling new to AQtime, the side panel Assistant is a good guide through all the profiling stages.
What’s going on?
A profiling run takes more time than a standard run, but here are the results we wanted: TForm1.FormCreate and its child routines are the slowest. As expected, the Call Tree points to the path ending with TList.Get (see Fig. 2). Let’s take a look at the Delphi RTL sources and AQtime Disassembler panel (see Fig. 3) to better understand what TList.Get is really indented for.
Figure 2. The critical path points to TList.Get, TList.Items property getter.
Here is code snippet taken from Classes.pas:
function TList.Get(Index: Integer): Pointer;
begin
if (Index < 0) or (Index >= FCount) then
Error(@SListIndexError, Index);
Result := FList^[Index];
end;
TList.Get checks the Index argument and calls TList.Error that raises EListError exception if item index is out of bounds. This behavior is very useful on the whole. The error is directly reported in a meaningful way, instead of as an access violation error somewhere further in your code, and can be easily handled if necessary. But the other side doesn’t look so good.
Say we deal with the following pattern – iterating the list of objects:
for I := 0 to List.Count – 1 do
begin
Obj := List[I];
// processing Obj
end;
The For-cycle guarantees that index I is always between 0 and List.Count–1. There is no need to perform the out-of-range check on each iteration in this case (see Fig. 3). Moreover, it’s not a secret that a method call is more expensive than reading directly from memory. We’ve got the TList.Get method called on every iteration with a simple purpose: to extract a list item from the FList array.
Figure 3. AQtime displays TList.Get disassembled. Selected assembler instructions deal with the out-of-range check. In case of a valid index these four lines are being executed.
Note that the Delphi compiler inserts an exception frame for finalization of automatically managed resources, which would result in extra overhead. Fortunately, the EListError exception is thrown in a different method, TList.Error, and TList.Get is not affected by them.
Let’s see how we can avoid using TList.Get and how much effect it will have.
Solution #1
As noted, TList.Get wraps the FList private field which is defined as a dynamic array. Take a look at the TList class definition:
const
MaxListSize = Maxint div 16;
type
PPointerList = ^TPointerList;
TPointerList = array[0..MaxListSize - 1] of Pointer;
TList = class(TObject)
private
FList: PPointerList;
...
public
...
property List: PPointerList read FList;
end;
Fortunately, the TList class has the List public property that provides direct access to FList. Armed with the List property, we’ve introduced a new version of the customers processing routine, ProcessCustomersEx, that reads the list items via an improved TCustomerList.GetItemEx method:
procedure TForm1.ProcessCustomersEx;
var
I, J: Integer;
Customer, Customer2: TCustomer;
begin
for I := 0 to FList.Count - 1 do
begin
Customer := FList.GetItemEx(I);
for J := 0 to FList.Count - 1 do
begin
// Get another customer object
Customer2 := FList.GetItemEx(J);
// Do something with it
Customer.Process(Customer2);
end;
end;
end;
function TCustomerList.GetItemEx(Index: Integer): TCustomer;
begin
// In contrast to TCustomerList.GetItem, GetItemEx uses the List property
Result := TCustomer(List[Index]);
end;
Theoretically, using the List property will get us the advantage of direct memory access and removing redundant index check. And what’s the result? AQtime reports that ProcessCustomersEx runs about 25% faster than ProcessCustomers. That’s a good result; however it’s not a significant difference in performance.
Solution #2
We’ve forgotten that GetItemEx itself is a method and, thereby, produces extra overhead. The next variant is coming to solve this problem.
Finally, ProcessCustomersEx2, the latest version of the customers processing routine, is in the game. It has no method calls and no index checks, and that completely distinguishes it from its predecessors:
procedure TForm1.ProcessCustomersEx2;
var
I, J: Integer;
Customer, Customer2: TCustomer;
begin
for I := 0 to FList.Count - 1 do
begin
Customer := FList.List[I];
for J := 0 to FList.Count - 1 do
begin
// Get another customer object
Customer2 := FList.List[J];
// Do something with it
Customer.Process(Customer2);
end;
end;
end;
The results seem to be more than adequate: according to Fig. 4 we’ve got a really significant performance boost this time, it runs more than four times faster. This solution is satisfactory.
Figure 4. The latest version of the customers processing routine ProcessCustomersEx2 is about four times faster than the original.
Discussion
At this point you may ask: “What’s with the index checking, however? It’s a must-have feature in general.” The short answer is: yes, it is. And you can easily insert an assertion for that, just like the following:
Assert((I >= 0) and (I < List.Count));
Obj := TObject(List.List[I]);
// processing Obj
Since assertions are eliminated from the release configuration they do not affect performance. In general, this approach is recommended for most of the runtime checks.
Another reasonable question is: “What’s with the encapsulation? – We used typed TCustomerList.GetItem and TCustomList.GetItemEx before, now we have to deal with raw pointers returned by TList.List?” Well, that’s like walking on a thin ice. Fighting for increased performance may result in some consequent design losses. Solution #2 is a result of the specific requirements of this article: it was our goal to speed up the sample. In the real world it’s up to you to find compromise and to decide whether to stop or go further – looking for the solution #3.
Conclusion
Now you know the TList.Items property trick and are ready to eliminate this bottleneck from your applications. AQtime clearly shows that direct access to the list items via TList.List property is several times faster compared to invocation of the TList.Items property getter. However, I should warn you: performance optimization is a never ending battle, be prepared to battle more bottlenecks shifted to the top of Performance Profiler results in place of this solved one.
P.S. It’s to be noted that described trick works only if the list is accessed from one thread at a time. Consider using TThreadList instead of TList to access list items in a multi-threading environment. However, multi-threading issues are the subject of a different story.