Monday, March 23, 2009

What a bad ReallocMem can do to Midas

The TClientDataSet isn’t known for its speed. It becomes slower and slower as you add more records. But what is the reason for this performance hit? After some research (F12 is a nice key if you want to know which function either gets called very often or consumes a lot of time) I nailed it down to the memory manager that is used inside the Midas code. (BTW: Thanks, CodeGear, for compiling MidasLib.dcu with C/C++ source line information, which makes reading the assembler code a lot easier.)


If you link MidasLib.dcu instead of using the external Midas.dll, the FastMM memory manager is used, but this doesn’t give any significant performance improvement. So you would think the memory manager can’t be the bottleneck. But wait, there is a function that I call MidasRealloc. With FastMM in place you would expect it to call System.ReallocMemory, but that isn’t the case. The function looks like this (translated from assembler to Delphi):



function MidasRealloc(P: Pointer; OldSize, NewSize: LongWord): Pointer; stdcall;
begin
  Result := AllocMem(NewSize); // => GetMem+FillChar
  if (Result <> nil) and (P <> nil) then
  begin
    Move(P^, Result^, OldSize);
    FreeMem(P);
  end;
end;


As you can see, the function never calls ReallocMemory. Instead it does it the hard way on every call: allocate new memory, zero the new memory, copy the previous content and release the old memory block. At first glance the AllocMem could be replaced by GetMem with a FillChar that only zero-fills the “Gap := NewSize - OldSize” block and not the whole memory block. But at a second look you realize that this is nothing else than a really bad ReallocMemory implementation. FastMM’s ReallocMemory is very fast because it doesn’t need to allocate a new block and copy memory all the time. Instead it has some intelligence in it to reduce the moving of memory blocks.
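To make the difference concrete, here is a minimal sketch of what a replacement could look like. It is not the code from my unit: the name FastMidasRealloc is made up for illustration, and it assumes that callers pass OldSize = 0 together with a nil pointer and rely only on the new tail being zeroed.

```delphi
program ReallocSketch;
{$APPTYPE CONSOLE}

// Hypothetical replacement for the MidasRealloc shown above.
// Assumption: OldSize is 0 whenever P is nil, and callers only
// depend on the bytes beyond OldSize being zero.
function FastMidasRealloc(P: Pointer; OldSize, NewSize: LongWord): Pointer; stdcall;
begin
  if P = nil then
    OldSize := 0;
  // Let the installed memory manager (FastMM) decide whether the block moves.
  Result := ReallocMemory(P, NewSize);
  // Keep the zero-fill behavior, but only for the newly added gap.
  if (Result <> nil) and (NewSize > OldSize) then
    FillChar(PAnsiChar(Result)[OldSize], NewSize - OldSize, 0);
end;

var
  Buf: PAnsiChar;
begin
  Buf := FastMidasRealloc(nil, 0, 4);   // acts like a zeroed allocation
  FillChar(Buf^, 4, $FF);               // mark the "old" content
  Buf := FastMidasRealloc(Buf, 4, 8);   // grow: old bytes kept, gap zeroed
  if (Buf[3] = #$FF) and (Buf[4] = #0) and (Buf[7] = #0) then
    WriteLn('content kept, gap zeroed');
  FreeMemory(Buf);
end.
```

The point of the sketch is that the copy and the full zero-fill disappear from the common path. Whether the block has to move at all is left to the memory manager.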


Let’s see what can be done about this bad implementation. Replacing MidasRealloc alone wouldn’t work because the Midas memory manager also has a malloc(), a calloc() and a free() function. But all in all the replacements that call into FastMM aren’t that hard to write, and MidasRealloc in particular becomes less code. The main problem is to find these functions in the process space in order to patch them. But I already have the knowledge to find them. And I already have a unit that makes Midas a lot faster.
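As a rough illustration of the other three replacements, the following sketch forwards malloc(), calloc() and free() to the Delphi memory manager. The function names, the parameter lists and the stdcall convention are assumptions for this example; the real Midas functions may differ, and the patching that would install them is not shown here.

```delphi
program MallocFamilySketch;
{$APPTYPE CONSOLE}

// Hypothetical stand-ins for the Midas malloc(), calloc() and free().
// Names, signatures and calling convention are assumptions.

function FastMidasMalloc(Size: LongWord): Pointer; stdcall;
begin
  Result := GetMemory(Size);
end;

function FastMidasCalloc(Count, Size: LongWord): Pointer; stdcall;
var
  Total: LongWord;
begin
  Result := nil;
  // Refuse requests where Count * Size would overflow.
  if (Size <> 0) and (Count > High(LongWord) div Size) then
    Exit;
  Total := Count * Size;
  Result := GetMemory(Total);
  if Result <> nil then
    FillChar(Result^, Total, 0);   // calloc() returns zeroed memory
end;

procedure FastMidasFree(P: Pointer); stdcall;
begin
  if P <> nil then
    FreeMemory(P);
end;

var
  Items: PAnsiChar;
begin
  Items := FastMidasCalloc(4, 16);
  if (Items <> nil) and (Items[0] = #0) and (Items[63] = #0) then
    WriteLn('calloc block is zeroed');
  FastMidasFree(Items);
end.
```

All four functions have to be replaced together, because a block allocated by the original malloc() must never be handed to a FastMM-based free() or the other way around.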


Test application from QC 7102

[Rename Vessels200.xml to Vessels.xml in the source code and put a "for 1 to 10 do" loop before the ClientDataSet1.AppendData call.]


Delphi 2009’s MidasLib unit:

1. Call: 858ms

2. Call: 1966ms

3. Call: 3182ms

4. Call: 4414ms

5. Call: 5678ms


Delphi 2009’s MidasLib unit with my MidasSpeedFix.pas unit:

1. Call: 406ms

2. Call: 265ms

3. Call: 374ms

4. Call: 312ms

5. Call: 281ms
