Subject | Tested UDF string performance (Was: UDF:s and strings i Delphi) |
---|---|
Author | Kjell Rilbe |
Post date | 2005-02-11T10:20:11Z |
Pavel Menshchikov wrote:
posting my results.
For any newcomers: this concerns a comparison between the following two
methods of returning a string from an UDF:
1. function MyUDF: PChar; cdecl;
2. function MyUDF(Result: PChar); cdecl;
I used these UDF:s
------------------------------------------------------------
{ Type 1 - in-UDF memory allocation }
function Substring1(Source: PChar; FromPos: PInteger;
ForCount: PInteger): PChar; cdecl;
begin
if Assigned(Source) and Assigned(FromPos) and Assigned(ForCount)
then begin
Result:=ib_util_malloc(StrLen(Source));
if (FromPos^<1) or (FromPos^>Integer(StrLen(Source)))
then Result[0]:=#0
else StrLCopy(Result,Source+FromPos^-1,ForCount^);
end
else Result:=nil;
end;
------------------------------------------------------------
{ Type 2 - FB memory allocation }
procedure Substring2(Source: PChar; FromPos: PInteger;
ForCount: PInteger; Dest: PChar); cdecl;
begin
if Assigned(Source) and Assigned(FromPos) and Assigned(ForCount)
then begin
if Assigned(Dest) then begin
if (FromPos^<1) or (FromPos^>Integer(StrLen(Source)))
then Dest[0]:=#0
else StrLCopy(Dest,Source+FromPos^-1,ForCount^);
end;
end
else begin
if Assigned(Dest)
then Dest[0]:=#0;
end;
end;
------------------------------------------------------------
These were declared in my FB database like this:
------------------------------------------------------------
-- Type 1 - in-UDF memory allocation
declare external function AMSUBSTRING1
CString(255),
Integer,
Integer
returns
CString(255) free_it
entry_point 'Substring1'
module_name 'AMUDF.dll';
------------------------------------------------------------
-- Type 2 - FB memory allocation
declare external function AMSUBSTRING2
CString(255),
Integer,
Integer,
CString(255)
returns
parameter 4
entry_point 'Substring2'
module_name 'AMUDF.dll';
------------------------------------------------------------
I then ran some tests on a table "Foretag" with 1586325 records (1.6
million). This table has a column "Telnr" with phone numbers in Swedish
format: <areacode>-<number>. I wanted to count records with a Stockholm
number, which start with '08-':
------------------------------------------------------------
select count(*)
from "Foretag"
where AMSUBSTRING2("Telnr", 1, 3) = '08-';
------------------------------------------------------------
But to make sure the impact of the substring call was significant, I
multiplied the condition ten times in six variants:
------------------------------------------------------------
select count(*) "Antal i 08-området"
from "Foretag"
where AMSUBSTRING2("Telnr", 1, 1) = '0'
and AMSUBSTRING2("Telnr", 1, 2) = '08'
and AMSUBSTRING2("Telnr", 1, 3) = '08-'
and AMSUBSTRING2("Telnr", 2, 1) = '8'
and AMSUBSTRING2("Telnr", 2, 2) = '8-'
and AMSUBSTRING2("Telnr", 3, 1) = '-'
and AMSUBSTRING2("Telnr", 1, 1) = '0'
and AMSUBSTRING2("Telnr", 1, 2) = '08'
and AMSUBSTRING2("Telnr", 1, 3) = '08-'
and AMSUBSTRING2("Telnr", 2, 1) = '8'
and AMSUBSTRING2("Telnr", 2, 2) = '8-'
and AMSUBSTRING2("Telnr", 3, 1) = '-'
and AMSUBSTRING2("Telnr", 1, 1) = '0'
and AMSUBSTRING2("Telnr", 1, 2) = '08'
and AMSUBSTRING2("Telnr", 1, 3) = '08-'
and AMSUBSTRING2("Telnr", 2, 1) = '8'
and AMSUBSTRING2("Telnr", 2, 2) = '8-'
and AMSUBSTRING2("Telnr", 3, 1) = '-'
and AMSUBSTRING2("Telnr", 1, 1) = '0'
and AMSUBSTRING2("Telnr", 1, 2) = '08'
and AMSUBSTRING2("Telnr", 1, 3) = '08-'
and AMSUBSTRING2("Telnr", 2, 1) = '8'
and AMSUBSTRING2("Telnr", 2, 2) = '8-'
and AMSUBSTRING2("Telnr", 3, 1) = '-'
and AMSUBSTRING2("Telnr", 1, 1) = '0'
and AMSUBSTRING2("Telnr", 1, 2) = '08'
and AMSUBSTRING2("Telnr", 1, 3) = '08-'
and AMSUBSTRING2("Telnr", 2, 1) = '8'
and AMSUBSTRING2("Telnr", 2, 2) = '8-'
and AMSUBSTRING2("Telnr", 3, 1) = '-'
and AMSUBSTRING2("Telnr", 1, 1) = '0'
and AMSUBSTRING2("Telnr", 1, 2) = '08'
and AMSUBSTRING2("Telnr", 1, 3) = '08-'
and AMSUBSTRING2("Telnr", 2, 1) = '8'
and AMSUBSTRING2("Telnr", 2, 2) = '8-'
and AMSUBSTRING2("Telnr", 3, 1) = '-'
and AMSUBSTRING2("Telnr", 1, 1) = '0'
and AMSUBSTRING2("Telnr", 1, 2) = '08'
and AMSUBSTRING2("Telnr", 1, 3) = '08-'
and AMSUBSTRING2("Telnr", 2, 1) = '8'
and AMSUBSTRING2("Telnr", 2, 2) = '8-'
and AMSUBSTRING2("Telnr", 3, 1) = '-'
and AMSUBSTRING2("Telnr", 1, 1) = '0'
and AMSUBSTRING2("Telnr", 1, 2) = '08'
and AMSUBSTRING2("Telnr", 1, 3) = '08-'
and AMSUBSTRING2("Telnr", 2, 1) = '8'
and AMSUBSTRING2("Telnr", 2, 2) = '8-'
and AMSUBSTRING2("Telnr", 3, 1) = '-'
and AMSUBSTRING2("Telnr", 1, 1) = '0'
and AMSUBSTRING2("Telnr", 1, 2) = '08'
and AMSUBSTRING2("Telnr", 1, 3) = '08-'
and AMSUBSTRING2("Telnr", 2, 1) = '8'
and AMSUBSTRING2("Telnr", 2, 2) = '8-'
and AMSUBSTRING2("Telnr", 3, 1) = '-'
and AMSUBSTRING2("Telnr", 1, 1) = '0'
and AMSUBSTRING2("Telnr", 1, 2) = '08'
and AMSUBSTRING2("Telnr", 1, 3) = '08-'
and AMSUBSTRING2("Telnr", 2, 1) = '8'
and AMSUBSTRING2("Telnr", 2, 2) = '8-'
and AMSUBSTRING2("Telnr", 3, 1) = '-'
------------------------------------------------------------
The returned count is 185187 (about 12% of the records). The table has a
primary key on a different column and no other indices.
I tried this query multiple times (4 * 2 times to be exact), alternating
between SUBSTRING1 and SUBSTRING2, within the same transaction. Minimal
disk usage so I assume all relevant data had already been cached.
The results:
SUBSTRING1: 43-46 seconds (43, 43, 45, 46)
SUBSTRING2: 34-35 seconds (34, 34, 35, 34)
So it seems that Pavel was right - FB memory allocation is faster.
Kjell
--
--------------------------------------
Kjell Rilbe
Adressmarknaden AM AB
E-post: kjell.rilbe@...
Telefon: 08-761 06 55
Mobil: 0733-44 24 64
> As I remember in multi-rowOK, I tested it. I think the results are of general interest so I'm
> SELECT <some UDF>,... FROM...
> FB allocates the memory just once and reuses it for every UDF calling
> - it may give you some preformance improvements as compared to in-UDF
> memory allocation and FREE_IT. But I am not sure, so try to test it
> yourself again :)
posting my results.
For any newcomers: this concerns a comparison between the following two
methods of returning a string from an UDF:
1. function MyUDF: PChar; cdecl;
2. function MyUDF(Result: PChar); cdecl;
I used these UDF:s
------------------------------------------------------------
{ Type 1 - in-UDF memory allocation }
function Substring1(Source: PChar; FromPos: PInteger;
ForCount: PInteger): PChar; cdecl;
begin
if Assigned(Source) and Assigned(FromPos) and Assigned(ForCount)
then begin
Result:=ib_util_malloc(StrLen(Source));
if (FromPos^<1) or (FromPos^>Integer(StrLen(Source)))
then Result[0]:=#0
else StrLCopy(Result,Source+FromPos^-1,ForCount^);
end
else Result:=nil;
end;
------------------------------------------------------------
{ Type 2 - FB memory allocation }
procedure Substring2(Source: PChar; FromPos: PInteger;
ForCount: PInteger; Dest: PChar); cdecl;
begin
if Assigned(Source) and Assigned(FromPos) and Assigned(ForCount)
then begin
if Assigned(Dest) then begin
if (FromPos^<1) or (FromPos^>Integer(StrLen(Source)))
then Dest[0]:=#0
else StrLCopy(Dest,Source+FromPos^-1,ForCount^);
end;
end
else begin
if Assigned(Dest)
then Dest[0]:=#0;
end;
end;
------------------------------------------------------------
These were declared in my FB database like this:
------------------------------------------------------------
-- Type 1 - in-UDF memory allocation
declare external function AMSUBSTRING1
CString(255),
Integer,
Integer
returns
CString(255) free_it
entry_point 'Substring1'
module_name 'AMUDF.dll';
------------------------------------------------------------
-- Type 2 - FB memory allocation
declare external function AMSUBSTRING2
CString(255),
Integer,
Integer,
CString(255)
returns
parameter 4
entry_point 'Substring2'
module_name 'AMUDF.dll';
------------------------------------------------------------
I then ran some tests on a table "Foretag" with 1586325 records (1.6
million). This table has a column "Telnr" with phone numbers in Swedish
format: <areacode>-<number>. I wanted to count records with a Stockholm
number, which start with '08-':
------------------------------------------------------------
select count(*)
from "Foretag"
where AMSUBSTRING2("Telnr", 1, 3) = '08-';
------------------------------------------------------------
But to make sure the impact of the substring call was significant, I
multiplied the condition ten times in six variants:
------------------------------------------------------------
select count(*) "Antal i 08-området"
from "Foretag"
where AMSUBSTRING2("Telnr", 1, 1) = '0'
and AMSUBSTRING2("Telnr", 1, 2) = '08'
and AMSUBSTRING2("Telnr", 1, 3) = '08-'
and AMSUBSTRING2("Telnr", 2, 1) = '8'
and AMSUBSTRING2("Telnr", 2, 2) = '8-'
and AMSUBSTRING2("Telnr", 3, 1) = '-'
and AMSUBSTRING2("Telnr", 1, 1) = '0'
and AMSUBSTRING2("Telnr", 1, 2) = '08'
and AMSUBSTRING2("Telnr", 1, 3) = '08-'
and AMSUBSTRING2("Telnr", 2, 1) = '8'
and AMSUBSTRING2("Telnr", 2, 2) = '8-'
and AMSUBSTRING2("Telnr", 3, 1) = '-'
and AMSUBSTRING2("Telnr", 1, 1) = '0'
and AMSUBSTRING2("Telnr", 1, 2) = '08'
and AMSUBSTRING2("Telnr", 1, 3) = '08-'
and AMSUBSTRING2("Telnr", 2, 1) = '8'
and AMSUBSTRING2("Telnr", 2, 2) = '8-'
and AMSUBSTRING2("Telnr", 3, 1) = '-'
and AMSUBSTRING2("Telnr", 1, 1) = '0'
and AMSUBSTRING2("Telnr", 1, 2) = '08'
and AMSUBSTRING2("Telnr", 1, 3) = '08-'
and AMSUBSTRING2("Telnr", 2, 1) = '8'
and AMSUBSTRING2("Telnr", 2, 2) = '8-'
and AMSUBSTRING2("Telnr", 3, 1) = '-'
and AMSUBSTRING2("Telnr", 1, 1) = '0'
and AMSUBSTRING2("Telnr", 1, 2) = '08'
and AMSUBSTRING2("Telnr", 1, 3) = '08-'
and AMSUBSTRING2("Telnr", 2, 1) = '8'
and AMSUBSTRING2("Telnr", 2, 2) = '8-'
and AMSUBSTRING2("Telnr", 3, 1) = '-'
and AMSUBSTRING2("Telnr", 1, 1) = '0'
and AMSUBSTRING2("Telnr", 1, 2) = '08'
and AMSUBSTRING2("Telnr", 1, 3) = '08-'
and AMSUBSTRING2("Telnr", 2, 1) = '8'
and AMSUBSTRING2("Telnr", 2, 2) = '8-'
and AMSUBSTRING2("Telnr", 3, 1) = '-'
and AMSUBSTRING2("Telnr", 1, 1) = '0'
and AMSUBSTRING2("Telnr", 1, 2) = '08'
and AMSUBSTRING2("Telnr", 1, 3) = '08-'
and AMSUBSTRING2("Telnr", 2, 1) = '8'
and AMSUBSTRING2("Telnr", 2, 2) = '8-'
and AMSUBSTRING2("Telnr", 3, 1) = '-'
and AMSUBSTRING2("Telnr", 1, 1) = '0'
and AMSUBSTRING2("Telnr", 1, 2) = '08'
and AMSUBSTRING2("Telnr", 1, 3) = '08-'
and AMSUBSTRING2("Telnr", 2, 1) = '8'
and AMSUBSTRING2("Telnr", 2, 2) = '8-'
and AMSUBSTRING2("Telnr", 3, 1) = '-'
and AMSUBSTRING2("Telnr", 1, 1) = '0'
and AMSUBSTRING2("Telnr", 1, 2) = '08'
and AMSUBSTRING2("Telnr", 1, 3) = '08-'
and AMSUBSTRING2("Telnr", 2, 1) = '8'
and AMSUBSTRING2("Telnr", 2, 2) = '8-'
and AMSUBSTRING2("Telnr", 3, 1) = '-'
and AMSUBSTRING2("Telnr", 1, 1) = '0'
and AMSUBSTRING2("Telnr", 1, 2) = '08'
and AMSUBSTRING2("Telnr", 1, 3) = '08-'
and AMSUBSTRING2("Telnr", 2, 1) = '8'
and AMSUBSTRING2("Telnr", 2, 2) = '8-'
and AMSUBSTRING2("Telnr", 3, 1) = '-'
------------------------------------------------------------
The returned count is 185187 (about 12% of the records). The table has a
primary key on a different column and no other indices.
I tried this query multiple times (4 * 2 times to be exact), alternating
between SUBSTRING1 and SUBSTRING2, within the same transaction. Minimal
disk usage so I assume all relevant data had already been cached.
The results:
SUBSTRING1: 43-46 seconds (43, 43, 45, 46)
SUBSTRING2: 34-35 seconds (34, 34, 35, 34)
So it seems that Pavel was right - FB memory allocation is faster.
Kjell
--
--------------------------------------
Kjell Rilbe
Adressmarknaden AM AB
E-post: kjell.rilbe@...
Telefon: 08-761 06 55
Mobil: 0733-44 24 64