Subject | Re: [ib-support] query optimizing |
---|---|
Author | Svein Erling Tysvær |
Post date | 2000-12-05T11:14:42Z |
Hi Bram.
An index can't help you at all when using CONTAINING, so I think 10-15
seconds is fairly OK on a table your size. if you used STARTING rather than
CONTAINING the index could be used, but then you would of course only get
those streets starting with Bram.
An alternative we've used here at the Norwegian Cancer Registry, is to
create some additional indexed fields (for names, not streets) containing
parts of the name. Let's say you added four fields street_part1,
street_part2, street_part3, street_part4 all defined as CHAR(5) and parsed
each street so that every part of the name got into these fields. In your
case, that would mean:
Street Street_part1 Street_part2 Street_part3 Street_part4
Bram van der Voet Bram van der Voet
(or maybe you would only store Bram and Voet, since I guess van der is
pretty common in the Netherlands, and a search for those words are next to
futile).
You could then modify your SQL to something like
SELECT STREET FROM POSTCODE
WHERE STREET CONTAINING 'BRAM'
AND (STREET_PART1 STARTING 'BRAM' OR
STREET_PART2 STARTING 'BRAM' OR
STREET_PART3 STARTING 'BRAM' OR
STREET_PART4 STARTING 'BRAM')
which would get all streets where Bram was the beginning of either of the
indexed fields - not quite containing, but you're normally interested in
searching for the beginning of a word in the street name, not something in
the middle.
The end result should be similar to your CONTAINING (all street but those
that doesn't start bram in any of the first four words of the street), but
it should be able to use indices which may speed things up a bit.
Set
At 11:46 05.12.2000 +0100, you wrote:
about 450.000 records with all street names in the Netherlands.
CREATE TABLE "POSTCODE"
(
"ID" INT64 NOT NULL,
"STREET" CHAR( 50) CHARACTER SET NONE,
"CITY" CHAR( 50) CHARACTER SET NONE,
"POSTCODE" CHAR( 7) CHARACTER SET NONE,
CONSTRAINT "PK_POSTCODE" PRIMARY KEY ("ID")
);
CREATE ASC INDEX "POSTCODE" ON "POSTCODE" ("POSTCODE");
CREATE ASC INDEX "STREET" ON "POSTCODE" ("STREET");
I created the following query :
SELECT STREET FROM POSTCODE WHERE STREET CONTAINING 'BRAM'
There are actually 130 streets in Holland containing my first name. But it
takes about 10/15 seconds before its returning the 130 rows with an
IB_Query set on active. Is there a way to check if the query is 100%
optimized.
Could it go faster ?
Bram van der Voet / A&V Automatisering
<mailto:bram@...>bram@...
Glasbergenlaan 6
2235 BP VALKENBURG ZH
tel 071 407 6956
fax 071 407 3939
eGroups Sponsor
<http://rd.yahoo.com/M=102308.1038796.2731130.908943/D=egroupmail/S=17000000
01:N/A=466330/?http://www.yahoo.com>Click Here!
To unsubscribe from this group, send an email to:
ib-support-unsubscribe@egroups.com
<<<<
An index can't help you at all when using CONTAINING, so I think 10-15
seconds is fairly OK on a table your size. if you used STARTING rather than
CONTAINING the index could be used, but then you would of course only get
those streets starting with Bram.
An alternative we've used here at the Norwegian Cancer Registry, is to
create some additional indexed fields (for names, not streets) containing
parts of the name. Let's say you added four fields street_part1,
street_part2, street_part3, street_part4 all defined as CHAR(5) and parsed
each street so that every part of the name got into these fields. In your
case, that would mean:
Street Street_part1 Street_part2 Street_part3 Street_part4
Bram van der Voet Bram van der Voet
(or maybe you would only store Bram and Voet, since I guess van der is
pretty common in the Netherlands, and a search for those words are next to
futile).
You could then modify your SQL to something like
SELECT STREET FROM POSTCODE
WHERE STREET CONTAINING 'BRAM'
AND (STREET_PART1 STARTING 'BRAM' OR
STREET_PART2 STARTING 'BRAM' OR
STREET_PART3 STARTING 'BRAM' OR
STREET_PART4 STARTING 'BRAM')
which would get all streets where Bram was the beginning of either of the
indexed fields - not quite containing, but you're normally interested in
searching for the beginning of a word in the street name, not something in
the middle.
The end result should be similar to your CONTAINING (all street but those
that doesn't start bram in any of the first four words of the street), but
it should be able to use indices which may speed things up a bit.
Set
At 11:46 05.12.2000 +0100, you wrote:
>>>>With the help from the ibobjects lists I created a database/table with
about 450.000 records with all street names in the Netherlands.
CREATE TABLE "POSTCODE"
(
"ID" INT64 NOT NULL,
"STREET" CHAR( 50) CHARACTER SET NONE,
"CITY" CHAR( 50) CHARACTER SET NONE,
"POSTCODE" CHAR( 7) CHARACTER SET NONE,
CONSTRAINT "PK_POSTCODE" PRIMARY KEY ("ID")
);
CREATE ASC INDEX "POSTCODE" ON "POSTCODE" ("POSTCODE");
CREATE ASC INDEX "STREET" ON "POSTCODE" ("STREET");
I created the following query :
SELECT STREET FROM POSTCODE WHERE STREET CONTAINING 'BRAM'
There are actually 130 streets in Holland containing my first name. But it
takes about 10/15 seconds before its returning the 130 rows with an
IB_Query set on active. Is there a way to check if the query is 100%
optimized.
Could it go faster ?
Bram van der Voet / A&V Automatisering
<mailto:bram@...>bram@...
Glasbergenlaan 6
2235 BP VALKENBURG ZH
tel 071 407 6956
fax 071 407 3939
eGroups Sponsor
<http://rd.yahoo.com/M=102308.1038796.2731130.908943/D=egroupmail/S=17000000
01:N/A=466330/?http://www.yahoo.com>Click Here!
To unsubscribe from this group, send an email to:
ib-support-unsubscribe@egroups.com
<<<<