Subject | Re: [firebird-support] Re: Performance of the count Function. |
---|---|
Author | Dennis Mulder |
Post date | 2005-03-01T11:10:55Z |
I was thinking. I have a table of 150,000 companies. I want to look up a
company by combining all (relevant) fields of the company, parse it and
put the words into a search table. Like
COMPANY_SEARCH_TABLE1 (ID_COMP,
SEARCHWORD,
PRIMARY KEY (SEARCHWORD))
This way I got this 4 million+ table. The selectivity was very bad,
especially with words that occur frequently (duh).
To get a greater selectivity (and speed), how about something like
'overlapping indices'? (I don't know how else to call it)
COMPANY_SEARCH_TABLE2 (ID_COMP,
SEARCHWORD1,
SEARCHWORD2,
PRIMARY KEY (SEARCHWORD1, SEARCHWORD2))
I you want to look for 'n' words, S1, S2 ... Sn, then you could test for
if there's an ID_COMP in (S1 and S2)
that also satisfies for (S2 and S3)
|
(Sn-2
and Sn-1)
up until
(Sn-1 and Sn)
Of course you have to take care of the parsing routine. S1-S2 = S2-S1
and you don't want a double return for the same ID_COMP.
For (even) greater speed you could also use a three way primary key.
Please tell me what you think.
Dennis
company by combining all (relevant) fields of the company, parse it and
put the words into a search table. Like
COMPANY_SEARCH_TABLE1 (ID_COMP,
SEARCHWORD,
PRIMARY KEY (SEARCHWORD))
This way I got this 4 million+ table. The selectivity was very bad,
especially with words that occur frequently (duh).
To get a greater selectivity (and speed), how about something like
'overlapping indices'? (I don't know how else to call it)
COMPANY_SEARCH_TABLE2 (ID_COMP,
SEARCHWORD1,
SEARCHWORD2,
PRIMARY KEY (SEARCHWORD1, SEARCHWORD2))
I you want to look for 'n' words, S1, S2 ... Sn, then you could test for
if there's an ID_COMP in (S1 and S2)
that also satisfies for (S2 and S3)
|
(Sn-2
and Sn-1)
up until
(Sn-1 and Sn)
Of course you have to take care of the parsing routine. S1-S2 = S2-S1
and you don't want a double return for the same ID_COMP.
For (even) greater speed you could also use a three way primary key.
Please tell me what you think.
Dennis