Subject | There must be a better way. |
---|---|
Author | Nigel Weeks |
Post date | 2005-09-02T00:38:45Z |
I'm putting a search facility on our intranet, and I've hit a wall.
My current idea is:
A table of words, with an integer key
Create table tbl_words (
int_word integer not null primary key,
str_word varchar(100) not null unique
);
A table of word id's, linked to an address
Create table tbl_pageword (
int_page integer not null,
int_word integer not null,
str_href varchar(200) not null,
primary key(int_page, int_word, str_href)
);
The problem is, doing 'and' searches.
For example: find the pages that contain ALL THREE TERMS 'beer',
'chocolate', '4WD'
Subselects are slower than I'd like, as a new word lookup has to be done for
each word. For example:
Select str_href from tbl_pageword where
int_word in (select int_word from tbl_words where str_word = 'beer')
AND int_word in (select int_word from tbl_words where str_word =
'chocolate')
AND int_word in (select int_word from tbl_words where str_word = '4WD');
There has to be a better way. Any ideas? Even a complete schema change is
welcomed.
Nige.
Nigel Weeks
Tech Support & Systems Developer
nweeks@...
www.examiner.com.au
71 - 75 Paterson Street
Launceston
Tasmania 7250
Australia
Phone: 03 6336 7234
[Non-text portions of this message have been removed]
My current idea is:
A table of words, with an integer key
Create table tbl_words (
int_word integer not null primary key,
str_word varchar(100) not null unique
);
A table of word id's, linked to an address
Create table tbl_pageword (
int_page integer not null,
int_word integer not null,
str_href varchar(200) not null,
primary key(int_page, int_word, str_href)
);
The problem is, doing 'and' searches.
For example: find the pages that contain ALL THREE TERMS 'beer',
'chocolate', '4WD'
Subselects are slower than I'd like, as a new word lookup has to be done for
each word. For example:
Select str_href from tbl_pageword where
int_word in (select int_word from tbl_words where str_word = 'beer')
AND int_word in (select int_word from tbl_words where str_word =
'chocolate')
AND int_word in (select int_word from tbl_words where str_word = '4WD');
There has to be a better way. Any ideas? Even a complete schema change is
welcomed.
Nige.
Nigel Weeks
Tech Support & Systems Developer
nweeks@...
www.examiner.com.au
71 - 75 Paterson Street
Launceston
Tasmania 7250
Australia
Phone: 03 6336 7234
[Non-text portions of this message have been removed]