Subject | Re: There must be a better way. |
---|---|
Author | Svein Erling Tysvær |
Post date | 2005-09-02T11:45:37Z |
Well, Nigel, I'm pretty certain getting the same result as your select
can be done very quickly:
select * from rdb$database
where 1=0
There's no way one record in tbl_pageword could have more than one
int_word, and since int_word is the primary key of tbl_words, it
couldn't possibly match more than at most one of your criteria.
My guess is that you want three tables:
A table of words, with an integer key
Create table tbl_words (
int_word integer not null primary key,
str_word varchar(100) not null unique
);
A table of pages, with str_href and an integer key
Create table tbl_pages (
int_page integer not null primary key,
str_href varchar(200) not null unique
);
A table of word and page id's linked together
Create table tbl_pageword (
int_pageword integer not null primary key,
int_page integer not null, //Create an index on this field separately
int_word integer not null //Create an index on this field separately
);
Then your sql becomes:
Select tp.str_href from tbl_pages tp
join tbl_pageword tpw on tpw.int_page = tp.int_page
join tbl_word tw1 on tw1.int_word = tpw.int_word
join tbl_word tw2 on tw2.int_word = tpw.int_word
join tbl_word tw3 on tw3.int_word = tpw.int_word
where
tw1.str_word = 'SevenUp' and
tw2.str_word = 'Bamsemums' and
tw3.str_word = 'Firebird'
Report back your results after doing this modification, I expect your
result to be much better and hopefully execute pretty quickly.
Set
can be done very quickly:
select * from rdb$database
where 1=0
There's no way one record in tbl_pageword could have more than one
int_word, and since int_word is the primary key of tbl_words, it
couldn't possibly match more than at most one of your criteria.
My guess is that you want three tables:
A table of words, with an integer key
Create table tbl_words (
int_word integer not null primary key,
str_word varchar(100) not null unique
);
A table of pages, with str_href and an integer key
Create table tbl_pages (
int_page integer not null primary key,
str_href varchar(200) not null unique
);
A table of word and page id's linked together
Create table tbl_pageword (
int_pageword integer not null primary key,
int_page integer not null, //Create an index on this field separately
int_word integer not null //Create an index on this field separately
);
Then your sql becomes:
Select tp.str_href from tbl_pages tp
join tbl_pageword tpw on tpw.int_page = tp.int_page
join tbl_word tw1 on tw1.int_word = tpw.int_word
join tbl_word tw2 on tw2.int_word = tpw.int_word
join tbl_word tw3 on tw3.int_word = tpw.int_word
where
tw1.str_word = 'SevenUp' and
tw2.str_word = 'Bamsemums' and
tw3.str_word = 'Firebird'
Report back your results after doing this modification, I expect your
result to be much better and hopefully execute pretty quickly.
Set
--- In firebird-support@yahoogroups.com, "Nigel Weeks" wrote:
> I'm putting a search facility on our intranet, and I've hit a wall.
> My current idea is:
>
> A table of words, with an integer key
> Create table tbl_words (
> int_word integer not null primary key,
> str_word varchar(100) not null unique
> );
>
> A table of word id's, linked to an address
> Create table tbl_pageword (
> int_page integer not null,
> int_word integer not null,
> str_href varchar(200) not null,
> primary key(int_page, int_word, str_href)
> );
>
> The problem is, doing 'and' searches.
>
> For example: find the pages that contain ALL THREE TERMS 'beer',
> 'chocolate', '4WD'
>
> Subselects are slower than I'd like, as a new word lookup has to be
> done for each word. For example:
>
> Select str_href from tbl_pageword where
> int_word in (select int_word from tbl_words where str_word = 'beer')
> AND int_word in (select int_word from tbl_words where str_word =
> 'chocolate')
> AND int_word in (select int_word from tbl_words where str_word =
> '4WD');
>
> There has to be a better way. Any ideas? Even a complete schema
> change is welcomed.
>
> Nige.