Subject | Re: [IBO] full text search question |
---|---|
Author | Helen Borrie |
Post date | 2003-07-12T02:11:10Z |
At 05:49 PM 11/07/2003 +0000, you wrote:
database. If there is no server-native "textsearch( )" function defined in
the language (and there isn't in Firebird/IB) then the only way you would
achieve that would be to write a UDF that invoked it. AFAIK, nobody has
ever done that because there would be major (probably insurmountable)
problems using UDFs to access tables.
filesystem table structures) with a DBMS which controls its own physical
storage. All SQL can offer at the raw level is its set of string search
operators (=, <, >, LIKE, STARTING WITH, IN ( ) [usually horrible] and the
various NOT options).
What's needed for building a SQL search engine is a way to maintain search
indexes which are appropriate to the style of searching required. IBO's
FTS components encapsulate mechanisms for defining and maintaining such
indexes. These indexes (like any other indexes) become available to the
optimizer whenever it is constructing an output set.
to find the primary keys of rows that satisfy a search criterion on
specified columns. So you would use the found primary key (or use it to
find the required foreign key) in order to construct a set of join criteria.
tables on the fly. This is not pretty.
However, a highly valid approach in Firebird/IB is to define such outputs
in selectable stored procedures. By this means, you can finely predefine
an output set without the need for temporary tables. Such output sets can
be requested directly or joined, as an intermediate set, to tables, views
and other selectable SPs.
So, taking your DBISAM example:
SELECT * from Table where tabledate between '2003-05-01' and '2003-05-
31' and textsearch('octopus' in memofield1);
You would move the search part of the query into a FOR...SELECT structure
in a SP that has your required inputs and outputs defined as
arguments. Let's say you name this SP TEXTSEARCH and it's declared like this:
create procedure TEXTSEARCH (from_date timestamp, to_date timestamp,
search_argument varchar(99))
returns (
pkvalue integer,
retvar1 sometype,
retvar2 sometype,
....etc.
)
as
begin
FOR SELECT.....DO
begin
...
suspend;
end
end
You can use the returned pkvalue to join the output of a selectable SP to a
table, view, etc.
In most respects, this is NOT a substitute for a function call. It cannot
be generic with regard to database objects - you can't pass object names to
a SP that can be picked up and applied generically. Server-side code
modules are pre-compiled and the PSQL compiler has to operate on known
objects. (You can pass an object name as a string and write conditional
code inside the SP for each allowable object name string...)
I know this isn't a total answer - I don't have time to write a manual on
SQL search engines today! :-)) In summary, the SQL language implementation
doesn't support fuzzy-text searching, so you have to write your own
fuzzy-text search engine based on appropriate indexes. The FTS components
encapsulate a lot of the grunt work of setting up metadata for fuzzy-text
search mechanisms such as exact and partial strings, metaphone, soundex,
synonyms and antonyms, and maintaining the search index tables. They also
provide a macro substitution mechanism and scripting language for writing
the macros.
Helen
>I have a question about the functionality of the full text searchFTS doesn't (can't!!) alter the underlying SQL language of the
>component in IBO. Currently I'm using DBISAM and I depend on having
>a full text search function that integrates nicely within a query's
>SQL. For example, I can issue a SELECT statement that both queries
>table fields and does a full text search at the same time:
>
>SELECT * from Table where tabledate between '2003-05-01' and '2003-05-
>31' and textsearch('octopus' in memofield1);
>
>I like the speed and ease-of-use I get when the full-text-search
>functionality is integrated with the SQL in this way.
>
>My question is: what sort of integration with regular SQL does the
>IBO full text search provide?
database. If there is no server-native "textsearch( )" function defined in
the language (and there isn't in Firebird/IB) then the only way you would
achieve that would be to write a UDF that invoked it. AFAIK, nobody has
ever done that because there would be major (probably insurmountable)
problems using UDFs to access tables.
>If it doesn't integrate into the SQLIt is indeed the case. You can't compare DBISAM (which operates on
>in the same way as DBISAM's full text search (which I expect may be
>the case),
filesystem table structures) with a DBMS which controls its own physical
storage. All SQL can offer at the raw level is its set of string search
operators (=, <, >, LIKE, STARTING WITH, IN ( ) [usually horrible] and the
various NOT options).
What's needed for building a SQL search engine is a way to maintain search
indexes which are appropriate to the style of searching required. IBO's
FTS components encapsulate mechanisms for defining and maintaining such
indexes. These indexes (like any other indexes) become available to the
optimizer whenever it is constructing an output set.
>then what process is used to get the results of a SQLAn FTS search uses the indexes on whatever search metadata you have defined
>query joined with the results of a full text search?
to find the primary keys of rows that satisfy a search criterion on
specified columns. So you would use the found primary key (or use it to
find the required foreign key) in order to construct a set of join criteria.
>I want to avoidThis is a correct approach, as long as you mean not creating temporary
>the writing of intermediate tables or anything else that's going to
>slow things down.
tables on the fly. This is not pretty.
However, a highly valid approach in Firebird/IB is to define such outputs
in selectable stored procedures. By this means, you can finely predefine
an output set without the need for temporary tables. Such output sets can
be requested directly or joined, as an intermediate set, to tables, views
and other selectable SPs.
So, taking your DBISAM example:
SELECT * from Table where tabledate between '2003-05-01' and '2003-05-
31' and textsearch('octopus' in memofield1);
You would move the search part of the query into a FOR...SELECT structure
in a SP that has your required inputs and outputs defined as
arguments. Let's say you name this SP TEXTSEARCH and it's declared like this:
create procedure TEXTSEARCH (from_date timestamp, to_date timestamp,
search_argument varchar(99))
returns (
pkvalue integer,
retvar1 sometype,
retvar2 sometype,
....etc.
)
as
begin
FOR SELECT.....DO
begin
...
suspend;
end
end
You can use the returned pkvalue to join the output of a selectable SP to a
table, view, etc.
In most respects, this is NOT a substitute for a function call. It cannot
be generic with regard to database objects - you can't pass object names to
a SP that can be picked up and applied generically. Server-side code
modules are pre-compiled and the PSQL compiler has to operate on known
objects. (You can pass an object name as a string and write conditional
code inside the SP for each allowable object name string...)
I know this isn't a total answer - I don't have time to write a manual on
SQL search engines today! :-)) In summary, the SQL language implementation
doesn't support fuzzy-text searching, so you have to write your own
fuzzy-text search engine based on appropriate indexes. The FTS components
encapsulate a lot of the grunt work of setting up metadata for fuzzy-text
search mechanisms such as exact and partial strings, metaphone, soundex,
synonyms and antonyms, and maintaining the search index tables. They also
provide a macro substitution mechanism and scripting language for writing
the macros.
Helen