Subject Re: [firebird-support] Events and non-intrusive, opt-in, server-side fulltext-search
Author Ivan Prenosil
> You have to create triggers for each and every table that you'd like to search
> in, in worst case by hand. That'd totally remove the non-intrusive, opt-in and
> easily configurable aspect of the fts engine.

But you have to declare/create udfs and stored procedure anyway,
so few additional triggers wont make much difference ?


> You have a set of UDFs to search for text fragments, an early draft
> would look like
> this pseudo-code select procedure:
>
> create procedure select_from_fts(
> fts_config_id integer,
> search_pattern varchar(255) )
> returning values( id integer ) as

For searching across more tables the SP will also return table
name, right ?


> Alternatively, I can re-index the whole tables every x minutes,

Aha, so you will index only small databases if you can consider
such approach.


> but I guess
> the incremental method using events would be nicer. With our current
> client-based solution, re-indexing takes a lot of time and resources.

By current solution, do you mean client connects to external ft system,
or that the index is stored in Firebird db ?

Ivan


----- Original Message -----
From: "Daniel Albuschat" <d.albuschat@...>
To: <firebird-support@yahoogroups.com>
Sent: Friday, March 11, 2005 11:10 AM
Subject: [firebird-support] Events and non-intrusive, opt-in, server-side fulltext-search


>
> Hello folks,
>
> I'd like to develop an opt-in, server-side solution for
> fulltext/textfragment-search (fts)
> for Firebird, with the help of UDFs and events for incremental updating.
> I'd imagine something like this:
>
> First, you'd create configuration sets that tell on which
> tables/fields a certain
> text fragment set would be created. That way, you can search for "John" in
> the first name and last name field of your address table at once, without
> multiple search queries.
> This has been proven to be very handy and efficient.
>
> As for the API:
> You have a set of UDFs to search for text fragments, an early draft
> would look like
> this pseudo-code select procedure:
>
> create procedure select_from_fts(
> fts_config_id integer,
> search_pattern varchar(255) )
> returning values( id integer ) as
> declare variable handle integer;
> declare variable id
> begin
> handle = fts_open(:fts_config_id, :search_pattern );
> /* fts_open would be an UDF -- can you invoke them this way? */
> repeat
> id = fts_find( :handle );
> if( :id == null )
> break;
> suspend;
> until false;
> fts_close(:handle);
> end^
>
> This way, you do
> select t.* from select_from_fts( 1, "test" ) f left join my_table t on t.id=f.id
> to get the matching results from the text fragment engine.
>
> fts_open and fts_find connect to a server application, fts_close closes
> the handle created on the server. This way, it can reside on another box,
> to distribute needed computing time/resources.
> This application handles the text fragments and connects to the firebird
> server (it's a two-sided connection setup, where the fts-server
> connects to FB and
> FB via udfs to the fts-server) and registers for update/delete/insert
> events for some
> configured tables to update it's text fragments.
>
> This looks like it'd work to me.
> The only problem are the events to update the fragments:
> You have to create triggers for each and every table that you'd like to search
> in, in worst case by hand. That'd totally remove the non-intrusive, opt-in and
> easily configurable aspect of the fts engine.
> The only way to solve this would be to let the fts server application check
> and re-create the triggers for the configured tables automagically.
> I guess I'll have to do this anyways...
> But the again, I don't see any advantages over events instead of writing
> fts_update()/fts_insert()/fts_delete UDFs that handle the updating
> themselves. This would mean we need a one-sided (FB server to
> fts-server) connection only, which might minimize configuration troubles.
> I haven't looked at events thoroughly yet, maybe I actually *have* to do
> this, because it looks like when an event is posted, I know what happend,
> but not *where* -- ie, I don't have a dataset primary key or something
> like this.
>
> Alternatively, I can re-index the whole tables every x minutes, but I guess
> the incremental method using events would be nicer. With our current
> client-based solution, re-indexing takes a lot of time and resources.
>
> Is this the right way or do you see any better solutions?
>
> regards,
> Daniel Albuschat