Subject Re: Full-text Search Support
Author dcabbar
Actually, what I really want is the full-text search portion, and
fuzzy match is just the part of it. i.e. searching for a specified
word in a varchar column. And, as you know using "like %word%" is
pretty expensive.

So, is there at least any add-on that is available? Any
ideas/comments/experience on this?

Again, thanks for the help.


--- In firebird-support@yahoogroups.com, "Nigel Weeks" <nigel@e...> wrote:
>
> > -----Original Message-----
> > From: dcabbar [mailto:dcabbar@y...]
> > Sent: Wednesday, 31 December 2003 10:23
> > To: firebird-support@yahoogroups.com
> > Subject: [firebird-support] Full-text Search Support
>
>
> > Hi All,
>
> > Does firebird support full-text/fuzzy search in text columns? i.e.
similar
> to my sql's match ... against? Of > course, I am not talking about
"like"
> searches where you can perform starts with type searches.
>
> > If there is such a functionality, how efficient/good is this?
>
> > Any comment is appreciated...
>
> > TIA
>
> Out of the box fuzzy matches: not yet, but it's certainly possible!!!
>
> There's an article on building a soundex UDF on the ibphoenix site.
> http://ibphoenix.com/main.nfs?a=ibphoenix&page=ibp_howto4
>
> As for built-in functions, I've heard there's a double-metaphone
function in
> the works, somewhere...
> As with most of these type of searches, it pays to store the hash of the
> word before a search, i.e.:
>
> Select * from table where dmphone(str_text) = dmphone(searchterm)
>
> Would be shooting yourself in the foot, as it would have to convert
every
> single record to see if it matched.
> On the other hand, if a trigger calculated the dmphone, and stored it in
> it's own field, with an index on it, it'd fly:
>
> Select * from table where dm_text = dmphone(searchterm)
>
> Then there's ones you already know about: 'like', and 'containing'
>
> 'like' can use an index, if you search from the first character onwards:
> Select * from table where str_text like 'searchterm%'; (Starting with
> text, uses index)
>
> But if you move it away from the first character, suddenly it has to
do a
> brute-force scan:
> Select * from table where str_text like '%searchterm%'; (Anywhere
in text,
> brute force)
> Select * from table where str_text like '%searchterm'; (Ending
with text,
> brute force)
>
> 'Containing' will never use an index, (AFAIK), as it always scans
the entire
> field for possible substring matches (same as LIKE '%searchterm%',
but case
> insensitive).
>
> Your best bet: roll a soundex or metaphone UDF(or find one that's
already
> written)
>
> Disclaimer: I could be wrong on any of these points...
>
> Nige.