Subject RE: [firebird-support] Re: trying to avoid large datasets
Author firebird@spence.users.panix.com
-----Original Message-----
From: firebird-support@yahoogroups.com
[mailto:firebird-support@yahoogroups.com] On Behalf Of Florian Hector
Sent: Saturday, July 08, 2006 10:25 AM
To: IB-Support
Subject: Re: [firebird-support] Re: trying to avoid large datasets


>> There are lots of ways of implementing a soundex, here is one using
>> Firebird stored procedures.
>>
>> http://fbtalk.net/viewtopic.php?id=182
>>
>
> Sounds like what I need. I'll check it out.

>My first thought was Soundex when I read about your problem. However, it's
>probably of little or no use for you.
>Creating the soundex for "a river somewhere" returns A6162 whereas "river
>somewhere, a" returns R1625. Those two strings do not look very similar to
>the soundex algorithm.
>
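To illustrate the point above, here is a rough Python sketch of a
Soundex-style coder. It is not the stored procedure from the fbtalk link;
the 5-character code length and the function name soundex are assumptions,
chosen only because they reproduce the two codes quoted above.

```python
# Consonant groups used by the classic Soundex coding (B,F,P,V -> 1, etc.).
CODES = {}
for digit, letters in enumerate(("BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"), start=1):
    for ch in letters:
        CODES[ch] = str(digit)


def soundex(text, length=5):
    """Soundex-style code. length=5 is an assumption made to match the thread."""
    # Keep only A-Z, so spaces and punctuation are ignored.
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    code = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        if ch not in "HW":  # H and W do not separate equal codes; vowels do
            prev = digit
        if len(code) == length:
            break
    return code.ljust(length, "0")


print(soundex("a river somewhere"))
print(soundex("river somewhere, a"))
```

Because the code is built from the first letter and the first few consonants,
moving a leading word to the end changes almost every position of the result.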
>What you need is the "Levenshtein distance". This is an algorithm that
>calculates how many single characters have to be substituted, inserted or
>deleted to make two strings the same.
>
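For reference, a minimal Python sketch of the Levenshtein distance mentioned
above, using the usual two-row dynamic programming table. It counts
single-character insertions, deletions and substitutions, which is the
textbook definition. The function name levenshtein is just an illustrative
choice, and this is plain application-side code, not something built into
Firebird.

```python
def levenshtein(a, b):
    """Number of single-character insertions, deletions and substitutions
    needed to turn string a into string b."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning a[:i] into "" takes i deletions
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))
```

A small distance relative to the string lengths suggests a near-duplicate.
The cost is proportional to len(a) * len(b) per pair, so on a large table it
is probably best applied only to a pre-filtered set of candidate rows.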
>Ask aunt Google for it; it returns a heap of sites.
>
>Florian


Lucene might be useful here (http://lucene.apache.org/java/docs/).