Subject Re: [firebird-support] Fuzzy match on business name
Author Norman Dunbar
On 20/05/13 18:53, lcampbell wrote:
> Here's one I hope others have encountered (and possibly found an elegant
> solution).
> ... SNIP ...
> constituent's last name (in uppercase). It works well for individuals,
> but fails to catch my 'Corlis Estates' example above. Yes, I could break
> the incoming business name string into individual words, eliminate the
> common "The, A, An, The, Inc,..." and submit individual words in a chain
> of "or"s. Has anybody got a better way?
>
> Lane C.
> NW Software

I would be thinking of a SOUNDEX function for this. I need one myself to
ensure that the "fat fingered" user - me - of my Music collection can
type in a name that's roughly spelt correctly.

I did something similar years ago on Oracle where there is a Soundex
function built in. The code offered an exact lookup - it selected
directly from a table using a WHERE clause, or a fuzzy search - it used
a soundex table which was linked to the main table via Primary key.

All words in the search field(s) were soundexed - they have to be. Using
my music collection as an example:

Elton John would have two entries in the soundex table, one for Elton,
one for John. Anyone looking for John something or something John, would
lookup the "john, Jon, Johanne etc" soundex code (4 characters starting
with J) and offer a list of all the Artists in the artist table, whose
PK is listed as containing the word John or similar to it.

So:

ARTIST TABLE:

ID bigint PK
NAME ....
...


SOUNDEX TABLE:

ARTIST_ID bigint
SOUNDEX CHAR(4)


EXAMPLE ARTIST:

1 Elton John
2 Runrig
...


EXAMPLE SOUNDEX:

1 E435
1 J500
2 R562
...


The Soundex for Jon, John, Johanne etc all boil down to J500, so I would
go in to the soundex table looking for the J500 and from that, pull out
the artist PK id (1) and joing on that to the Atrist table to get my
match/matches.

Hope this makes sense.


Cheers,
Norm.

--
Norman Dunbar
Dunbar IT Consultants Ltd

Registered address:
Thorpe House
61 Richardshaw Lane
Pudsey
West Yorkshire
United Kingdom
LS28 7EL

Company Number: 05132767