Subject Re: [Firebird-Architect] Indexing tales - part 1 - Siblings
Author Ivan Prenosil
Jim Starkey wrote:
>> how would you handle partial word searching with <table, field, record,
>> word position>?
>>
>> eg: a search for "and" could/should return "Gandalf"
>>
>>
> That's not a reasonable thing to ask for (try it with Google, for
> example). It is reasonable to ask for a root word search ("manch*" for
> Manchester), that is easy to do.

It's not reasonable to expect that a database stores only English words,
or that the suitability of different search methods is the same for all languages.
E.g. because Czech is a "wysiwyg" language (words are written the way they are
pronounced), SoundEx is absolutely useless for me, which does not mean it can't
be useful for others :-)


> Taking your example further, "and" will almost always be a "stop" word
> ignored by searches.

The Czech translation of "and" is "a". The string "and" has no meaning in Czech,
so I have no reason to treat it as a stop word.
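
To illustrate the point, here is a minimal sketch of stop-word filtering that depends on the language of the text. The STOP_WORDS table, its tiny word lists, and the index_terms function are assumptions made up for this example, not anything that exists in Firebird.

```python
# Hypothetical per-language stop-word lists (illustrative only;
# real lists would be much longer and chosen per application).
STOP_WORDS = {
    "en": {"and", "the", "of"},
    "cs": {"a", "i", "v"},
}


def index_terms(text, language):
    """Return the lowercased words of `text` that are not stop words
    in the given language. Unknown languages get no stop words."""
    stop = STOP_WORDS.get(language, set())
    return [word for word in text.lower().split() if word not in stop]


# "and" is dropped only when the text is treated as English.
english_terms = index_terms("Gandalf and Frodo", "en")
czech_terms = index_terms("Gandalf and Frodo", "cs")
```

With these assumed lists, the same string "and" is skipped for English text but indexed for Czech text, where the stop word is "a" instead.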


> Finally, things like depluralizing make sense to do in a preprocessing
> step rather than in the index.

The complexity of preprocessing and the reliability of the result vary widely between languages.
Sometimes it can be better to simply index words exactly as stored in the document
than to let the computer "guess" the correct meaning of the indexed word.
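
A rough sketch of what I mean, assuming a deliberately naive English-only rule that strips a trailing "s". Both naive_depluralize and build_index are hypothetical helpers written for this example; a real preprocessor would be more elaborate, but the failure mode is the same.

```python
def naive_depluralize(word):
    """Hypothetical English-only rule: strip one trailing 's'."""
    return word[:-1] if word.endswith("s") else word


def build_index(words, normalize=None):
    """Map each word (optionally normalized) to the list of positions
    at which it occurs in `words`."""
    index = {}
    for position, word in enumerate(words):
        key = normalize(word) if normalize else word
        index.setdefault(key, []).append(position)
    return index


# "pes" (dog) and "les" (forest) are Czech singular nouns;
# "cars" is an English plural.
words = ["pes", "les", "cars"]

exact_index = build_index(words)
guessed_index = build_index(words, normalize=naive_depluralize)
```

Here the exact index still finds "pes" and "les", while the guessed index stores them as "pe" and "le", so a search for the real word no longer matches. Only "cars" benefits from the rule.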

Ivan