Subject | Re: [Firebird-Architect] Indexing tales - part 1 - Siblings |
---|---|
Author | Ivan Prenosil |
Post date | 2006-10-09T13:43:07Z |
Jim Starkey wrote:
>> how would you handle partial word searching with <table, field, record,
>> word position>?
>>
>> eg: a search for "and" could/should return "Gandalf"
>>
>>
> That's not a reasonable thing to ask for (try it with Google, for
> example). It is reasonable to ask for a root word search ("manch*" for
> Manchester), that is easy to do.
It's not reasonable to expect that a database stores only English words,
nor that the suitability of different searching methods is the same for all languages.
E.g. because Czech is a "wysiwyg" language, SoundEx is absolutely useless
for me, which does not mean it can't be useful for others :-)
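
To make the difference between the two kinds of search quoted above concrete, here is a minimal sketch. It is not Firebird code: the postings layout (table, field, record, word position), the sample words, and the function names are all made up for illustration. A root word search such as "manch*" can use binary search on a sorted word list, while a search for "and" inside "Gandalf" has to look at every indexed word.

```python
from bisect import bisect_left

# Hypothetical word index: lower-cased words mapped to postings of the form
# (table, field, record, word_position). All values are invented.
postings = {
    "gandalf": [("books", "body", 7, 12)],
    "manchester": [("cities", "name", 1, 0)],
    "manchu": [("history", "body", 3, 44)],
    "sand": [("books", "body", 9, 5)],
}
words = sorted(postings)


def prefix_search(prefix):
    """Return indexed words starting with prefix, via binary search."""
    prefix = prefix.lower()
    start = bisect_left(words, prefix)
    matches = []
    for word in words[start:]:
        if not word.startswith(prefix):
            break  # sorted order: no later word can match
        matches.append(word)
    return matches


def substring_search(fragment):
    """Return indexed words containing fragment; scans the whole word list."""
    fragment = fragment.lower()
    return [word for word in words if fragment in word]
```

With this sample data, `prefix_search("manch")` finds "manchester" and "manchu" after a single binary search, whereas `substring_search("and")` finds "gandalf" and "sand" only by checking every word.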
> Taking your example further, "and" will almost always be a "stop" word
> ignored by searches.
The Czech translation of "and" is "a". "and" has no meaning in Czech,
so I have no reason to consider it a stop-word.
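
As a rough sketch of what a language-dependent stop-word list could look like (the tiny word sets and the name `index_terms` are assumptions for illustration, not part of any existing engine):

```python
# Deliberately tiny, illustrative stop-word sets; real lists are much longer.
STOP_WORDS = {
    "en": {"a", "and", "of", "the"},
    "cs": {"a", "i", "v"},
}


def index_terms(text, language):
    """Split text on whitespace and drop the stop words of the given language."""
    stop = STOP_WORDS.get(language, set())
    return [word for word in text.lower().split() if word not in stop]
```

Under these assumptions, `index_terms("Tom and Jerry", "en")` drops "and", while `index_terms("Tom and Jerry", "cs")` keeps it as an ordinary indexed word.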
> Finally, things like depluralizing make sense to do in a preprocessing
> step rather than in the index.
The complexity of preprocessing and the reliability of the result vary widely for different languages.
Sometimes it can be better to simply index words exactly as stored in the document
than to let the computer "guess" the correct meaning of the indexed word.
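
A small sketch of that risk, assuming a deliberately naive, English-only rule that strips a trailing "s" (a hypothetical rule chosen for illustration, not a real stemmer): it merges "cats" with "cat" as intended, but mangles the Czech singular nouns "pes" (dog) and "les" (forest).

```python
def naive_depluralize(word):
    """Hypothetical English-only rule: strip a single trailing 's'."""
    if len(word) > 1 and word.endswith("s"):
        return word[:-1]
    return word


def build_index(words, normalize=None):
    """Map each index key to the set of original words stored under it."""
    index = {}
    for word in words:
        key = normalize(word) if normalize else word
        index.setdefault(key, set()).add(word)
    return index
```

Calling `build_index(["pes", "les"])` without a normalizer keeps the keys exactly as stored, while passing `naive_depluralize` files them under "pe" and "le", keys that a search for the original word would no longer find.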
Ivan