Subject Re: [Firebird-Architect] Re: Full Text Search
Author Jim Starkey
Roman Rokytskyy wrote:

>We discussed this issue approx. 1.5 years ago. The next round :)
>> 1. The general case is multi-table, multi-column search, which
>> will need an API extension.
>Can you explain this requirement? Last time you argued that
>other approaches do not work: people want to search the database, not
>tables. My argument against your solution was that the result of the
>search is not a relation, or more correctly, not a relation in first
>normal form.
Let me start with my observation that the web is the universal platform,
and as a platform, has a hundred or a thousand times more users than any other
platform doing sophisticated tasks, without training, that are utterly
beyond the capabilities of pre-web computer systems. I believe that the
web as an application model is as far advanced from the File/Edit/View
GUI model as the GUI model was from command lines and command lines from
punch cards. And this is why I built Netfrastructure.

Database systems need to adapt to the way people work given access to
suitable environments. Suppose your dog license is about to expire and
you need to renew. What do you do? You either go to Google and type in
"manchester massachusetts dog license" or, alternatively, go to the
town's web site, click search, and type in "dog license". Each gives
you a good list of plausible alternatives to follow. It doesn't
know whether you want to renew a dog license, get a copy of the town
bylaw on dog licenses, the minutes of the last Selectmen's meeting to
discuss dog licenses, or the procedure to renew a dog license. What
we've done, specifically, is select a half dozen plausible pages from
the 8 billion pages on the web.

File/Edit/View GUI programs don't need good search, let alone
multi-table, multi-column search. In my lifetime I've celebrated the
invention of the disk, the birth of time sharing, the death of time
sharing, departmental computing, personal computing, and the GUI. I
look forward to celebrating the death of the File/Edit/View GUI with the
same enthusiasm as the death of the punch card and the timesharing system.

We need multi-table, multi-column search so Firebird can live to
celebrate the death of the File/Edit/View GUI. No less.

If you make your living selling File/Edit/View GUI programs, you're not
going to like search. Plan to join the silent movie guys waiting for
talkies to blow over and the slide rule manufacturers waiting for
calculators to lose their bloom and go away. The rest of us will learn
from the web and write better software because of it.

That's why we need multi-table, multi-column search. Our future depends
on it.
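
As a rough illustration of what "multi-table, multi-column" means here,
the sketch below indexes every text column of every table into one word
index and answers a query with a list of (table, row key) hits. It is
only a sketch under stated assumptions: the table names, column names,
sample rows, and the functions build_index and search are all made up
for the example, and nothing here reflects how Netfrastructure or
Firebird actually implements search.

```python
import re
from collections import defaultdict

# Hypothetical sample data: two unrelated tables with different columns.
TABLES = {
    "bylaws": [
        {"id": 1, "title": "Dog License Bylaw",
         "body": "Every dog must be licensed annually."},
        {"id": 2, "title": "Parking",
         "body": "No overnight parking on Main Street."},
    ],
    "minutes": [
        {"id": 7, "meeting": "Selectmen",
         "notes": "Discussed raising the dog license fee."},
    ],
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(tables, key="id"):
    """Map each word to the (table, column, row key) places it occurs."""
    index = defaultdict(set)
    for table, rows in tables.items():
        for row in rows:
            for column, value in row.items():
                if isinstance(value, str):
                    for word in tokenize(value):
                        index[word].add((table, column, row[key]))
    return index


def search(index, query):
    """Return (table, row key) pairs whose text columns, taken together,
    contain every word of the query."""
    hits = None
    for word in tokenize(query):
        rows = {(table, key) for (table, _column, key) in index.get(word, set())}
        hits = rows if hits is None else hits & rows
    return sorted(hits or set())


index = build_index(TABLES)
print(search(index, "dog license"))
```

Note that the hits point at rows from tables of different shapes, so
the caller gets a list of places to look rather than rows with a common
set of columns.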

>> 9. Search indexing should be html-aware
>Not only HTML-aware, but also XML, RTF, MS Word, etc. But that is easy
>to achieve. If that content is stored in the BLOB, we have already a
>concept of BLOB filter. Just define a "searchable" BLOB type and
>corresponding "HTML", "PDF", "RTF" BLOB types. The conversion between
>those datatypes is done by a filter.
Oh, my head swims. Netfrastructure has filters for <*ml>, MSWord, and
PDF. MSWord is a task on the scope of the Vulcan project. PDF has
better documentation but more intractable problems (hint: you have to
emulate everything in a laser printer but the paper path and toner
drum). Nobody's asked for RTF, thank god, but I have a converter on the
shelf somewhere when they do.
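
For the simplest of these cases, HTML, a filter that turns markup into
indexable text can be sketched with Python's standard html.parser
module. This is an assumption-laden toy, not the Netfrastructure filter
and not Firebird's BLOB filter API: the class TextExtractor and the
function html_to_text are hypothetical names, and real-world HTML needs
far more care (entities in attributes, broken markup, encodings).

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> content."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []       # text fragments in document order
        self.skip_depth = 0   # > 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-blank text that is outside script/style.
        if self.skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Return the searchable text of an HTML document as one string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


page = ("<html><head><style>p {color: red}</style></head>"
        "<body><h1>Dog License</h1><p>Renew <b>online</b> today</p></body></html>")
print(html_to_text(page))
```

The output of such a filter is what would be handed to the word
indexer; the markup itself never reaches the index.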

Interesting that you missed the Open Office formats? Hey, Roman, get
with the program. And what about WordPerfect? And the worst of them
all, PowerPoint?