Subject | Re: [Firebird-Architect] Re: Google-like scoring in databases |
---|---|
Author | Jim Starkey |
Post date | 2003-07-02T14:58:41Z |
Roman Rokytskyy wrote:
>What about geo-spatial capabilities (at least I need them more than
>free text search)? Should we put text, spatial, audio and video
>features into the server?
>

That was the intuitive idea behind blob filters and UDFs. But I haven't
the foggiest idea of what the engine could do with audio or video
streams. Text, spatial, sounds good to me.

>Second, can it be a plug-in, or should it be an integral part of the
>engine? Part of the engine for the following reasons:
>
> 1. Somebody needs to keep track of which fields are searchable,
>i.e. word-indexed.
>
>- plugin introduces new RDB$SEARCHABLE_FIELDS;
>- plugin extends SQL with some new keywords;
>

Although some forward looking fellow designed Interbase with user
extensible system tables, the same dolt neglected to implement backup
and restore for them, so you have some coding to do.

> 2. Somebody needs to maintain the word index for all insert,
>updates, and deletes of records that contain searchable fields.
>- auto-generated trigger
>- plugin extends SQL processing
>

The trigger is a piece of cake. The "extends SQL processing" isn't.
Firebird is still using the crapola YACC parser I cobbled together to
make a quicker DSQL. Ain't nothing extensible about it. YACC is
preprocessed at build time, so it's out (probably a very good thing).
So you need a new parser that loads the grammar at runtime. Lots of
candidates, but none particularly faster. Since Firebird doesn't cache
compiled statements, when pumping transactions it's often compile
bound. Plan to fix that (which needs doing anyway). Then there are the
various stages of compilation -- semantic analysis, boolean
distribution, view and computed field expansion -- that need to be
addressed. Major architectural problem here.
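
A minimal sketch of the compiled-statement caching mentioned above, assuming statements are keyed by their exact SQL text and evicted least-recently-used. `StatementCache`, `prepare` and `fake_compile` are hypothetical names for illustration and are not part of Firebird:

```python
from collections import OrderedDict


class StatementCache:
    """LRU cache of compiled statements keyed by SQL text (hypothetical)."""

    def __init__(self, compile_fn, capacity=128):
        self._compile = compile_fn   # the expensive parse/analyze step
        self._capacity = capacity
        self._cache = OrderedDict()
        self.compilations = 0        # how many times we really compiled

    def prepare(self, sql):
        if sql in self._cache:
            self._cache.move_to_end(sql)      # mark as most recently used
            return self._cache[sql]
        compiled = self._compile(sql)
        self.compilations += 1
        self._cache[sql] = compiled
        if len(self._cache) > self._capacity:
            self._cache.popitem(last=False)   # evict least recently used
        return compiled


def fake_compile(sql):
    # Stand-in for parsing, semantic analysis and optimization.
    return ("plan", tuple(sql.split()))
```

A workload that repeats the same statement then pays the compile cost once per distinct SQL string rather than once per execution. A real cache would also have to invalidate entries when metadata changes, which this sketch ignores.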

Designing a plugin API is a piece of cake any time you're willing to
freeze the internals, which is never, leaving you with a situation
where you have to distribute a version of a plugin for every minor
version of the system. If you want to get an idea of how bad this can
get, look at the Mozilla spell checker. There's a version for 1.3,
1.4a, 1.4b, etc. OK, they may be thugs, but they're probably morons,
but internals must change to make the system better.

>> 3. Somebody needs to map between table and fields names on
>>numeric ids (or pay a ghastly penalty for failure to do so).
>>Search engine takes care of it, table name and field name is input for
>>it, what it does with it internally, is its own problem.
>>

Fine.

>> 4. Somebody needs to be able to fetch records quickly by table
>>id and record number.
>>
>
>Why cannot we use rdb$db_key? Within one table it seems to be unique.
>

Again, a question of architecture. Yes, rdb$db_key is, in fact, the
table id and record number. But that isn't architecturally guaranteed.
If you build in a dependence on an implementation artifact, it becomes
de facto architectural. Maybe Firebird will want to switch to 6 byte
record numbers in the future. Architecture supports it, disks are
plenty big enough, but once you've designed in a dependency, things
get sticky.
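
To illustrate the dependency problem, here is a rough sketch contrasting a fixed-width record locator with one treated as opaque bytes. The layout (16-bit table id plus 32-bit record number) is an assumption made up for the example and is not a documented Firebird format; `pack_fixed`, `fits_fixed` and `WordIndex` are hypothetical names:

```python
import struct


def pack_fixed(table_id, record_number):
    # Hypothetical fixed layout: 16-bit table id, 32-bit record number.
    # Code that relies on this width has built in the dependency.
    return struct.pack(">HI", table_id, record_number)


def fits_fixed(table_id, record_number):
    """Return True if the values still fit the assumed fixed layout."""
    try:
        pack_fixed(table_id, record_number)
        return True
    except struct.error:
        return False


class WordIndex:
    """Maps a word to opaque record locators supplied by the engine."""

    def __init__(self):
        self._postings = {}

    def add(self, word, locator):
        # The locator is stored as opaque bytes; its width is never inspected.
        self._postings.setdefault(word, set()).add(bytes(locator))

    def lookup(self, word):
        return sorted(self._postings.get(word, ()))
```

With these definitions `fits_fixed(1, 2**32)` is false, so a move to wider record numbers breaks the fixed layout, while `WordIndex` accepts locators of any width because it never looks inside them.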

>> 5. The DML needs an extension to search explicit table fields
>>as part of the boolean, which requires integration with the
>>optimizer
>>
>
>Make optimizer plugin-enabled.
>

To summarize my previous comment, NFW.

>> 6. The native APIs and associated plumbing needs to be extended
>>to support search semantics.
>>
>
>Why cannot we use SQL here? I know that you do not accept this idea,
>and Netfrastructure uses JDBC extension instead. But I think, the only
>thing that has to be transferred to the client in addition to the data
>in the query is score, but why cannot we introduce CURRENT_SCORE
>pseudo-variable?
>

The problem isn't the DML, it's the retrieval mechanism.
Netfrastructure returns an object of class ResultList that can iterate
an ordered list of ResultSets. Unless you can find a way to do the same
within the existing native API, you will lose the ability to express
the results of a multi-table search, which I don't think you want to
do.
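
A rough sketch of the shape described above: one container that iterates an ordered list of per-table result sets. This is not Netfrastructure's actual API; the `Hit` type, the `best_score` property and the ordering rule (tables ranked by their best hit) are assumptions made for illustration:

```python
from dataclasses import dataclass, field


@dataclass
class Hit:
    record: dict     # columns differ from table to table
    score: float


@dataclass
class ResultSet:
    table: str
    hits: list = field(default_factory=list)

    @property
    def best_score(self):
        return max((h.score for h in self.hits), default=0.0)


class ResultList:
    """Ordered collection of per-table result sets (hypothetical shape)."""

    def __init__(self, result_sets):
        # Assumed ordering: hits by score within a table, tables by best hit.
        for rs in result_sets:
            rs.hits.sort(key=lambda h: h.score, reverse=True)
        self._sets = sorted(result_sets, key=lambda rs: rs.best_score,
                            reverse=True)

    def __iter__(self):
        return iter(self._sets)
```

Each `ResultSet` keeps its own table's columns, which is the part a single flat result set with one extra score column cannot express.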
[Non-text portions of this message have been removed]