Subject Re: [IB-Architect] Proxy Query ?
Author Ann W. Harrison
At 12:02 AM 3/7/2001 -0300, Marcelo Lopez Ruiz wrote:

>...parsed and compiled by a query processor engine, ... separate layer (OLE
>DB layer, an internal one in the case of MSSQL7 itself), and queries it for
>information about metadata, available indexes, etc.

That's approximately what InterBase does - though only on one database.
It could easily (sic) send parts off to other databases. Early in the
history of InterBase, a customer called us and explained that he had
a query that took 26 hours (16? Some huge number) to run. A quick check
revealed that it took (x hours - 4 seconds) to optimize and 4 seconds to run.
That led to serious tree-pruning in optimization. The equivalent process
distributed around a network is going to be interesting.

Another issue is network speed - simpler than it was in the old days,
but still a consideration. A serious consideration.

>...each OLE DB provider may implement a lot of
>interfaces (which provide optimization hints, like the internal OLE DB
>provider does) or just a few (enough to perform some rough operations on
>plain text files, for example).

At the moment, I don't know what the right hints are to offer between
two InterBase databases - and have a strong desire to do it right, if
we do it at all.

>Through the use of well-designed interfaces, transaction contexts can also
>be extended to encompass other databases.

That's well understood technology. Even I can do that.

>I don't know how well IB separates query compilation, optimization and
>execution, but I'm sure it will require major changes.

They're completely separate. And yes, all the stats that go into
optimization are available in a format that could be distributed.
The question is:

select db1.a.whatever, db2.b.whatever
from a inner join b over a.fx = b.fx
where a.f1 <= '34' and b.f2 = 9

Do you find the matching A's and send them to B? Vice-versa?
If B were across a high-bandwidth, high latency line, would that
change the equation?

>However, look at the
>bright side - if interfaces (structs of function pointes will do, thanks)
>are used, the changes can be done in smaller increments - an index reporting
>interface on this provider, an interface to help the optimizer on that one.

OK - but it might also make sense to start with a simpler problem and
gradually knock off corners until we get to the whole thing.

> > How do we handle the case of access to one database when the other
> > is not available? What happens if one database is moved or renamed?
>Rollback the transaction, whine to the user. What else is there to do?

That's what bother's me - having one database unavailable because
another is enjoying a thunder storm. Sorry, that's antediluvian.
Because another is enjoying preventive maintenance.

> > Yes, aliases help here [general whining].
>The aliases are part of the database holding the reference metadata. Abstrac
>ting the local/remote logic to its own layer makes it a bit less
>troublesome - the connected database is just a special case of "stuff that
>exposeds interfaces/behaviour I can work with".

We called that the mega-database and had hours of unproductive fun
discussing its characteristics. As a volunteer group, we could have
decades of similar fun.

>Having said all that, feel free to spank me into reality - the only time I
>have ever looked at the code was on the security incident.

On the contrary - what can I do to draw you into the project? There
are some interesting and less overwhelming projects that need a fresh


We have answers.