Subject: RE: [IBDI] Re: Firebird 1
Author: Claudio Valderrama C.
> It would have to select the whole query as usual, and then (in the case of
> LIMIT (30, 10)) it would discard rows 0..29 and > 41 before returning a
> cursor.
> It differs from middleware because you don't have to send the whole result
> set (which could be a few MB in size) to the middleware (which may be on
> another machine) just to get 10 records. Surely that makes sense ?

Not for me. The second time, you run the same select and discard rows 0 to 39,
and so on. For this to be coherent, you need either to keep a snapshot
transaction open for every fetched web page (with a NEXT button, maybe) or to
keep the original result set available all that time so you can pick the
following chunk of 10 records. This is the caching that middleware does.
Otherwise, you recalculate the same query again and again to get different
chunks of it, but consistency is not guaranteed if the db changes in the
meantime.
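To make the recomputation cost concrete, here is a minimal sketch (illustrative Python, not Firebird internals; all names are made up) of what a server-side LIMIT (skip, take) amounts to: the full ordered result is still produced on every request, and the skipped rows are simply thrown away.

```python
def run_query():
    # Stand-in for evaluating the full ordered result set of the query.
    return sorted("row%02d" % i for i in range(100))

def fetch_page(skip, take):
    rows = run_query()             # the whole query runs again on every call
    return rows[skip:skip + take]  # rows 0..skip-1 are computed, then discarded

page1 = fetch_page(0, 10)   # rows 0..9
page2 = fetch_page(10, 10)  # re-evaluates everything, discards rows 0..9
```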
Reflecting changes to the underlying db almost in real time (when the browser
gets the next web page) and ensuring consistency cannot coexist. If I always
want fresh data when looking at search results, then when I hit NEXT, records
11..20 may not be the ones from the original search. If record 5 was deleted,
I will miss the original #11, since it became #10; and if a new entry lands
between #8 and #9 (due to the ordering clause), I will see #10 twice, now
cast as #11.
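That drift is easy to reproduce with a toy model, where a plain Python list stands in for the ordered result set and the ids are hypothetical:

```python
import bisect

PAGE = 10

# Case 1: a row is deleted between page fetches.
rows = list(range(1, 31))      # ordered result set, ids 1..30
page1 = rows[:PAGE]            # user sees ids 1..10
rows.remove(5)                 # id 5 deleted before the user hits NEXT
page2 = rows[PAGE:2 * PAGE]    # NEXT re-runs the query with offset 10
# page2 now starts at id 12: the original #11 slid into page 1's slot
# after page 1 was already shown, so it is never displayed at all.

# Case 2: a row is inserted between page fetches.
rows2 = list(range(1, 31))
page1b = rows2[:PAGE]          # ids 1..10 again
bisect.insort(rows2, 8.5)      # a new entry sorts between #8 and #9
page2b = rows2[PAGE:2 * PAGE]  # starts at id 10: #10 shows up on both pages
```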
Typical web search engines aren't paranoid about reflecting the current
state of the net. You get several links to pages that no longer exist when
you click on them. Typically, you submit your URL to the search site and an
editorial staff reviews it. This process takes time but ensures more
"useful" results, like the ones delivered by NorthernLight. Other engines may
automate the processing of submissions, but the result of the crawler's
indexing may or may not be available within a few minutes. A search engine
integrated with middleware can recognize that Ann was looking for "cats &
dogs" while I was looking for "dogs & cats"; if neither of us used some
proprietary escape character to mark the query as a literal string, the
search is essentially the same. If the middleware caches requests for a
couple of minutes before discarding them, chances are the answer is
immediate, without touching the db. Clever search software could even
recognize that this is a subset of a still-cached request for "cats" and
filter it, assuming that A & B means both A and B must be present.
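A sketch of such a middleware cache, under the stated assumption that A & B means both terms must be present (everything here is hypothetical, including the substring-based matching; it is not any real product's API):

```python
# The cache key is the *set* of terms, so "cats & dogs" and "dogs & cats"
# share one entry. A request whose term set is a strict superset of a
# cached query's is answered by filtering the cached rows, not the db.
cache = {}

def normalize(query):
    # "cats & dogs" -> frozenset({"cats", "dogs"})
    return frozenset(t.strip() for t in query.split("&"))

def search(query, backend):
    key = normalize(query)
    if key in cache:
        return cache[key]
    rows = None
    for cached_key, cached_rows in cache.items():
        if cached_key < key:  # cached query is strictly broader
            # toy filter: keep rows that mention every requested term
            rows = [r for r in cached_rows if all(t in r for t in key)]
            break
    if rows is None:
        rows = backend(key)   # genuine miss: hit the db
    cache[key] = rows
    return rows
```

A real cache would also expire entries after a couple of minutes, as described above, so stale results don't linger.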
Recent research confirms that some items are requested very frequently in web
search engines (maybe there's a distribution like a Gaussian curve?), so I
think caching may make sense even if one user doesn't look at all the
returned pages; other users may. Even in custom web facilities tailored for a
particular business, you will probably find some preferred items that are
searched for again and again.
I will repeat that this doesn't mean FB devs should do nothing to expand
its capabilities, but a thing that tries to be everything for everybody ends
up being nothing. Enhancements usually try to honor generic and massive
requirements (like being friendlier to the web developer), but it's unlikely
that any one solution will satisfy everyone who wants the behavior to match
their app's requirements exactly.