Subject | RE: [Firebird-Architect] Inexact database operations |
---|---|
Author | Rick Debay |
Post date | 2005-06-30T17:46:36Z |
> Sure, but who's going to compute it?
The server. The point was that there are solutions to either calculate
the results in parallel, or cache them.
Extremely large search engines (Google, Yahoo) approximate by giving up
after a certain point and just performing the calculation with what they
have at hand. Since YMMV, I think it's up to the client to determine
what's close enough, and what constitutes a good approach to finding
that approximation.
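
As a rough illustration of "giving up after a certain point", here is a
minimal Python sketch of a capped count. The function name `capped_count`
and its `limit` parameter are made up for this example; nothing here is a
Firebird or search-engine API. The caller picks the limit, which matches
the idea that the client decides what is close enough.

```python
from itertools import islice


def capped_count(rows, limit):
    """Count the items in `rows`, giving up once `limit` is exceeded.

    Returns (count, exact). `exact` is False when the cap was hit, in
    which case `count` is only a lower bound on the true total.
    """
    # Look at no more than limit + 1 items, so that a dataset holding
    # exactly `limit` rows can be told apart from a larger one.
    seen = sum(1 for _ in islice(rows, limit + 1))
    if seen > limit:
        return limit, False
    return seen, True
```

A caller could render "about 100 or more" when `exact` is False and the
precise figure otherwise.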
Knowing just enough to be dangerous, one approach might be to take all
the appropriate indexes and count the number of items in the resulting
bitmap. If the value is < 40,000 (Google's threshold in Brin and Page's
original paper) calculate using count(*), else use the bitmap's value.
I don't know if the required APIs are exposed to the client, or to a
UDF.
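
To make that concrete, here is a hedged Python sketch of the idea. It
models each index bitmap as a plain set of record IDs and takes the exact
count as a callback, since I don't know what the engine would actually
expose. `approximate_count`, `index_bitmaps` and `exact_count` are
hypothetical names, and the sketch assumes the bitmap's population can
differ from what count(*) would return.

```python
THRESHOLD = 40_000  # the figure quoted above; callers may override it


def approximate_count(index_bitmaps, exact_count, threshold=THRESHOLD):
    """Return (count, exact) for the records matched by every bitmap.

    index_bitmaps: non-empty list of sets of record IDs, one per index
                   (a stand-in for the engine's real bitmaps).
    exact_count:   hypothetical callback that does the expensive
                   count(*)-style work on the candidate record IDs.
    """
    if not index_bitmaps:
        raise ValueError("at least one index bitmap is required")
    # AND the bitmaps together to get the candidate records.
    candidates = set.intersection(*index_bitmaps)
    estimate = len(candidates)
    if estimate < threshold:
        # Small enough: pay for the exact count.
        return exact_count(candidates), True
    # Too large: settle for the bitmap's population.
    return estimate, False
```

The returned flag tells the caller whether the number is exact, so the
decision about what is close enough stays with the developer.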
The other point was that if approximate is not well-defined (and only
the developer knows what that is for them) then any given number is
approximate.
-----Original Message-----
From: Firebird-Architect@yahoogroups.com
[mailto:Firebird-Architect@yahoogroups.com] On Behalf Of Nando Dessena
Sent: Thursday, June 30, 2005 4:04 AM
To: Firebird-Architect@yahoogroups.com
Subject: Re: [Firebird-Architect] Inexact database operations
Rick,
R> This has been brought up before. If it were implemented,
R> APPROXIMATE_COUNT would probably always return 42.
I'd try to be a little more pragmatic. Something that returns the order
of magnitude of the record count of a huge dataset has its uses.
AFAIK that's what the optimizer does to estimate relation cardinality.
R> There are other solutions to returning an approximate number in a web
R> application.
R> Off the top of my head, the value could be in the response after the
R> results, so that the results are rendered in the browser^H^H^H^H^H^H
R> user-agent, and the total rendered soon after.
R> If it's stateful (due to sessions for example), the value could be
R> updated after the response is sent.
In response? Sure, but who's going to compute it?
Ciao
--
Nando Dessena
http://www.flamerobin.org