Subject Re: Doubt with optimization in Query
Author diegodelafuente
Thanks to both for the explanation.

Rgds
Diego

--- In firebird-support@yahoogroups.com, Ann Harrison <aharrison@...> wrote:
>
> 2011/9/13 Svein Erling Tysvær <svein.erling.tysvaer@...>
>
> >
> >
> > Firebird doesn't have histograms yet, so it knows nothing about which field
> > values are common and which are rare when it chooses between using the index
> > of a probably reasonably selective field (I assume you have lots of records
> > in maestro_turnos and that 300K is only a small fraction)
> > ...
> > In the future, I think Firebird will implement histograms (maybe it is
> > already part of Firebird 3?), and then I guess it will make the right
> > decision in cases like yours.
> >
>
> In case there's someone out there wondering what histograms are in this
> context and what they have to do with optimization, let me elaborate a bit.
> Firebird keeps the selectivity of indexes and in the case of compound
> indexes, the selectivity of each field in the key. The selectivity is the
> number of distinct values compared to the total number of entries in the
> index. So, if you've got a table with 100,000 records and one of the
> fields has five distinct values, an index on that field would have a
> terrible selectivity. On the other hand, if 999,900 records had one value
> for the field and the other four values were distributed among the remaining
> 100 records, then for those four values, the index would have good
> selectivity. A histogram records the distribution of values in an index,
> so it's a much more sensitive way of determining the usefulness of an index,
> providing you know the values you're looking for.
>
> Different databases optimize queries at different times. Firebird optimizes
> its queries when they are presented so for joins and prepared queries with
> parameters, it doesn't know all the input values. Some other systems delay
> optimization until all parameters are known, so the inner part of a join is
> optimized for each new value from the outer part of the join and a
> parameterized query is optimized for each new value of the parameters.
> Systems like that can make great use of histograms, but at a significant
> cost for re-optimization.
>
> If I understood the query presented, it was looking for records in a
> specific range of dates. Range queries are particularly awkward to optimize
> based on selectivity. And the dates were explicit. So for that particular
> query, histograms would be a great help.
>
>
> Good luck,
>
> Ann
>
>
> [Non-text portions of this message have been removed]
>