Subject | Re: [firebird-support] Plan question |
---|---|
Author | Milan Tomeš - Position |
Post date | 2010-04-19T10:29:54Z |
Dne 19.04.2010 11:43, Svein Erling Tysv�r napsal(a):
steps are using data inserted in first step. I'm considering to
deactivate indexes during first step so inacurate selectivity of some
indexes will not be problem anymore. But that's not a solution for this
because this problem occurs later and data for this step is filled (and
index selectivity is affected) in same step.
and let's say (without any calculations) that 90% of all records in
STREETS have unique LINK_ID and ALTSTREETS = 'F' and other 5% have
duplicate LINK_ID with ALTSTREETS = 'T'.
Milan
[Non-text portions of this message have been removed]
>You thought right. My program do massive insert in first step. Following
> >> Which Firebird version are you using?
> >I'm currently using version 2.1.4.18270.
>
> OK, I think that means the optimizer knows about the selectivity of
> parts of the indexes, not only the index as a whole.
>
> >I think that NATURAL will be never used if there is any possible index
> >to use doesn't care about the selectivity of that index. I think
> >selectivity is taken into consideration only when more than 1 index
> >matches needs for query.
>
> Possibly (I don't know, I normally don't index Boolean columns), but
> you have two possible indexes to choose from.
>
> >> 2) The optimizer may not know that using an index on SRC_DIR_NAME,
> ID_CSYMBOL is >>equally good or better than an index using simply
> SRC_DIR_NAME (thinking about it, it
> >>is obvious, but if all you're doing is choosing between two indexes
> with a selectivity
> >>of 0.0000007 and 0.5, then you of course prefer the one with
> 0.0000007). If this is >>the case, you can try either:
>
> >And thats strange IMHO. There is an index where only a part (column
> >SRC_DIR_NAME) can be used and there is another index, that contains all
> >columns, which are used in where condition.
>
> Well, I was assuming Firebird 1.5, with 2.1 the optimizer knows about
> selectivity of parts of the index.
>
> >> a) Don't prepare until you have to (possibly even updating the
> statistics shortly
> >> before the prepare, I don't know whether that can help or not),
> hoping a better plan
> >> will be chosen when you have final data
>
> >That means I have to run SET STATISTIC query for this table after
> >previous step. And I don't like this idea. In my mind this type of query
> >belongs to database maintenance and not to standard client application.
>
> In general I would agree, that comment was because I thought your
> program might have done massive insert/update/delete between preparing
> the query and actually selecting the records. And massive
> modifications to the database affects the best plan for the optimizer
> to choose.
>
steps are using data inserted in first step. I'm considering to
deactivate indexes during first step so inacurate selectivity of some
indexes will not be problem anymore. But that's not a solution for this
because this problem occurs later and data for this step is filled (and
index selectivity is affected) in same step.
> >> Another thing you also could consider, is whether it is best toThere are around 4000 records with ID_CSYMBOL = 1360 in table STREETS
> start with SRC or STR >> first in the PLAN. It might be that FROM
> STREETS STR LEFT JOIN ZLEVELS SRC is a
> >> better choice (but that depends on your data, the optimizer thinks
> that it is best to
> >> start with SRC, but it has no knowledge about the selectivity of
> the values 1360 vs
> >> 2000, 2010 and C1, C4).
>
> >ZLEVELS - over 7 millions of records, STREETS - almost 3 millions. In
> >table ZLEVELS are only 2 distinct values of SRC_DIR_NAME - 'C1' and 'C4'
> >(at this time), around 40000 records with ID_CSYMBOL IN (2000, 2010).
>
> 40000 records isn't much, I expect there to be considerably more with
> STR.ID_CSYMBOL = 1360? How selective is the JOIN clause (primarily I'm
> interested in LINK_ID of both tables, I assume ALTSTREETS = 'F' to
> have poor selectivity)?
>
and let's say (without any calculations) that 90% of all records in
STREETS have unique LINK_ID and ALTSTREETS = 'F' and other 5% have
duplicate LINK_ID with ALTSTREETS = 'T'.
Milan
[Non-text portions of this message have been removed]