Subject | Re: Why "SELECT FIRST 1 ..." and "EXISTS ( SELECT * FROM ... )" are so slow in a big database? |
---|---|
Author | Ali Gökçen |
Post date | 2006-07-16T15:47:11Z |
Ok,
I have another idea if your application is time critical.
We can create another field with ideal index selectivity:

PV_ID = PV * 1000000000 + ID   -- you may prefer BIGINT for more zeros

You can fill this field from BEFORE INSERT and BEFORE UPDATE triggers
(a rough sketch follows below), then create a unique index on it:

CREATE UNIQUE INDEX UX_DV_PV_PV_ID ON DV_PV (PV_ID);  -- index name is just an example
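Something like this should work for the new column and the trigger (just a
sketch, not tested; I assume DV_PV has numeric PV and ID columns, and the
column and trigger names are only examples):

ALTER TABLE DV_PV ADD PV_ID BIGINT;

SET TERM ^ ;  -- isql statement terminator, so the trigger body can use ';'
CREATE TRIGGER DV_PV_BIU_PV_ID FOR DV_PV
ACTIVE BEFORE INSERT OR UPDATE
AS
BEGIN
  -- cast to BIGINT so the multiplication cannot overflow a 32-bit integer
  NEW.PV_ID = CAST(NEW.PV AS BIGINT) * 1000000000 + NEW.ID;
END^
SET TERM ; ^

(BEFORE INSERT OR UPDATE in a single trigger needs Firebird 1.5 or later;
on older versions use two separate triggers.)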
Then the delete becomes:
DELETE FROM PV_TYPE
WHERE PV = 2
  AND ( SELECT MIN(PV_ID) FROM DV_PV
        WHERE PV_ID BETWEEN 2000000000 AND 2999999999
      ) IS NULL
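If you don't want to hard-code the range for every PV value, the bounds can
be derived from PV with the same formula (again just a sketch; :PV stands
for whatever parameter syntax your interface uses):

DELETE FROM PV_TYPE
WHERE PV = :PV
  AND ( SELECT MIN(PV_ID) FROM DV_PV
        WHERE PV_ID BETWEEN CAST(:PV AS BIGINT) * 1000000000
                        AND CAST(:PV AS BIGINT) * 1000000000 + 999999999
      ) IS NULL

The intent is that MIN() can be answered from the first key of the unique
PV_ID index range instead of visiting every duplicate of PV = 2; check the
plan to confirm the index is really used that way.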
It shouldn't be slow, but it does have some cost.
Regards.
Ali
--- In firebird-support@yahoogroups.com, "Antti Nivala"
<antti.nivala@...> wrote:
> Ali Gökçen wrote:
> > DELETE FROM PV_TYPE
> > WHERE PV = 2 AND
> > ( SELECT min(PV) FROM DV_PV WHERE PV = 2 ) IS NULL
>
> Ali, I tried this with my database and unfortunately it does not
> run any faster. The plan is different as you said, but nevertheless
> it appears to be fetching the same amount of data from the index. The
> execution time varies between 350 ms and 550 ms on my computer, which
> is pretty much the same as about 400 ms with my original statement.
>
> I guess there is no way to circumvent fetching all matches to PV =
> 2 from the index as Dmitry said that the bitmap will be built from
> the index before going to records at all.
>
> Since 400 ms per delete is too much for our application, we will
> use a manual counter as suggested by Lester Caine. We actually
> already have such a field in the PV table for other purposes. One
> reason why I said that it is not an elegant solution is that our
> counter is not exact. It is just an approximation because update
> conflicts can prevent us from updating the counter in some cases. We
> don't want to retry the whole complex transaction because of that so
> we just ignore the failure of updating the counter. So, we cannot
> fully rely on the counter. But we can improve the performance a lot
> by using the counter to avoid the EXISTS check if the counter is
> fairly large (e.g. > 100 or > 1000). In those cases we just conclude
> that making a true existence check is not reasonable because it would
> take too much time.
>
> Adam asked if I have really measured the performance of using a
> NATURAL scan in this case. Yes, that is really too slow. We have
> millions of distinct PV values in the table so it easily takes FB to
> read a couple of million records before finding a match. Definitely
> too slow. So, the index is good there for most cases, but we have a
> few values (e.g. PV = 2) which have a large number of duplicates, and
> those cases turned out to be slow.
>
> Anyway, using a manual counter for avoiding the slow EXISTS checks
> gives good performance. Thanks for all suggestions.
>
> Antti