Subject | Re: [firebird-support] Optimizing a query of live/dead persons |
---|---|
Author | Svein Erling Tysvaer |
Post date | 2007-03-09T11:30:53Z |
Your second approximation won't help you (well, unless you use
expression indexes in Firebird 2.0) since Birthdate+55000 doesn't use
any index.
You may do
select count(*) from persons
where BirthDate between :MyDate-55000 and :MyDate
and coalesce(DeathDate, CURRENT_DATE) >= :MyDate
//(the main change is to put 55000 on the other side of the comparison)
but I am uncertain whether using an index will help at all, since I
assume that the majority of the people in your database were born after
1850 and that they would be eliminated when checking for DeathDate.
Though if you are using Firebird 2.0, it could be worth indexing
'coalesce(DeathDate, CURRENT_DATE)' if it is legal (I've never tried and
mostly use Fb 1.5 myself).
I guess an alternate way of doing this, would be to have another table
which contained the pk of persons and AliveYear and then do something like:
select count(*)
from persons p
join Alive a on p.id = a.id
where p.BirthDate between :MyDate-55000 and :MyDate
and coalesce(p.DeathDate, CURRENT_DATE) >= :MyDate
and a.AliveYear = :YearOfMyDate
Though if your person table is big already, then AliveStatus could
become very big. E.g. since I was born in 1970, I would occupy 38 rows
in Alive and I would guess that the table would contain at least 50
times as many records as Persons.
Set
-a Firebird Foundation member
-http://www.firebirdsql.org/index.php?op=ffoundation
kokok_kokok wrote:
expression indexes in Firebird 2.0) since Birthdate+55000 doesn't use
any index.
You may do
select count(*) from persons
where BirthDate between :MyDate-55000 and :MyDate
and coalesce(DeathDate, CURRENT_DATE) >= :MyDate
//(the main change is to put 55000 on the other side of the comparison)
but I am uncertain whether using an index will help at all, since I
assume that the majority of the people in your database were born after
1850 and that they would be eliminated when checking for DeathDate.
Though if you are using Firebird 2.0, it could be worth indexing
'coalesce(DeathDate, CURRENT_DATE)' if it is legal (I've never tried and
mostly use Fb 1.5 myself).
I guess an alternate way of doing this, would be to have another table
which contained the pk of persons and AliveYear and then do something like:
select count(*)
from persons p
join Alive a on p.id = a.id
where p.BirthDate between :MyDate-55000 and :MyDate
and coalesce(p.DeathDate, CURRENT_DATE) >= :MyDate
and a.AliveYear = :YearOfMyDate
Though if your person table is big already, then AliveStatus could
become very big. E.g. since I was born in 1970, I would occupy 38 rows
in Alive and I would guess that the table would contain at least 50
times as many records as Persons.
Set
-a Firebird Foundation member
-http://www.firebirdsql.org/index.php?op=ffoundation
kokok_kokok wrote:
> I have a person table with "Birth date" and "Death date" fields.
>
> I need to know how many people were living within a period of time.
>
> For example, to know the total persons in march 2004 I do:
>
> select count(*) from persons where BirthDate<='31MAR04' and (DeathDate
> is null or DeathDate>='1MAR04'
>
>
> It works fine but it is not selective, it must read a lot of rows
> because the index is based on "BirthDate" and the filter is
> BirthDate<='31MAR04'.
> It means that if I have 10 million of people of the year 1850,
> Firebird reads 10 millions of records and none is counted because
> probably all are dead.
>
> My second approximation is:
>
> select count(*) from persons where (BirthDate<='31MAR04' and
> BirthDate+55000 > `1MAR04 ) and (DeathDate is null or DeathDate>='1MAR04'
>
>
> Here I suppose that a person can live a maximum of 150 years (55000
> days) and then I minimize the number of read records. In any case, I
> continue reading a lot of no util records.
>
>
> I cannot create an inventory table based on the total of persons per
> day because sometimes I need to use additional filters as "cuontry='US'".
>
>
> Somebody has had the same scenario? Any clue? How can I optimize my query?
>
> Thank you