Subject | Optimizing a query of live/dead persons |
---|---|
Author | kokok_kokok |
Post date | 2007-03-09T10:29:17Z |
I have a person table with "Birth date" and "Death date" fields.
I need to know how many people were living within a period of time.
For example, to know the total persons in march 2004 I do:
select count(*) from persons where BirthDate<='31MAR04' and (DeathDate
is null or DeathDate>='1MAR04'
It works fine but it is not selective, it must read a lot of rows
because the index is based on "BirthDate" and the filter is
BirthDate<='31MAR04'.
It means that if I have 10 million of people of the year 1850,
Firebird reads 10 millions of records and none is counted because
probably all are dead.
My second approximation is:
select count(*) from persons where (BirthDate<='31MAR04' and
BirthDate+55000 > `1MAR04 ) and (DeathDate is null or DeathDate>='1MAR04'
Here I suppose that a person can live a maximum of 150 years (55000
days) and then I minimize the number of read records. In any case, I
continue reading a lot of no util records.
I cannot create an inventory table based on the total of persons per
day because sometimes I need to use additional filters as "cuontry='US'".
Somebody has had the same scenario? Any clue? How can I optimize my query?
Thank you
I need to know how many people were living within a period of time.
For example, to know the total persons in march 2004 I do:
select count(*) from persons where BirthDate<='31MAR04' and (DeathDate
is null or DeathDate>='1MAR04'
It works fine but it is not selective, it must read a lot of rows
because the index is based on "BirthDate" and the filter is
BirthDate<='31MAR04'.
It means that if I have 10 million of people of the year 1850,
Firebird reads 10 millions of records and none is counted because
probably all are dead.
My second approximation is:
select count(*) from persons where (BirthDate<='31MAR04' and
BirthDate+55000 > `1MAR04 ) and (DeathDate is null or DeathDate>='1MAR04'
Here I suppose that a person can live a maximum of 150 years (55000
days) and then I minimize the number of read records. In any case, I
continue reading a lot of no util records.
I cannot create an inventory table based on the total of persons per
day because sometimes I need to use additional filters as "cuontry='US'".
Somebody has had the same scenario? Any clue? How can I optimize my query?
Thank you