Subject | Query Optimization (Tables or Views) |
---|---|
Author | Maurice Ling |
Post date | 2006-07-13T00:30:55Z |
Hi all,
I have a table which acts like a term-list defined index
create table jclone_occurrence_pmid(
pmid numeric(10,0) not null,
clone char(128) not null);
There are 18 million rows of records in this table.
My task now is to answer this question: what is the set of pmids that
has clone A and clone B?
So I ran a query:
select pmid from jclone_occurrence_pmid where clone = 'ACT' and pmid
in (select pmid from jclone_occurrence_pmid where clone = 'ACE')
This query took me 2 hours to run. The problem is, I have 13 million
of these queries, means it will take 3000 years to run. clearly
unacceptable.
So, I thought that part of the reason was that the query had to do a
self join (please tell me if I am wrong) which brings me to create
either a view or a cache table. Can anyone please kindly explain the
nature of a view in FB and how it helps?
DDL for the view:
create view jclone_occur_pair_view (clone1, clone2, pmid) as
select a.clone, b.clone, a.pmid
from jclone_occurrence_pmid a, jclone_occurrence_pmid b
where a.pmid=b.pmid;
DDL for the table:
create table jclone_occur_pair(
clone1 char(128) not null,
clone2 char(128) not null,
pmid numeric(10,0) not null);
"Select count(*) from jclone_occur_pair_view" gives me about 478
million rows, which is the same as doing an inner join "select
count(*) from jclone_occurrence_pmid a inner join
jclone_occurrence_pmid b on a.pmid=b.pmid".
But when I try to populate jclone_occur_pair table with the following SQL,
insert into jclone_occur_pair (clone1, clone2, pmid) select a.clone,
b.clone, a.pmid from jclone_occurrence_pmid a, jclone_occurrence_pmid
b where a.pmid=b.pmid;
commit;
My database ballooned from 6GB to 40GB and for some weird reasons,
died... isql just disappeared from process, using about 260MB of
memory (VSIZE in top). I do not eliminate the possibility of killing
the wrong process but unlikely.
So, is there any suggestions to speed up this query?
Thanks and cheers
maurice
I have a table which acts like a term-list defined index
create table jclone_occurrence_pmid(
pmid numeric(10,0) not null,
clone char(128) not null);
There are 18 million rows of records in this table.
My task now is to answer this question: what is the set of pmids that
has clone A and clone B?
So I ran a query:
select pmid from jclone_occurrence_pmid where clone = 'ACT' and pmid
in (select pmid from jclone_occurrence_pmid where clone = 'ACE')
This query took me 2 hours to run. The problem is, I have 13 million
of these queries, means it will take 3000 years to run. clearly
unacceptable.
So, I thought that part of the reason was that the query had to do a
self join (please tell me if I am wrong) which brings me to create
either a view or a cache table. Can anyone please kindly explain the
nature of a view in FB and how it helps?
DDL for the view:
create view jclone_occur_pair_view (clone1, clone2, pmid) as
select a.clone, b.clone, a.pmid
from jclone_occurrence_pmid a, jclone_occurrence_pmid b
where a.pmid=b.pmid;
DDL for the table:
create table jclone_occur_pair(
clone1 char(128) not null,
clone2 char(128) not null,
pmid numeric(10,0) not null);
"Select count(*) from jclone_occur_pair_view" gives me about 478
million rows, which is the same as doing an inner join "select
count(*) from jclone_occurrence_pmid a inner join
jclone_occurrence_pmid b on a.pmid=b.pmid".
But when I try to populate jclone_occur_pair table with the following SQL,
insert into jclone_occur_pair (clone1, clone2, pmid) select a.clone,
b.clone, a.pmid from jclone_occurrence_pmid a, jclone_occurrence_pmid
b where a.pmid=b.pmid;
commit;
My database ballooned from 6GB to 40GB and for some weird reasons,
died... isql just disappeared from process, using about 260MB of
memory (VSIZE in top). I do not eliminate the possibility of killing
the wrong process but unlikely.
So, is there any suggestions to speed up this query?
Thanks and cheers
maurice