Subject | Query help |
---|---|
Author | Kjell Rilbe |
Post date | 2012-08-22T10:10:50Z |
Hi,
I have a table (DDL and index stats below) with about 180 million
records. In this table I want to find those that satisfy a simple int
condition ("ÄrNull" = 1), have the same value in "Hållare", and overlap
in time, considering the two timestamp columns "Inkom" ("WasAdded") and
"TogsBort" ("WasRemoved"). I need the query to be efficient, considering
the large number of records.
"Inkom" is set when the record is inserted, and "TogsBort" is null at
this time. At some later instant, the record is marked as no longer
current by entering a timestamp into "TogsBort".
"Borttagsuppdatering" is always set at the same time as "TogsBort" and
references a PK in a different table. So "TogsBort" is null if anf only
if "Borttagsuppdatering" is null. This is relevant, because
"Borttagsuppdatering" is indexed, while "TogsBort" isn't.
The idea is that at any given time (considering "Inkom" and "TogsBort"),
there should exist no more than one record per "Hållare" value, that
also has "ÄrNull" = 1. To complicate things, each record may have
"RelateradHållare" set instead of (or in addition to) "Hållare", but
those two cases can be considered as disjoint, and be examined with two
seperate queries.
So, something like this:
select "ECO_ID"
from "Uppgift" U
where U."ÄrNull" = 1
and exists (
select 1
from "Uppgift" U2
where U2."Hållare = U."Hållare"
and U2."ÄrNull" = 1
and U2."Inkom" < coalesce(U."TogsBort", '2012-12-31')
and U."Inkom" < coalesce(U2."TogsBort", '2012-12-31')
)
But I suspect this will execute "forever".
Can anyone come up with something more efficient?
CREATE TABLE "Uppgift"
(
"ECO_ID" bigint NOT NULL,
"Booleanvärde" integer DEFAULT null,
"Datumvärde" timestamp DEFAULT null,
"ECO_TYPE" smallint NOT NULL,
"Förrådsvärde" bigint,
"Heltalsvärde" integer DEFAULT null,
"Hållare" bigint,
"Inkom" timestamp DEFAULT current_timestamp NOT NULL,
"Länkobjekt" bigint,
"RelateradHållare" bigint,
"SenastBekräftad" timestamp DEFAULT null,
"Textvärde" varchar(255) DEFAULT '',
"TogsBort" timestamp DEFAULT null,
"Uppgiftsägare" bigint,
"ÄrNull" integer DEFAULT 0 NOT NULL,
"Borttagsuppdatering" bigint,
"Tilläggsuppdatering" bigint,
"SenasteÄgaruppdaterandKC_" bigint,
"GodkändFörPublicering" integer DEFAULT 1 NOT NULL,
"GodkännandeUppdatering" bigint,
CONSTRAINT "IX_PK_Uppgift" PRIMARY KEY ("ECO_ID")
-- Selectivity = 0.000000
);
CREATE INDEX "IX_Uppgift_BorttagsuppA5K" ON "Uppgift"
("Borttagsuppdatering");
-- Selectivity = 0.000012, probably about 10-30% nulls.
CREATE INDEX "IX_Uppgift_Förrådsvärde" ON "Uppgift" ("Förrådsvärde");
-- Selectivity = 0.000283, probably about 10-30% nulls.
CREATE INDEX "IX_Uppgift_GodkännandeC4Z" ON "Uppgift"
("GodkännandeUppdatering");
-- Selectivity = 0.007937, most are nulls.
CREATE INDEX "IX_Uppgift_Hållare" ON "Uppgift" ("Hållare");
-- Selectivity = 0.000000, probably about 20% nulls.
CREATE INDEX "IX_Uppgift_Länkobjekt" ON "Uppgift" ("Länkobjekt");
-- Selectivity = 0.000151, most are nulls.
CREATE INDEX "IX_Uppgift_RelateradHåIS0" ON "Uppgift" ("RelateradHållare");
-- Selectivity = 0.000000, probably about 80% nulls.
CREATE INDEX "IX_Uppgift_SenasteÄgar7K5" ON "Uppgift"
("SenasteÄgaruppdaterandKC_");
-- Selectivity = 0.000009, probably a lot of nulls.
CREATE INDEX "IX_Uppgift_TilläggsuppYNC" ON "Uppgift"
("Tilläggsuppdatering");
-- Selectivity = 0.000010, should be no nulls.
CREATE INDEX "IX_Uppgift_Uppgiftsägare" ON "Uppgift" ("Uppgiftsägare");
-- Selectivity = 0.200000, not sure how many nulls.
Kjell
--
--------------------------------------
Kjell Rilbe
DataDIA AB
E-post: kjell@...
Telefon: 08-761 06 55
Mobil: 0733-44 24 64
I have a table (DDL and index stats below) with about 180 million
records. In this table I want to find those that satisfy a simple int
condition ("ÄrNull" = 1), have the same value in "Hållare", and overlap
in time, considering the two timestamp columns "Inkom" ("WasAdded") and
"TogsBort" ("WasRemoved"). I need the query to be efficient, considering
the large number of records.
"Inkom" is set when the record is inserted, and "TogsBort" is null at
this time. At some later instant, the record is marked as no longer
current by entering a timestamp into "TogsBort".
"Borttagsuppdatering" is always set at the same time as "TogsBort" and
references a PK in a different table. So "TogsBort" is null if anf only
if "Borttagsuppdatering" is null. This is relevant, because
"Borttagsuppdatering" is indexed, while "TogsBort" isn't.
The idea is that at any given time (considering "Inkom" and "TogsBort"),
there should exist no more than one record per "Hållare" value, that
also has "ÄrNull" = 1. To complicate things, each record may have
"RelateradHållare" set instead of (or in addition to) "Hållare", but
those two cases can be considered as disjoint, and be examined with two
seperate queries.
So, something like this:
select "ECO_ID"
from "Uppgift" U
where U."ÄrNull" = 1
and exists (
select 1
from "Uppgift" U2
where U2."Hållare = U."Hållare"
and U2."ÄrNull" = 1
and U2."Inkom" < coalesce(U."TogsBort", '2012-12-31')
and U."Inkom" < coalesce(U2."TogsBort", '2012-12-31')
)
But I suspect this will execute "forever".
Can anyone come up with something more efficient?
CREATE TABLE "Uppgift"
(
"ECO_ID" bigint NOT NULL,
"Booleanvärde" integer DEFAULT null,
"Datumvärde" timestamp DEFAULT null,
"ECO_TYPE" smallint NOT NULL,
"Förrådsvärde" bigint,
"Heltalsvärde" integer DEFAULT null,
"Hållare" bigint,
"Inkom" timestamp DEFAULT current_timestamp NOT NULL,
"Länkobjekt" bigint,
"RelateradHållare" bigint,
"SenastBekräftad" timestamp DEFAULT null,
"Textvärde" varchar(255) DEFAULT '',
"TogsBort" timestamp DEFAULT null,
"Uppgiftsägare" bigint,
"ÄrNull" integer DEFAULT 0 NOT NULL,
"Borttagsuppdatering" bigint,
"Tilläggsuppdatering" bigint,
"SenasteÄgaruppdaterandKC_" bigint,
"GodkändFörPublicering" integer DEFAULT 1 NOT NULL,
"GodkännandeUppdatering" bigint,
CONSTRAINT "IX_PK_Uppgift" PRIMARY KEY ("ECO_ID")
-- Selectivity = 0.000000
);
CREATE INDEX "IX_Uppgift_BorttagsuppA5K" ON "Uppgift"
("Borttagsuppdatering");
-- Selectivity = 0.000012, probably about 10-30% nulls.
CREATE INDEX "IX_Uppgift_Förrådsvärde" ON "Uppgift" ("Förrådsvärde");
-- Selectivity = 0.000283, probably about 10-30% nulls.
CREATE INDEX "IX_Uppgift_GodkännandeC4Z" ON "Uppgift"
("GodkännandeUppdatering");
-- Selectivity = 0.007937, most are nulls.
CREATE INDEX "IX_Uppgift_Hållare" ON "Uppgift" ("Hållare");
-- Selectivity = 0.000000, probably about 20% nulls.
CREATE INDEX "IX_Uppgift_Länkobjekt" ON "Uppgift" ("Länkobjekt");
-- Selectivity = 0.000151, most are nulls.
CREATE INDEX "IX_Uppgift_RelateradHåIS0" ON "Uppgift" ("RelateradHållare");
-- Selectivity = 0.000000, probably about 80% nulls.
CREATE INDEX "IX_Uppgift_SenasteÄgar7K5" ON "Uppgift"
("SenasteÄgaruppdaterandKC_");
-- Selectivity = 0.000009, probably a lot of nulls.
CREATE INDEX "IX_Uppgift_TilläggsuppYNC" ON "Uppgift"
("Tilläggsuppdatering");
-- Selectivity = 0.000010, should be no nulls.
CREATE INDEX "IX_Uppgift_Uppgiftsägare" ON "Uppgift" ("Uppgiftsägare");
-- Selectivity = 0.200000, not sure how many nulls.
Kjell
--
--------------------------------------
Kjell Rilbe
DataDIA AB
E-post: kjell@...
Telefon: 08-761 06 55
Mobil: 0733-44 24 64