Subject | Re: [ib-support] Re: Why is Interbase so slow? |
---|---|
Author | Carl van Tast |
Post date | 2001-09-29T21:39:25Z |
Hi John, Helen, Ann et al.
Sorry for the long message.
Sorry for the long message.
On Tue, 25 Sep 2001 18:36:01 +0000 (UTC), john@... wrote:
>[...]
>
>UPDATE ENROL E
>SET REVENUE_NO = (
> SELECT REVENUE_NO
> FROM REVENUE_FACULTY RF, STUDENT S
> WHERE S.STUD_NO = E.stud_no
> AND RF.CRT_NO = E.crt_no
> AND RF.OS = S.OS)
>[...]
>This took about 2 hours to run...
>
>What am I doing wrong?
I don't know, but out of curiosity I made a few tests. For those
interested I post my results here. The tests were run with Interbase
6.0 (referred to as IB in this post), PostgreSQL 7.1.3 (PG) and two
prot^Hper^H^Hrilit^H^H... , well, commercial database servers (M7 and
O7).
Hardware and OS:
P II 300MHz, 128MB, W2K Pro for IB, PG, M7
AMB K6 180MHZ, 256MB, NT Server for O7
All database servers installed out of the box, no special tuning (see
note at end).
All times are in minutes and seconds, first value without indices,
second value with indices.
John did not tell us his metadata, so I had to guess:
CREATE TABLE enrol (
stud_no INTEGER NOT NULL,
crt_no INTEGER NOT NULL,
revenue_no INTEGER
/*,
UNIQUE (stud_no, crt_no)
*/
);
CREATE TABLE revenue_faculty (
revenue_no INTEGER NOT NULL,
crt_no INTEGER NOT NULL,
os INTEGER NOT NULL
/*,
PRIMARY KEY (revenue_no),
UNIQUE (crt_no, os)
*/
);
CREATE TABLE student (
stud_no INTEGER NOT NULL,
os INTEGER NOT NULL
/*,
PRIMARY KEY (stud_no)
*/
);
The tables were populated with following stored procedures:
SET TERM | ;
CREATE PROCEDURE enrol_fill
AS
DECLARE VARIABLE stud INTEGER;
DECLARE VARIABLE crt INTEGER;
DECLARE VARIABLE cnt INTEGER;
BEGIN
stud = 0;
crt = 0;
cnt = 0;
WHILE (cnt < 10000) DO BEGIN
cnt = cnt + 1;
stud = stud + 1;
IF (stud > 7001) THEN BEGIN
stud = 1;
END
crt = crt + 1;
IF (crt > 10) THEN BEGIN
crt = 1;
END
INSERT INTO enrol (stud_no, crt_no) VALUES (:stud, :crt);
END
END|
CREATE PROCEDURE student_fill
AS
DECLARE VARIABLE stud INTEGER;
DECLARE VARIABLE os INTEGER;
DECLARE VARIABLE cnt INTEGER;
BEGIN
stud = 0;
os = 0;
cnt = 0;
WHILE (cnt < 7000) DO BEGIN
cnt = cnt + 1;
stud = stud + 1;
os = os + 1;
IF (os > 3) THEN BEGIN
os = 1;
END
INSERT INTO student (stud_no, os) VALUES (:stud, :os);
END
END|
CREATE PROCEDURE revenue_faculty_fill
AS
BEGIN
INSERT INTO revenue_faculty VALUES (123456, 1, 1);
INSERT INTO revenue_faculty VALUES (123457, 2, 2);
INSERT INTO revenue_faculty VALUES (123458, 3, 1);
INSERT INTO revenue_faculty VALUES (123459, 4, 2);
INSERT INTO revenue_faculty VALUES (123460, 5, 3);
INSERT INTO revenue_faculty VALUES (123461, 6, 1);
INSERT INTO revenue_faculty VALUES (123462, 7, 2);
INSERT INTO revenue_faculty VALUES (123463, 8, 1);
INSERT INTO revenue_faculty VALUES (123464, 9, 3);
INSERT INTO revenue_faculty VALUES (123465, 10, 2);
END|
SET TERM ; |
EXECUTE PROCEDURE enrol_fill;
-- IB: 0:01.0 / 0:02
-- PG: 0:03.5 / 0:04.7
-- M7: 0:06 / 0:06
-- O7: 0:41 / 1:41
EXECUTE PROCEDURE student_fill;
-- IB: 0:00.9 / 0:01.4
-- PG: 0:02.5 / 0:03.3
-- M7: 0:04 / 0:04
-- O7: 0:36 / 0:57
EXECUTE PROCEDURE revenue_faculty_fill;
-- IB, PG, M7: instantly
Note: Without explicit BEGIN/COMMIT TRANSACTION times jump up to 1:55
for enrol_fill resp. 1:20 for student_fill in M7, +/- a few seconds
for with/without index.
-- test:
SELECT count(*)
FROM enrol e, revenue_faculty rf, student s
WHERE s.stud_no = e.stud_no
AND rf.crt_no = e.crt_no
AND rf.os = s.os;
-- correct answer is 3333
-- IB: 0:01.6 / 0:01.6
-- PG: 0:03.1 / 0:01.7
-- M7: 0:00.9 / 0:01
-- O7: 0:03.4 / 0:01.8
-- John's update:
UPDATE enrol e
SET revenue_no = (
SELECT revenue_no
FROM revenue_faculty rf, student s
WHERE s.stud_no = e.stud_no
AND rf.crt_no = e.crt_no
AND rf.os = s.os);
-- IB: 66:00 / 66:00
-- PG: 4:40 / 0:05.4
-- M7: 2:16 / 0:02
-- O7: >>180 / 0:35
UPDATE enrol
SET revenue_no = rf.revenue_no
FROM revenue_faculty rf, student s
WHERE s.stud_no = enrol.stud_no
AND rf.crt_no = enrol.crt_no
AND rf.os = s.os;
-- IB: doesn't work
-- PG: 0:03.4 / 0:02.5
-- M7: 0:01 / 0:01
-- O7: doesn't work
-- Helen's procedure
SET TERM | ;
CREATE PROCEDURE update_enrol
AS
DECLARE VARIABLE vrevenue_no INTEGER;
DECLARE VARIABLE vcrt_no INTEGER;
DECLARE VARIABLE vstud_no INTEGER;
BEGIN
FOR SELECT rf.revenue_no,
rf.crt_no,
s.stud_no
FROM revenue_faculty rf
JOIN student s
ON rf.os = s.os
INTO :vrevenue_no, :vcrt_no, :vstud_no DO
BEGIN
UPDATE enrol
SET revenue_no = :vrevenue_no
WHERE stud_no = :vstud_no
AND crt_no = :vcrt_no;
END
END|
SET TERM ; |
EXECUTE PROCEDURE update_enrol;
-- IB: 50:00
This procedure has to update 3333 rows, but
SELECT ... FROM rf JOIN s ...
selects 23334 rows, so on average each row is updated seven times. I
did no more tests with this procedure and used the following instead:
SET TERM | ;
CREATE PROCEDURE update_enrol2
AS
DECLARE VARIABLE vrevenue_no INTEGER;
DECLARE VARIABLE vcrt_no INTEGER;
DECLARE VARIABLE vstud_no INTEGER;
BEGIN
FOR SELECT rf.revenue_no,
rf.crt_no,
s.stud_no
FROM revenue_faculty rf
JOIN student s
ON rf.os = s.os
JOIN enrol e
ON e.stud_no=s.stud_no AND e.crt_no=rf.crt_no
INTO :vrevenue_no, :vcrt_no, :vstud_no DO
BEGIN
UPDATE enrol
SET revenue_no = :vrevenue_no
WHERE stud_no = :vstud_no
AND crt_no = :vcrt_no;
END
END|
SET TERM ; |
EXECUTE PROCEDURE update_enrol2;
-- IB: 7:30 / 0:02.6
-- PG: 6:25 / 0:07
-- M7: 8:41 / 0:04
-- O7: 5:29 / 0:11
Again you have to include BEGIN/COMMIT TRANSACTION for M7. Execution
time would be 9:33 / 0:41 without.
Re tuning:
On Thu, 27 Sep 2001 17:34:41 +0000 (UTC), aharrison@...
("Ann W. Harrison") wrote:
>> > >DATABASE_CACHE_PAGES 75
>
>I'd strongly suggest increasing this to at least 2500 if you're
>running SS. You may see better performance at 5000.
Well, after setting this parameter to 2500 and restarting IB I
repeated my tests and I got *exactly* the same times. This can be
explained by the fact that the machine is almost 100% CPU bound during
these tests. Disk IO is not an issue here. Or did I do something
wrong?
Conclusions:
. Both open source databases offer good performance when used
properly.
. The more modern commercial product seems to be faster with
complex SQL statements. (The other one is almost ten years old.)
. IB and PG have really fast stored procedure execution engines.
. Data design and choice of algorithm have great influence on
performance. Using the same hardware and software, you can do the
same job in 3 seconds or in more than 1 hour.
. UPDATE ... FROM would help a lot.
Please don't flame me for doing my tests on different hardware.
Correct the results by a factor 2 or 3, if you want. IMHO conclusions
are still valid.
If I made a mistake or you think the times measured should be
interpreted differently, let me know. I'll happily correct this
document and make it available again, though not necessarily on this
mailing list; it's really too long for sending it more than once.
>
>Thanks in advance for any clues.
>
>John Newcombe
.02 by
Carl van Tast
>
>
>PS Sacrilege - I datapumped the entire database into a sql_server db -
>and ran the exact same query with no indexes and it took 2 secs!!
Not a sacrileg, just valuable information. But what kind of
sql_server?