Subject RE: [firebird-support] slow join?
Author Svein Erling Tysvær
Hi Kelly!

If every record in Parts have a match in ReliefD, then how can the result of a JOIN be any less than the total number of records in Parts? That's just a curious aside, and not important.

What's more important - why do you select that many records? In desktop databases like Paradox or Access it may not matter all too much whether you select entire tables, but client/server databases like Firebird will appear extremely slow (45 seconds to fill a grid is ridiculous for most applications). Normally, selecting a small subset of the records based on user criteria is superior by far, so unless all 300000 records are of interest, then don't select all of them.

Note that I'm not saying that you never should select many records, e.g. I do have a program that has to select 10-20000 records (out of over 15 million) for sending to about 30 places in my country, but those 10-20000 records will each be individually processed and checked by the recipients, it is not a mix of interesting and irrelevant records.

The big time difference that you observe, will normally be due to different PLANs being chosen (particularly if the only difference is the data). Though it may of course be other reasons as well, e.g. a large transaction gap forcing Firebird to evaluate lots of old versions of each record (transactions are important in Firebird). However, your query seems extremely simple and you operate on small tables, so it should not take too long at all.

Set

-----Original Message-----
From: firebird-support@yahoogroups.com [mailto:firebird-support@yahoogroups.com] On Behalf Of bwc3068
Sent: 26. september 2009 00:18
To: firebird-support@yahoogroups.com
Subject: [firebird-support] slow join?

Hi--

I'm by no means a SQl expert and I have a question about a query.

FB 2.1

Here's the query:

Select P.*, E.uniqueKey from Parts P
Inner Join ReliefD E
on (p.repairEntryKey = E.UniqueKey)
order by
P.UniqueKey

On a database with 44,000 ReliefD records and 321,000 Parts records (the result of the query is about 300,000 records) it resolves (or displays the first page of results in the grid) in a matter of 45ish seconds.

On a different database file with 41,000 ReliefD records and 422,000 Parts records (the result of the query is 421,000 records) it takes 25 minutes to show the first page of results in the grid.

This is on the same PC with the exact same stuff running in the background (not much).

Same indexs, same tables, same records, etc. Just different data.

There is a P.RepairEntryKey for every record in Parts and that field is indexed.

There is a E.UniqueKey for every record in ReliefD and that field is the primary key.

There is a P.UniqueKey for every record in Parts and that is the primary key of that table.

And every record in the Parts table does have a record in ReliefD BUT the reverse is not true (not all ReliefD records have Parts records)

Thoughts on the speed difference? Could it really be 325,000 vs. 425,000 records his some sort of big threshold?

TIA!!!
Kelly