Subject RE: [firebird-support] one query works, the other never comes back
Author Svein Erling Tysvær
>Hi,
>
>I have a problem where a select query works on one user's data but not another's. (Firebird server 2.5.1.26351)
>Strangely, if I reverse the join and from statements, the query works for both - why?
>
>This query fails for one user: (by fail, I mean it never returns)
>
>Select TRANSACTIONS.INVOICENUMBER, TRANSACTIONS.ITEMNUMBER
>From TRANSACTIONS
>LEFT JOIN COMMISSIONS ON (COMMISSIONS.INVOICENUMBER=TRANSACTIONS.INVOICENUMBER)
> AND (COMMISSIONS.ITEMNUMBER=TRANSACTIONS.ITEMNUMBER)
>WHERE (Commissions.Status IS NOT NULL)
> AND (Commissions.InvoiceNumber = '-2')
>
>This one works for that same user:
>Select TRANSACTIONS.INVOICENUMBER, TRANSACTIONS.ITEMNUMBER
>From COMMISSIONS
>LEFT JOIN TRANSACTIONS ON (TRANSACTIONS.INVOICENUMBER=COMMISSIONS.INVOICENUMBER)
> AND (TRANSACTIONS.ITEMNUMBER=COMMISSIONS.ITEMNUMBER)
>WHERE (Commissions.Status IS NOT NULL)
> AND (Commissions.InvoiceNumber = '-2')
>
>The plan is similar, only reversed.
>
>Performant query:
>PLAN JOIN (COMMISSIONS NATURAL, TRANSACTIONS NATURAL)
>
>Non-performant query:
>PLAN JOIN (TRANSACTIONS NATURAL, COMMISSIONS NATURAL)
>
>There is no physical relationship between these tables as they were migrated from Paradox.
>There is, however, a logical relationship where each commission record will have one or more transactions.
>
>With that in mind and your education on the plan doing a lookup on the right side, I can understand
>why it might be slower having the detail table on the left.
>
>Thanks - I now what to watch for and have some code to review with similar situations.
>
>P.S. I think, too, there's room for improvement by adding specific indexes. I thought the primary key would
>automatically provide an index, but according to these plans using NATURAL, it looks like that isn't the case.

Hmm, I think you need some Firebird education. Luckily, Firebird is simple, so this email hopefully gives you a start (though it does contain things that are more my opinion than facts).

Unfortunately (for learning purposes), Firebird is very different from Paradox, important concept in Firebird are transactions and datasets - which is great for multiuser access, whereas Paradox has more emphasis on tables, hardly any transactions at all and is OK for single user access. You've started well with querying several tables!

In Firebird, virtually everything happens within transactions and you should take care to commit them regularly (hard commit, not commit retaining) or you might get speed issues when many users modifies the same data "simultaneously" and one transaction is preventing the oldest active transaction from moving forward.

Primary keys ought to have no meaning. Often, this doesn't matter, but fields defined as primary keys are more troublesome to change if they e.g. are part of a foreign key. In such cases, even changing a field definition from CHAR(10) to CHAR(11) gives you challenges that you simply avoid by having meaningless keys.

Firebird does create indexes for primary keys automatically. The reason they're not used in your query is that they're neither Status, InvoiceNumber, nor ItemNumber.

Indexes should be created for selective fields referred to in JOIN or WHERE clauses. Firebird indexes are unidirectional, so in some cases (like fields where you use MAX) you would want to create two indexes, one ascending (default) and one descending. I typically use only one field per index, that keeps things simple. Multifield indexes are possible with Firebird, I rarely use them since they add complexity (understanding a plan becomes more difficult), typically only offer a small performance gain (10-20%?) and Firebird can use multiple indexes for the same table (tuple) within a query.

INNER JOINs are preferable over OUTER JOINs, OUTER JOINs put constraints on the optimizer, in your case with the non-performant query, it forces Firebird to first access the TRANSACTIONS table before considering COMMISSIONS. Moreover, I'd say your non-performant query is contradictory (give me all TRANSACTIONS, if COMMISSION has a match, then JOIN them otherwise return NULLs for the COMMISSION table, and, by the way, only give me the rows where COMMISSION has these values), typically you should not refer to the RIGHT table of a LEFT JOIN in your WHERE clause (there are exceptions like checking for the absence of matches, using OR to match either out of several tables or optimizing slow queries). Normally, a LEFT JOIN also gets a different result from INNER JOINs, or a LEFT JOIN with the order of tables reversed.

For your particular query, indexes for the following fields may be considered:

COMMISSIONS.INVOICENUMBER
COMMISSIONS.ITEMNUMBER
COMMISSIONS.STATUS
TRANSACTIONS.INVOICENUMBER
TRANSACTIONS.ITEMNUMBER

I guess I'd just add indexes for the INVOICENUMBER fields, since I assume the others to be pretty non-selective (could consider an index for STATUS if you typically run queries for rare statuses, but IS NOT NULL sounds very unselective). Also, I'd change your query to what Ann recommended (she's an 'ancestor of Firebird', having founded Groton Database Systems, later to become InterBase, in 1984) - well, except that I'd remove the parenthesis, they're not needed in your query.

HTH,
Set