Subject Re: [IBO] Join criteria in the WHERE clause
Author Helen Borrie
At 09:59 PM 10-01-01 -0600, you wrote:
>Without the LEFT it is very fast, however I only get records contained in
>both tables. What I need is all records in N even if there is not a match
in
>B (like the join in the WHERE clause produces).

I think it would be good to test the outputs to see if you are really
getting what you think you are getting. To get the set you describe here,
you need a LEFT OUTER JOIN, and the implicit join is certainly not capable
of getting that for you. The word LEFT on an explicit inner join doesn't
affect the result set from the select, it just helps to document the set
for you. The reason I suggested omitting it was the thought that possibly
the optimizer was (wrongly) eliminating the usefulness of an index because
it knows a rule that says "You can't use a composite index for a sort on
LEFT joins because of the potential occurrence of NULL in some sort column"
(i.e. wrong because inner joins don't result in NULL key columns). But
maybe "right" after all, because you have no primary keys and possibly
therefore nothing upon which to assume uniqueness or NOT NULLness...



>I think I understand. Are you saying leave the CO, YR, PARCEL fields in the
>tables so that they can be used for a WHERE clause but add a KY field that
>links the records together so that it can be used to optimize LEFT JOINs ? I
>initially thought you were saying, replace CO, YR, PARCEL with KY in all
>tables (i.e. extreme normalization).

No, I wouldn't leave the dependent fields in the tables, I would replace
them in all cases...sorry if my example misled you. My point about buttons
and Velcro is really a comment in defence of dropping the redundancy and
adding the extra table so that you have that unique CO, YR, PARCEL
combination in one place alone and being surrogated everywhere it appears,
using KY.

Your indexes wouldn't be in the tables where you currently have them -
repeated ad infinitum everywhere you have represented them - but in the
single table that supplies the surrogates.

Your SELECT * examples would be atypical of client/server. This is a more
likely scenario:

SELECT KH.KY, KH.CO, KH.YR, KH.PARCEL, N.FIRSTNAME, N.SURNAME, N.ADDRESS1.....,
B.CURRENTTOTAL....
FROM KEYHOLDER KH
JOIN NAME_ADDRESS N
ON KH.KY = N.KY
LEFT JOIN BILLING_DATA B
ON KH.KY = B.KY
WHERE KH.CO='Something' AND KH.YR=2000
ORDER BY KH.CO, KH.YR, KH.PARCEL

And - yeah - I do think you are making life hard for yourself by not having
primary keys. You seem to be creating a lot of "nominal" 1:1 relationships
across the tables, too. These are much easier and safer to manage if all
are keyed; and I don't have to remind you of the exigencies of maintaining
the integrity of these using composites for the linking "nominal" key. It
requires you to keep control of uniqueness and NOT NULLness yourself which
means you'll be looking at a lot more constraints in the database and
validation of input data to ensure that you'll have a unique identifier for
every row.

I think you need to examine the other indexes on your "nominal key" columns
and find out which one(s) are creating ambiguities. There are SQL "fudges"
to block such ambiguities and force the optimizer to select the index you
want it to use - you do this by setting nonsense WHERE criteria like

WHERE
INDEXED_COLUMN-0=INDEXED_COLUMN

Where you have composite indexes, in IB you don't generally need
same-directional indices on the other columns as well. The optimizer can
use individual or pairs of the columns from the comp. index. I'd prefer to
avoid "fudging" to influence the plan, and tidy up my index collection
instead, but there are others far more august than I who swear by it. Our
beloved Claudio is one and he can even give you better name than "fudging"
for it.<g>

Enough of that now as it is going way off topic for IBO and I have much to
do and very little time to do it. Anyway, I need a whiteboard to do this
justice. :))

I have two very fave books on RDBMS structure and SQL. Look them up on the
IBO website if you are interested in a practical-academic kind of way. One
is Simsion's "Data Modeling Essentials: Analysis, Design and Innovation",
the other is Melton and Simon, "Understanding the New SQL" which, thanks to
IB's high degree of conformity to standard, is very pertinent. (Ok,
Claudio, I hear you shouting "But don't rely on IB to be standard about
NULLs in ordering sequences!!" IB is quite naughty in this regard).

H.

All for Open and Open for All
InterBase Developer Initiative ยท http://www.interbase2000.org
_______________________________________________________