firebird-support - RE: [firebird-support] Query Problem (Resolved + New Question)

Subject	RE: [firebird-support] Query Problem (Resolved + New Question)
Author	Helen Borrie
Post date	2006-08-28T00:30:24Z

At 09:15 AM 28/08/2006, you wrote:

>Helen and Alan,
>
>Thank you so much for the responses. I was able to fix my issue by going
>about the query a bit different and discovered some problems with my
>database design.

Be certain that the issue is actually fixed. I hope you have a
testbed on which you can verify that you are getting all and only the
results you want.

>Alan - I am using a Date datatype...please let me know what you were
>thinking about for my own edification.
>
>Helen - yes, sorry, I actually removed the BILLINGID field from the select
>when I submitted my problem here (and forgot to remove it from the group by)
>as I was thinking it was extraneous to my question and I thought it would
>make it clearer. Obviously that was a stupid thing and I won't do it again.
>:p (AMOUNT is in the select however, in a calculation for a field I named
>"EXTENDED" in the select.)
>
>In regards to order of GROUP BY, please help me to understand how this
>should work generally. I've noticed that the database wants me to include
>any non-aggregated fields in the select list but I don't understand why? It
>seems to me that I should only have to GROUP BY BILLINGITEMID for what I was
>trying to do.
>
>In regards to DISTINCT, yes I see I don't need that, I'm not sure why I had
>it. However it shouldn't negatively affect my results though right?

Depends on the results you're after. DISTINCT applies to all fields
in the outer SELECT, so duplicates returned to the output (in this
case, the intermediate set that the GROUP BY will operate on) will be
missing. I haven't time at the moment to plough through your logic
to work out whether it affects this particular query, though I think
not, since I'm supposing that BillingItemID is the PK of the table,
i.e. there's no possibility that it would NOT be distinct by
definition on the "left side" of a left join or (as in your revised
query) by being simply in the SELECT list.

That apart, it's rarely (if ever ???) appropriate to use DISTINCT for
output that you're going to apply GROUP BY to. I leave you to figure
out why. :-) I'm sure it's not what you wanted there.

It's not what you want here, either (assuming BillingItemID is the
PK). However, the engine will do it anyway, and it's a very costly
piece of non-necessity.

>OK, (I used a subselect instead of the previous grouping approach) my
>solution uncovered an embarrassing issue with my database design that I
>would be grateful for some tips on. Here is my working solution (untouched!
>:p ) with question to follow:
>
>SELECT /* Kill it! DISTINCT */
> BI.BILLINGITEMID,
> P.CHARTID,
> P.LASTNAME || ', ' || P.FIRSTNAME || ' ' || P.MIDDLEINITIAL
>PATIENTDISPLAYNAME,
> BI.BILLINGITEMDATE,
> BI.DIAGNOSISCODE,
> BI.PROCEDURECODE,
> BI.BILLINGID,
> BI.AMOUNT * BI.UNITS EXTENDED,
> COALESCE((SELECT SUM(AMOUNT) FROM PAYMENT WHERE BILLINGITEMID =
>BI.BILLINGITEMID AND PAYMENTDATE <= :BILLINGITEMDATE),0.00) SUMOFPAYMENTS,
> (BI.AMOUNT * BI.UNITS) - COALESCE((SELECT SUM(AMOUNT) FROM PAYMENT WHERE
>BILLINGITEMID = BI.BILLINGITEMID AND PAYMENTDATE <= :BILLINGITEMDATE),0.00)
>BALANCE
>FROM
> BILLINGITEM BI
> INNER JOIN BILLING B ON (BI.BILLINGID = B.BILLINGID)
> INNER JOIN PATIENT P ON (B.PATIENTID = P.PATIENTID)
>WHERE
> BI.BILLINGITEMDATE <= :BILLINGITEMDATE AND
> (
> (BI.AMOUNT * BI.UNITS) - COALESCE((SELECT SUM(AMOUNT) FROM PAYMENT WHERE
>BILLINGITEMID = BI.BILLINGITEMID AND PAYMENTDATE <= :BILLINGITEMDATE),0.00)
> > 0.001 OR
> (BI.AMOUNT * BI.UNITS) - COALESCE((SELECT SUM(AMOUNT) FROM PAYMENT WHERE
>BILLINGITEMID = BI.BILLINGITEMID AND PAYMENTDATE <= :BILLINGITEMDATE),0.00)
>< -0.001
> )
>ORDER BY
> P.LASTNAME, P.FIRSTNAME, P.MIDDLEINITIAL, BI.BILLINGITEMDATE,
>BI.BILLINGITEMID
>
>You'll notice some weirdness in my where clause with "> 0.001" OR "< -0.001"
>... Where really what I want to say is "<> 0"

It's not just weirdness. Since you really have no way to know the
extent of the float imprecision of your stored data, trying to find
"intended zeroes" within that range isn't necessarily going to
consider all of the records that you want to target. For example, if
the float value resolves zero to 0.000046 it will be excluded.

If you are stuck with these (highly inadvisable) floats then you'll
need to work out some system that will treat near-zero and
dead-zero. Using CAST (aFloatValue as Integer) or
CAST(aDoublePrecisionValue as BigInt) will achieve that for search
arguments on the bare stored values. You will bump into more complex
problems with expressions that involve multiplication or division and
need to watch the "timing" of your casts to make sure you don't lose
precision too early in a calculation.

>Some time ago I was required to convert my application from a Paradox
>database to client/server LITERALLY overnight. In my haste I think I made a
>big mistake with my datatypes. I used a FLOAT for my currency, so there are
>numbers after the hundreds place that is causing me to have to do the above
>to get my results.

Yep. It's a Pandora's box.

>An additional problem I just noticed is that out of nearly $100,000
>calculated my final result is off by some change, maybe 60 or 70 cents.

That's usual when you choose float as your data type for currency (or
any other kind of numeric that has to be counted). Similarly if you
choose DOUBLE PRECISION, although the error range will be less
extreme at the low end.

>So my question is - what is the correct datatype for currency?

The general rule is to use fixed types (NUMERIC, DECIMAL) for
currency and other things you count, and double precison for things
you measure. However, 18 is the largest precision you can handle
with fixed types. It's more than ample for most currencies but the
Japanese Yen will beat it real fast (and other currencies where the
base currency unit is very small). In those cases it's necessary to
store currency in DOUBLE PRECISION and keep your wits about you.

Geoff Worboys wrote an interesting white paper on the subject, which
you can find amongst the TechInfo sheets at the IBObjects website
(www.ibobjects.com/TechInfo.html - it is case-sensitive).

>What is the best way to safely convert my tables and data?

There's a loaded question! But it's highly desirable to understand
the issues with floating point types. Rule Number One would
be: don't trust Paradox's Numeric type, because it was, is and ever
shall be misnamed according to SQL standards. It's not an SQL
Numeric. It's a single-precision floating point type with something
like 5-byte precision (AFAIR) - more or less "accurate enough" for US
Currency in a fairly brain-dead desktop application but dangerous if
converted to an SQL single-precision float (7 significant digits in
Firebird, of which approximately the first 5 s.d. are reliable).

>Thanks to all for your time and expertise! :)

Hang in there for Set to come online. He loves SQL puzzles like yours.

./heLen