Subject Re: [firebird-support] Re: casting to decimal(9,2) gives different answers if used with a group by clause
Author Kjell Rilbe
wibbleian2 wrote:
> > The dogdy thing about it, is that you seem to treat the symptoms, not
> the cause.
> >
> The dodgy thing about it is that I get 2 different answers depending on
> the accident of the sql statement.

But you still don't seem to get it: That is not dodgy! It's a *very*
natural effect of doing math with an *approximate* type such as double,
as any floating point number.

That's what you don't seem to understand. Maybe you *do* understand, but
in that case hopefully someone else will find this useful.

So, let me try to explain how it works.

Floating point numbers are constructed to be able to store very small
numbers as well as very large numbers. To achieve this, they are stored
in three parts:

- Sign
- Exponent
- Mantissa

Where the number stored is calculated from the parts like this:

Number = Sign * Mantissa * BASE ^ Exponent

BASE is of course usually 2 for computer datatypes but it *could* be any
number. In sciense you often write numbers in decimal form (base 10)
like this:

1.23 * 10 ^ 6

This would be 1.23 million. This is exactly how floating point numbers
store their values.

Now, no matter how large the number is (in other words, what the
exponent is), the mantissa (in other words: the significant digits)
always contains the same number of digits, i.e. the *precision* (number
of significant/stored digits) is always the same.

For double the mantissa is 52 binary digits. See here:
http://en.wikipedia.org/wiki/Double_precision

This provides for rather good precision, but you still can't store all
numbers *exactly*. You often get rounding errors. Why?

Let's look at a simple example. Assume you had a special "decimal"
floating point datatype, which stores 1 (one) decimal digit of exponent,
a sign bit, and 4 decimal digits of mantissa.

The smallest number that this datatype can store would be 0.001 * 10 ^
-9, where Mantissa = 0.001 and Exponent = -9, or 0.000000000001.

The largest number is 9.999 * 10 ^ 9, or 9999000000.

Now, let's say you want to store 1/7. Well within the described range
that our type can handle, right? So, let's see what happens.

1/7 = 0,142857142857... and then repeat 142857 *forever*.

Our type would store this as:

1.429 * 10 ^ -1

So, it will actually produce a rounding error of 0,000042857142857...
repeat 142857 as before.

Now, what happens if we try to add a few numbers?

Lets say we want to add these three numbers:

a = 7.004
b = 7.005
c = 7.006

This can be done in three different orders, taking into consideration
that addition is commutative (a+b = b+a). Let's examine two of them:

1: (a + b) + c
2: a + (b + c)

For order 1, we get this:
a + b = 14.009, but will be rounded to 14.01.
14.01 + c = 21.016, but will be rounded to 21.02.

For order 2, we get this:
b + c = 14.011, but will be rounded to 14.01.
a + 14.01 = 21.014, but will be rounded to 21.01.

Oops! Different results! This is *exactly* the effect you are observing
in your SQL!

The only difference is that you *think* you are adding exact numbers,
because they look like this, for example: 130.27500000000. But double is
not a decimal datatype. It's binary. It's value is stored with binary
digits, with base 2. In base 2/binary, this number looks like this:

10000010.010001100110011001100110... where 0110 repeats forever, just
like 142857 repeats forever for 1/7 in base 10/decimal. So, double can't
actually store 130.275 exactly, even though it looks like it when you
display it with ten-or-so decimal digits.

The same probably applies to the terms you're adding in your select:
even though you *think* they're exact and nice, like 23.002000000000,
they probably suffer from the same infinite "tail" in base 2/binary as
130.275 does, and will suffer from rounding errors when squeezed into
double. The effect of these intermediate rounding errors will result in
different results depending on the order the terms are added, and it's
not strange (although not obvious either) that the order is different
between the two selects you showed us.

So, although the result of both selects looks like 130.2750000000, one
is probably actually something like 130.275000000000000000114987, and
the other is probably actually like 130.274999999999999999847172.
Obviously, these two will round differently, even with banker's rounding.

To avoid this, financial calculations are usually done with other
datatypes than floating point types like double. You usually use a
*fixed* point type with 2-4 decimals. These types are *exact*. If it
looks like you have 130.275, then that's exactly what you have. In SQL,
this applies to e.g. decimal and numeric.

So, this SQL works:

select cast(sum(cast(qty as decimal(9,4))) as decimal(9,2)) from tmp;
select parent_id, sum(cast(qty as decimal(9,4))) as decimal(9,2)) from
tmp group by parent_id;

because you first convert all terms to an exact, fixed point, datatype,
then you add them *without* intermediate rounding errors, then you
further round the sum to the desired number of decimals.

Nothing dodgy about it at all!!!!!!!

Of course, if the terms you have actually contain 10-15 *significant*
decimals, then you're in trouble, because you probably have problems
finding a decimal type that can both handle enough decimals *and* large
enough totals.

> I agree that this is not treating the cause, but I can't figure out an
> alternative that really fits my needs. Especially not one that is just
> based on dabbling with data types.

It *is* treating the cause, except the fact that the terms are actually
stored as double, while they should probably be stored as decimal(18,4)
or something.

> I have a better fix for this specific "symptom" but I'm not convinced
> I'm being unreasonable expecting the same answer from the example
> queries given the sample data.

You're being unreasonable if you expect floating point arithmetic to
produce exact results. They don't, never have done, and never will do.
That's not what floating point numbers are for. They are *by definition*
approximate types!

As an interesting side note, 1/7 can actually be expressed *exactly*
with one single "decimal" in base 7. It looks like *taraa* 0.1.
3/7 looks like 0.3. 3/49 looks like 0.03. Get the general idea?

I hope this sheds some light on this "magic" for you or anyone else.
Kjell
--
--------------------------------------
Kjell Rilbe
DataDIA AB
E-post: kjell@...
Telefon: 08-761 06 55
Mobil: 0733-44 24 64