Subject Re: [firebird-support] Weighted Averages
Author Helen Borrie
At 05:13 PM 1/11/2004 +0200, you wrote:


>I need to calculate the weighted average for two fields (val1 and val2)
>using weight1. Zero values should be ignored.
>
>val1 val2 weight1
>---- ---- --------
>5 8 10
>4 2 20
>0 2 10
>9 0 20
>
>should give: 6.2 for val1 and 3.5 for val2.
>
>The following statements will not give the correct answers
>
>SELECT SUM(val1*weight1)/SUM(weight1) avg1, SUM(val2*weight2)/SUM(weight2)
>avg2 FROM tbl.
>SELECT SUM(val1*weight)/SUM(weight1) avg1, SUM(val2*weight2)/SUM(weight2)
>avg2 FROM tbl WHERE val1<>0 and val2<>0.
>
>Val1 might be available, with val2 set to 0 or visa versa. I can not think
>of any way to return both avg1 and avg2 with 1 query.

It will depend on the data. SUM() conforms to the SQL standard and ignores
nulls. It won't ignore zero unless you give it a suitable WHERE
clause. Your examples do not give any indication of what you want to
ignore. I suspect that there are conditions here that are not represented,
e.g. if Val1 or Val2 is null or zero, do you want the corresponding
weighting ignored too?


>Is it possible to add UDF's for aggregate functions (maybe add a SUMIF)?

You could write a UDF. Perhaps someone already has done so. There are
links to some public UDF libraries here:
http://www.ibphoenix.com/main.nfs?a=ibphoenix&page=ibp_udf_libs and I'd
also recommend googling on UDF + "statistical" + interbase.


>Am I missing the obvious?

At the beginning of your message you said your queries gave wrong
results. If you want help from the list with algorithms, it's a waste of
time posting a message with vague statements. At a minimum, you need to
describe in what way the results were incorrect.

If it were my task, I'd want to sit down with a cool head and analyse the
algorithm properly. If you need to take account of nulls and their
behaviour, you must plan for that. If zero in one value affects whether or
not you calculate the other, or how you calculate it, then you must plan
for that.

It's highly likely you can solve this very neatly with a stored procedure,
since you can loop through the candidate rows and handle row-by-row
conditions as they come, and accrue the values for the final operation
exactly as you want them to be. Presumably you'll need some row-limiting
search conditions, too, which you can define as input arguments.

./heLen