Subject Re: Calculated field? Moving average, median
Author Svein Erling
--- In firebird-support@yahoogroups.com, "Francisco G" wrote:
>
> Hello, I have a question about computing values in a table sorted on Date/Time field:
>
> Table will have easily more than 10 million rows.
>
> Input data will look like this:
>
> Date/Time Price
> Record 1: Date/Time1 15.01
> Record 2: Date/Time2 15.45
> Record 3: Date/Time3 15.64
> Record 4: Date/Time4 16.04
> Record 5: Date/Time5 15.04
> Record 6: Date/Time6 15.64
> Record 7: Date/Time7 15.23
>
> I need to calculate a new column holding the average of the previous 3 rows prices and another column holding the median (50% percentile) of the previous 3 rows prices:
>
>
>
> Date/Time Price 3-Average 3-Median
> Record 1: Date/Time1 15.01 Null Null
> Record 2: Date/Time2 15.45 Null Null
> Record 3: Date/Time3 15.64 Null Null
> Record 4: Date/Time4 16.04 15.37 15.45
> Record 5: Date/Time5 15.04 15.71 15.64
> Record 6: Date/Time6 15.64 15.24 15.04
> Record 7: Date/Time7 15.23 15.24 15.04
>
> Please, can you give me a hint?
>
> Cheers
>
> Francisco

Hi Francisco!

Computed fields can be useful when they're calculated on the basis of the current record. When having to consider other records, well, at least test whether and how big a probable performance loss is.

A calculated field, hmm, do you mean ordinary fields modified through triggers? That could be doable. Are you only having inserts or also updates (where you modify MyDateTime) and deletes? Should updates and deletes modify other records? I suppose the insert trigger could be something like (written in Notepad, so errors are likely):

SET TERM ^^ ;
CREATE TRIGGER CalculateFields FOR MyTable ACTIVE BEFORE INSERT AS
declare variable t1 DateTime;
begin
t1 = (Select mt1.MyDateTime
FROM MyTable MT1
JOIN MyTable MT2 on MT1.MyDateTime < MT2.MyDateTime
JOIN MyTable MT3 on MT2.MyDateTime < MT3.MyDateTime
LEFT JOIN MyTable MT1N ON MT1N.MyDateTime > MT1.MyDateTime and MT1N < MT2.MyDateTime
LEFT JOIN MyTable MT2N ON MT2N.MyDateTime > MT2.MyDateTime and MT2N < MT3.MyDateTime
LEFT JOIN MyTable MT3N ON MT3N.MyDateTime > MT3.MyDateTime and MT3N < new.MyDateTime
WHERE MT3.MyDateTime < new.MyDateTime
AND MT1N.MyDateTime is null
AND MT2N.MyDateTime is null
AND MT3N.MyDateTime is null);
new.Average3 = (SELECT sum(Price)/3
FROM MyTable
WHERE MyDateTime >= :t1
AND MyDateTime < new.MyDateTime);
new.Median3 = (SELECT M1.Price
FROM MyTable M1
JOIN MyTable M2 ON M1.MyDateTime < M2.MyDateTime
WHERE M1.MyDateTime >= :t1
AND M1.MyDateTime < new.MyDateTime
AND M2.MyDateTime < new.MyDateTime
ORDER BY Price
ROWS 2 TO 2);
end ^^
SET TERM ; ^^

Since I haven't got such a table, I haven't tried to prepare the statement to see whether indexes are used for MyDateTime everywhere (which is a must for your performance).

The update and delete triggers are somewhat more complex, since delete might have to modify up to three records and update up to six. I won't attempt to figure out how to write them today.

If you've already got a table with lots of records, I suppose the initial value could be set like this:

EXECUTE BLOCK as
declare RecordNo integer;
declare MyDateTime DateTime;
declare Price decimal(9,2);
declare Average3 decimal(9,2);
declare Median3 decimal(9,2);
declare i integer;
declare ic integer;
declare i1 decimal(9,2);
declare i2 decimal(9,2);
declare i3 decimal(9,2);
begin
i = 0;
i1 = null;
i2 = null;
i3 = null;
Median3 = null;
for select RecordNo, Price from MyTable order by MyDateTime into :RecordNo, :Price do
begin
ic = I-(I/3)*3+1;
Average3 = (i1+i2+i3)/3;
If (i3 is not null) then
Begin
Median3 = i3;
If (i1 between min(i2, i3) and max(i2, i3)) then Median3 = i1;
If (i2 between min(i1, i3) and max(i1, i3)) then Median3 = i2;
End
update MyTable set Median3 = :Median3, Average3 = :Average3
where RecordNo = :RecordNo;
if (ic = 1) then i1 = Price;
if (ic = 2) then i2 = Price;
if (ic = 3) then i3 = Price;
i = i + 1;
end
end
end

HTH,
Set