Subject Re: [firebird-support] Computed Columns
Author Helen Borrie
At 09:47 AM 26/04/2007, you wrote:
>I've been advised by several people that database columns defined via the
>"computed" options should be avoided. I've been told that they extremely
>inefficient. I was advised that instead of using the computed option for a
>column, I should store the result of the calculation in its own column.

"I've been advised that you have stopped beating your wife. Is it true?"

>This seems incredible to me. I would think that adding redundant data to
>the database would reduce, not improve performance.

It depends...people do sometimes store redundant data to avoid having
to use joins on quick lookups...

I think you and your advisors are confused about what you mean by
'"computed" options.

1. You can define a field in a table as COMPUTED BY <an
expression>. This doesn't store the data at all, only the
definition. It is an output-only value that will be computed each
time it is specified in a SELECT query (or a sub-select
expression). It can be cheap (e.g., a simple concatenation or
calculation based on values in the record itself) or expensive (some
complex correlated subquery on other tables or a stored procedure
call, for example). But the calculation for a COMPUTED BY column
won't occur if that column is not specified in the select; and it
will be performed only for the rows that are involved in the query.
Since they are calculated only for output, COMPUTED BY fields based
on the record in hand don't impact data-changing operations at
all; but, if they read other tables then a dependency exists which
is capable of blocking operations on the other tables under some conditions.

a) You can write Before Insert and Before Update triggers on the
table to calculate and populate columns in your table
b) you can write After Insert and After Update triggers on a
dependent table to populate your table.

People sometimes do such things to maintain redundant totals on the
header records of invoice and order structures, for example. This
*can* impact performance noticeably if they involve other tables (or
{{{{shudder}}}} other records in the same table).

>As an example, the application I'm working on has an account number that
>consists of an office number and an "account number within office". Instead
>of defining a column for account number and computed columns for office
>number and "account number within office", I've been advised to create three
>columns, each containing the corresponding data.

Unless I've misunderstood what you're describing here, you would want
only the office number and the within-office number. There would be
no need for either a computed by or a redundant column to store the
combined number. For searching, you would use both; for outputs,
you would just form the number using an expression in your output

>The database will primarily be used to generate summary reports that for the
>most part involve all the records in a table.

Sure; but you don't design databases to fit reports. The SELECT
statement is for designing output. Performance of systems depends on
clean database design and good (read "appropriate") indexing.

>Please let me know if this makes sense? Is it better to store redundant
>data than to use the "computed" option?

The answer to this question is very obvious: Yes, No and Maybe.

In other words, it's impossible to state hard and fast rules one way
or the other. Use your knowledge of your data and the requirements
for its use and always test everything with realistic test
data. Bear in mind things like dependency and the overhead of
writing redundant data in the context of your actual usage and do
what works best *for each case*.