Subject | Re: [firebird-support] performance question |
---|---|
Author | unordained |
Post date | 2012-04-17T22:42:15Z |
---------- Original Message -----------
From: "Olaf Kluge" <olaf.kluge@...>
If the layout of the string was changed after you had already received data, it
wouldn't make sense to change the mask globally. You'd either need to timestamp
your format overlays, or store the mask ID with each historical record, or ...
something. If the problem is that the client is wishy-washy, fine -- have a
stored procedure do the splitting of the string into the table upon insert, and
put all your configurable magic there.
I'd go with the first option, just because it can accommodate changes over time
as your input format changes, but the underlying idea of the data doesn't (or
varies only a little). If some day you start receiving an extra field in the
string, great -- you can add a column that will be NULL for historical data,
and adjust the input stored procedure or program to match. Dropped from input
data? NULL for new records, keep old data intact. Multiple formats, from
different sources? Multiple input procedures, same underlying data-structure
for storage.
Don't worry about row sizes -- I don't think there'll be much cost from having
the fields split up vs. not. Not enough to care about, anyway. Relational
database file formats have a good deal of "wiggle room" anyway, they're not
packed 100% tight, so slight variations in field-terminators, etc. aren't the
end of the world.
But individual fields (a traditional relational database) do give you the
ability to index certain columns efficiently (technically, you could do so even
with your other solution, but please don't, it's unnecessarily complex), easily
perform computations on them, or create convenient views that give certain
users access only to parts of the underlying data. You can also use your
resulting database with off-the-shelf reporting and graphing tools, which won't
necessarily have any idea how to use your custom SP approach to gathering data.
That compatibility (through standardization) is worth something!
-Philip
From: "Olaf Kluge" <olaf.kluge@...>
> Now I have two ideas. I create a table with all fields (separate for each------- End of Original Message -------
> information) and an import stored procedure fill this table with information
> from the long string. (line_number = substring(:str_in from 1 for 4)) etc.
>
> String: 1234A1E2WEWE3432334
>
> Into Field 1 = 1234, Field 2 = A1, Field3..
>
> Or
>
> I made a table with one field and save only the complete string in it.
> In a second table I store the information about the field-names and
> the start-byte. If the client-visualisation gives me the start-byte, I
> can get this information from the one table and disassemble the string
> from the other while executing. This method had the advantage, that I
> can simply and short change the configuration if the layout of the
> string was changed.
>
> String: 1234A1E2WEWE3432334 into one field and get the information by
> executing the stored procedure.
If the layout of the string was changed after you had already received data, it
wouldn't make sense to change the mask globally. You'd either need to timestamp
your format overlays, or store the mask ID with each historical record, or ...
something. If the problem is that the client is wishy-washy, fine -- have a
stored procedure do the splitting of the string into the table upon insert, and
put all your configurable magic there.
I'd go with the first option, just because it can accommodate changes over time
as your input format changes, but the underlying idea of the data doesn't (or
varies only a little). If some day you start receiving an extra field in the
string, great -- you can add a column that will be NULL for historical data,
and adjust the input stored procedure or program to match. Dropped from input
data? NULL for new records, keep old data intact. Multiple formats, from
different sources? Multiple input procedures, same underlying data-structure
for storage.
Don't worry about row sizes -- I don't think there'll be much cost from having
the fields split up vs. not. Not enough to care about, anyway. Relational
database file formats have a good deal of "wiggle room" anyway, they're not
packed 100% tight, so slight variations in field-terminators, etc. aren't the
end of the world.
But individual fields (a traditional relational database) do give you the
ability to index certain columns efficiently (technically, you could do so even
with your other solution, but please don't, it's unnecessarily complex), easily
perform computations on them, or create convenient views that give certain
users access only to parts of the underlying data. You can also use your
resulting database with off-the-shelf reporting and graphing tools, which won't
necessarily have any idea how to use your custom SP approach to gathering data.
That compatibility (through standardization) is worth something!
-Philip