Subject RE: [firebird-support] performance question
Author Svein Erling Tysvær
>from the plc I get every cycle of production a record with some information.
>Each record (a car) includes 300 Bytes with alphanumeric characters. For example char1 to char4 is the line number, char 5 to
>char x the model etc. I store the information in a firebird 2.5 cs db.
>The client (visualization) gives me the information, what fields he want to see. Either byte 1 to x or fieldname xx or field
>number x etc. depending on the realization
>Now I have two ideas. I create a table with all fields (separate for each
>information) and an import stored procedure fill this table with information from the long string. (line_number =
>substring(:str_in from 1 for 4)) etc.
>String: 1234A1E2WEWE3432334
>Into Field 1 = 1234, Field 2 = A1, Field3..
>I made a table with one field and save only the complete string in it. In a second table I store the information about the >field-names and the start-byte. If the client-visualisation gives me the start-byte, I can get this information from the one >table and disassemble the string from the other while executing. This method had the advantage, that I can simply and short >change the configuration if the layout of the string was changed.
>String: 1234A1E2WEWE3432334 into one field and get the information by executing the stored procedure.
>What can be better (performance, etc.)

First, one thing you probably want even though you don't mention it, is to add an integer field to your table, use that field as your primary key and use a trigger to assign a value to that field.

Your second option has several drawbacks. It will normally be slower due to it being difficult to index individual parts of the string, so your queries will typically have to scan the whole table. Another thing that either may not apply or that you may have forgotten to think about, is historical data. Lets say you have already imported the string you mentioned above:

String: 1234A1E2WEWE3432334

A while later you import your line number 10000, forcing you to change the record structure:


Now, you have gotten yourself two problems:

First, for lines 0001-9999, you might have to substring from position 7, whereas for lines >=10000, that would be position 8.

Secondly, if you do
WHERE SUBSTRING(EntireString from 1 for 4) BETWEEN '1000' AND '1099'
that would include line 10000.

Both of these are possible to overcome, the latter a bit simpler than the first, but they add complexity for no gain and as such it would be better to avoid them in the first place. The simplest way to do this, is to primarily go for your first suggestion where you both split them as you indicate, but also change their type into integers, dates etc. where appropriate. Those fields that are likely to be searched for (and reasonably selective), should be indexed.

Note that there's nothing wrong with a hybrid solution, where you first import into one table like your second solution, and then transfer from this table to another table using your first solution (either through a trigger or through a separate program).

If you receive big files, it could also be a possibility to import using an external table, though your description leads me to think that you receive one or a few records at the time and not big files more infrequently.