Subject Re: [firebird-support] Blob Parsing Stored Procedure
Author Helen Borrie
At 11:13 PM 3/07/2007, you wrote:
>Hello all,
>I've got a lot of small inserts to do and it's just taking much longer
>than I'm willing to wait for (even with an embedded DB). Basically
>these inserts are small records (roughly 4 BIGINTs) that have a FK to
>another larger table.
>
>If I'm reading everything correctly, Firebird does NOT support bulk
>inserts and anything close to it is kind of a hack.

No, that is quite untrue. However, it is not always advisable to
push huge numbers of inserts through in a single transaction if
you're fairly limited on resources. It can be a good practice to
batch your inserts at around 10,000 per transaction.

You don't seem to have provided any comprehensible information so
far, such as where the data for the inserts is coming from or how
many records you are processing.

>I was able to
>create a stored procedure that took 10x the parameters and did 10
>inserts at once. I saw a decent performance gain from this.

Sorry, I don't understand the statement above.


>What I would really like to do is to send a single BLOB to a stored
>procedure. This blob would be a Binary data structure that would
>contain N 32 byte blocks for this example. The first 32 byte block
>would be the first 4 BIGINT's to insert into the table, the second
>would be the next 4 BIGINT's, etc.
>
>Can a stored procedure be written to parse the blob in this manner?

Yes, it could be; but I would have the strongest doubts that it
could be faster than direct DML from the client using a prepared
parameterised statement and applying values to your parameters in a
read-from-source/write-to-target loop. To be readable by the engine,
your blob would have to be a text blob, so that it could be converted
to a string...then you would have to apply some parsing routine to
the string and you would *still* need to use a parameterised insert
statement (or an equivalent cursor operation) for each block of four
values you extracted.
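
Just to show what that would involve, here is a rough sketch only -- the
table, its columns and the record layout are all assumptions of mine; it
supposes you sent the data as text, each value left-padded to 20 characters
so that one record occupies 80 characters, with the record count passed in
separately, and that your server's SUBSTRING() can read the text blob:

set term ^ ;

/* sketch only: atable, its columns and the 20-character layout are assumed */
create procedure insert_from_blob (
  input_data blob sub_type text,
  rec_count  integer)
as
  declare variable i   integer;
  declare variable pos integer;
  declare variable rec varchar(80);
begin
  i = 0;
  while (i < rec_count) do
  begin
    pos = i * 80 + 1;
    /* pull one 80-character record out of the blob */
    rec = substring(input_data from pos for 80);
    /* each value occupies a fixed 20-character slice of the record */
    insert into atable (col1, col2, col3, col4)
    values (cast(substring(:rec from  1 for 20) as bigint),
            cast(substring(:rec from 21 for 20) as bigint),
            cast(substring(:rec from 41 for 20) as bigint),
            cast(substring(:rec from 61 for 20) as bigint));
    i = i + 1;
  end
end ^

set term ; ^

As you can see, every record still goes through its own insert inside the
loop -- the same work the prepared statement on the client would do, with
the parsing added on top of it.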

On the other hand, if the source of the data were a text file that
you had formatted as fixed-length records, there could be a major
performance advantage to making this file an external table and
processing the entire thing in a stored procedure. We just don't
have any information from you about where your data is coming from.
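
By way of illustration only (the file name, table and column names are all
invented, and the ExternalFileAccess setting in firebird.conf has to permit
the path), the external table route looks roughly like this:

create table ext_import external file 'C:\data\import.txt' (
  col1 char(20),
  col2 char(20),
  col3 char(20),
  col4 char(20),
  eol  char(2)   /* CR/LF record terminator on Windows; char(1) on Linux */
);

insert into atable (col1, col2, col3, col4)
  select cast(col1 as bigint), cast(col2 as bigint),
         cast(col3 as bigint), cast(col4 as bigint)
  from ext_import;

If each row needed more work than a straight conversion, the select could be
wrapped in a for select ... do loop inside a stored procedure instead.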

>Basically I want to pass in an arbitrarily sized blob (I can pass in
>size as a parameter also if necessary...) and break it up into the 8
>byte blocks (32 bytes per row = 8 bytes * 4 values) that actually
>represent the data to be stored in the database.

Expensive, compared to a parameterised DSQL statement.

>OR - Can I pass in the values as text and parse it in the same manner.
> I could convert the 64bit numbers to 20 bytes of text and grab the
>numbers off in 20 byte chunks if necessary.

Of course you could do something like that if it were needed. But to
me it looks like a whole heap of unnecessary processing and
conversion happening on both sides of the interface for no particular purpose.

Putting aside the obvious (processing a file), let's suppose that it
is really true that you have no option but to pre-process the data in
your client application. Let's assume you have some loop on the
client where each iteration generates the set of values for the
record you want to insert.

Outside the loop, you begin with the parameterised insert statement, e.g.,

insert into atable (col1, col2, col3, col4)
values (?, ?, ?, ?)

First, you start a transaction (which might be automatic with ODBC).

Next, you prepare this statement (the ODBC driver will have a method for
this, one that defines the parameter set and sets up the structures its
Prepare call gets back from the server).

Now, once the statement is prepared, you begin your loop:

for each (input instance) ....
begin
  munge the input into variables of the appropriate data type
  apply each variable to the appropriate parameter (the order is crucial
    if your driver doesn't create its own parameter variables; the driver
    probably has a method that can do the above in one step)
  execute the statement
end

Finally, when all of the input has been processed, you commit the transaction.

Your workflow is not apparent from any of your emails, but get past
the belief that you have to invent some kind of hack to perform bulk
inserts. It is conventional SQL you are looking at here.

From your posting to the ODBC list, I see that you are getting
memory exceptions from your attempts so far to do [[ whatever you are
doing ]]. You do need to persevere with this...but help yourself AND
the ODBC gurus by providing enough information to understand the
problem. The doctor can't help if the only information the patient
is willing to impart is "I don't feel well today."

I suspect that you are getting AVs from referring to unassigned
objects. Likely sources of these problems would be unreferenced
parameters (if you are using parameters) or an invalidated statement
(which happens if you perform a total replacement of the statement
inside the loop without re-preparing -- a highly undesirable approach
even if you were properly taking care of the state!!).

So - provide more information.

./heLen