Subject:   Re: [Firebird-Architect] Bulk loader.
Author:    Vlad Khorsun
Post date: 2007-11-04T20:31:22Z
>>> Now another question: where do we put the files to load? On the file
>>> system accessible from the server? Should we support remote bulk load?
>>
>> I don't like artificial limitations. Why should we limit users by offering bulk load
>> from files only? I prefer to implement a generic bulk load API: say, prepare a statement
>> for a bulk operation, pass into it a large (or even huge ;) set of parameters, possibly
>> several times, plus a routine to finish the load process and check the results. Later we
>> may implement, at a higher level, bulk loading from a file, a stream, etc.
>
> Here I'm not convinced.
>
> At the moment we have a limitation of 64K on the "message" size that can
> be sent to the server via the API. So, if we pack all the needed data into
> one message, we will be able to insert something between 100 and 200
> records at a time, and check the constraints after inserting them. With
> this approach I'm afraid we will implement batch updates with an average
> batch size of 200, and not the bulk inserts.
>
> So, I would say we should provide a way to load a local file, similar to
> what we're doing right now with external tables, possibly
> supporting the CSV format (any other? XML?), and add deferred
> constraint evaluation and the bypassing of triggers. This is what I would
> call bulk inserts.
>
> We can add batching capabilities to the wire protocol and, if
> necessary, a new API call to pass a bunch of parameters in one go (say,
> for 200 records). We can do deferred constraint evaluation, and we can
> think about bypassing the triggers. And we call it batch updates.
>
> Do I miss anything?
I mean the schema below:

a) fb_dsql_prepare_batch_statement
b) fb_dsql_batch_bind
c) fb_dsql_batch_done
d) fb_dsql_fetch_batch_results (not sure it's needed; must think more on it)
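
Purely to make the shape of this proposal concrete, here is one way the four
calls might be declared. The schema above only names the functions, so every
signature, type and flag below is an illustrative guess modeled on the existing
isc_dsql_* API style, not a committed interface:

#include <ibase.h>  /* ISC_STATUS, isc_db_handle, isc_tr_handle,
                       isc_stmt_handle, XSQLDA */

/* Hypothetical option flags for fb_dsql_prepare_batch_statement. */
#define FB_BATCH_SKIP_TRIGGERS      0x01  /* do not fire user triggers      */
#define FB_BATCH_DEFER_CONSTRAINTS  0x02  /* check constraints once at done */

/* a) Prepare a statement for batch execution, with load-time options. */
ISC_STATUS fb_dsql_prepare_batch_statement(
    ISC_STATUS*      status,       /* status vector, as in isc_dsql_prepare */
    isc_tr_handle*   tra,
    isc_stmt_handle* stmt,
    unsigned short   sql_length,   /* 0 for null-terminated SQL */
    const char*      sql,
    unsigned long    options);     /* FB_BATCH_* flags */

/* b) Feed one block of packed parameter rows; may be called any number of
 *    times, and need not be bound by the existing 64K message limit. */
ISC_STATUS fb_dsql_batch_bind(
    ISC_STATUS*      status,
    isc_stmt_handle* stmt,
    unsigned long    row_count,    /* rows in this block */
    const void*      rows,         /* rows packed per the statement's XSQLDA */
    unsigned long    rows_length); /* byte length; may far exceed 64K */

/* c) Finish the load; deferred constraints are evaluated once here. */
ISC_STATUS fb_dsql_batch_done(
    ISC_STATUS*      status,
    isc_stmt_handle* stmt);

/* d) Optionally retrieve per-row results or diagnostics. */
ISC_STATUS fb_dsql_fetch_batch_results(
    ISC_STATUS*      status,
    isc_stmt_handle* stmt,
    XSQLDA*          results);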
fb_dsql_prepare_batch_statement may specify execution options, such as
whether to execute user triggers, whether to check or ignore constraints, etc.
Or these options may be encoded at the SQL syntax level, as in MSSQL's
BULK INSERT statement.

fb_dsql_batch_bind may be called as many times as needed. And it is not
necessary that the 64K limitation apply to this new call.

Constraint evaluation will be performed only once, inside fb_dsql_batch_done.
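
As a rough sketch of how a client might drive this flow (prepare -> bind*N ->
done), assuming the hypothetical declarations above: next_chunk(), the flag
names and the table are invented for illustration, and error handling is
reduced to a single check per call; only isc_dsql_allocate_statement is an
existing API function.

/* Hypothetical data source: yields packed rows until exhausted. */
extern int next_chunk(const void**  rows,
                      unsigned long* row_count,
                      unsigned long* rows_length);

static int bulk_load(isc_db_handle db, isc_tr_handle tra)
{
    ISC_STATUS      status[20];
    isc_stmt_handle stmt = 0;
    const char*     sql  = "INSERT INTO T_TARGET (ID, NAME) VALUES (?, ?)";

    /* Allocate the statement handle via the existing, real API call. */
    if (isc_dsql_allocate_statement(status, &db, &stmt))
        return -1;

    /* Options could be passed here, or encoded in the SQL text itself. */
    if (fb_dsql_prepare_batch_statement(status, &tra, &stmt, 0, sql,
            FB_BATCH_SKIP_TRIGGERS | FB_BATCH_DEFER_CONSTRAINTS))
        return -1;

    const void*   rows;
    unsigned long row_count, rows_length;
    while (next_chunk(&rows, &row_count, &rows_length))
    {
        /* Each bind may carry far more than 64K of parameter data. */
        if (fb_dsql_batch_bind(status, &stmt, row_count, rows, rows_length))
            return -1;
    }

    /* Constraints for the whole batch are evaluated once, here. */
    return fb_dsql_batch_done(status, &stmt) ? -1 : 0;
}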
Regards,
Vlad