Subject Re: [IBO] How to optimize usage of TIB_DataPump?
Author Thomas Steinmaurer
Hi Jason,

> Comments inserted below.
>
>> one correction inline ...
>>
>>> I'm trying to get most out of the TIB_DataPump, in a scenario where I'm
>>> running in a 2PC transaction, purging data from a production into a log
>>> database. The data is a master (table: IBLM$OPERATIONLOG) / detail
>>> (table: IBLM$COLUMNLOG) relationship. What I'm currently using is (all
>>> is instantiated at run-time, as I'm in a command-line tool here):
>>>
>>> 1) Single, read-write distributed transaction in tiConcurrency mode
>>> across the two databases. All components etc. are attached to this
>>> transaction.
>>> 2) Source/Destination statement of TIB_DataPump are TIB_Cursor.
>>> 3) Transfer of the master records based on the following query:
>>>
>>> select * from iblm$operationlog order by id asc
>>>
>>> In context of this operation, I'm using TIB_DataPump events to save the
>>> start and end id of processed operation log records in variables for
>>> usage in 4). This way, and in combination with the tiConcurrency
>>> transaction, I can avoid querying detail records per processed master
>>> record, which results in a single detail query, instead of a bunch
>>> detail queries per master record.
>>>
>>> 4) Transfer the detail records based on the following query:
>>>
>>> select * from iblm$columnlog where operationlog_id between :start_id and
>>> :end_id order by id asc
>>>
>>> 5) Remove source database records via:
>>>
>>> delete from iblm$operationlog where id between :start_id and :end_id
>>>
>>> An ON DELETE cascade FK constraint takes care on removing detail records
>>> as well.
>>>
>>> 6) Do a COMMIT of the 2PC transaction
>>>
>>>
>>> I do see the following from the Trace API output:
>>>
>>> a)
>>>
>>> INSERT INTO IBLM$COLUMNLOG
>>> ( ID
>>> , OPERATIONLOG_ID
>>> , COLUMN_NAME
>>> , OLD_VALUE
>>> , NEW_VALUE
>>> , OLD_VALUE_BLOB
>>> , NEW_VALUE_BLOB )
>>> VALUES
>>> ( ? /* ID */
>>> , ? /* OPERATIONLOG_ID */
>>> , ? /* COLUMN_NAME */
>>> , ? /* OLD_VALUE */
>>> , ? /* NEW_VALUE */
>>> , ? /* OLD_VALUE_BLOB */
>>> , ? /* NEW_VALUE_BLOB */ )
>>> RETURNING ID, OPERATIONLOG_ID, COLUMN_NAME, OLD_VALUE, NEW_VALUE,
>>> OLD_VALUE_BLOB, NEW_VALUE_BLOB
>>>
>>>
>>> Is the RETURNING part something TIB_DataPump really needs? I basically
>>> don't need that I guess, especially as there can be BLOB columns
>>> involved as well. So, I need that as light-weight as possible. ;-)
>
> No, it doesn't need this and it should be suppressed to improve performance.
>
>
>>> b)
>>>
>>> I also see re-occuring execution of the detail records SELECT statement
>>> with the same statement handle and transaction id:
>>>
>>> select * from IBLM$COLUMNLOG
>>> where operationlog_id between ? /* olid_start */ and ? /* olid_end */
>>> ORDER BY id ASC
>>>
>>>
>>> Not sure, if I'm doing something wrong here, or if this is a sign that
>>> IBO is fetching additional records from the TIB_Cursor result set?
>
> Perhaps you can get a callstack on this in order to see what it is up to?
>
>
>> Re-executing this statement comes straight from my logic and makes
>> perfect sense. Sorry.
>>
>>
>>> Perhaps a FetchAll before executing the pump makes sense?
>
> No, not when you are using TIB_Cursor as the source.
>
>
>>> * Do you see any improvements in the usage of TIB_DataPump here?
>>> * How would you implement an incremental commit thingy? As everything is
>>> bound to the single transaction, even the SELECT statements, I guess I
>>> have to call CommitRetaining, because a hard COMMIT would close my
>>> SELECT cursors as well?
>
> Yes, a hard commit would close your cursors.
>
> This might be the best way to take things in bite-sized chunks anyway.
>
> You do all the records you want for the batch and then commit hard and then
> reopen the cursors and start from where you last left off and so on.
>
> Perhaps we can do a chat session and I can take a closer look at things.

Beside exception handling in TIB_DataPump (see my other post), I'm also
stuck in the things discussed above. Let me know when we can have a
closer look on that (via Skype).

Thanks,
Thomas