Subject: How to optimize usage of TIB_DataPump?
Author: Thomas Steinmaurer
Post date: 2012-04-02T19:52:50Z
Hello Jason,
I'm trying to get the most out of TIB_DataPump, in a scenario where
I'm running a 2PC transaction, purging data from a production database
into a log database. The data is a master (table: IBLM$OPERATIONLOG) /
detail (table: IBLM$COLUMNLOG) relationship. What I'm currently doing
is the following (everything is instantiated at run-time, as this is a
command-line tool; a condensed sketch follows the list):
1) Single, read-write distributed transaction in tiConcurrency mode
across the two databases. All components etc. are attached to this
transaction.
2) Source and destination statements of the TIB_DataPump are
TIB_Cursor instances.
3) Transfer of the master records based on the following query:
select * from iblm$operationlog order by id asc
In the context of this operation, I'm using TIB_DataPump events to
save the start and end id of the processed operation log records in
variables for use in 4). This way, in combination with the
tiConcurrency transaction, I can avoid querying detail records per
processed master record: a single detail query instead of a bunch of
detail queries, one per master record.
4) Transfer the detail records based on the following query:
select * from iblm$columnlog where operationlog_id between :start_id and
:end_id order by id asc
5) Remove source database records via:
delete from iblm$operationlog where id between :start_id and :end_id
An ON DELETE CASCADE FK constraint takes care of removing the detail
records as well.
6) Do a COMMIT of the 2PC transaction
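To make 1) to 6) concrete, here is a condensed sketch of the detail
transfer part (not the actual tool code: the TIB_DataPump property and
method names SrcStatement, DstStatement and Execute are written from
memory of the IBO help, so treat them as placeholders). StartId/EndId
arrive as parameters, captured during the master pump via its per-row
event as described in 3); the master pump itself is wired the same
way, just with the iblm$operationlog statements.

uses
  SysUtils, IB_Components;

procedure PumpDetailsAndPurge(SrcDb, DstDb: TIB_Connection;
  Tx: TIB_Transaction; StartId, EndId: Integer);
var
  Src, Dst: TIB_Cursor;
  Pump: TIB_DataPump;
  Purge: TIB_DSQL;
begin
  Src := TIB_Cursor.Create(nil);
  Dst := TIB_Cursor.Create(nil);
  Pump := TIB_DataPump.Create(nil);
  Purge := TIB_DSQL.Create(nil);
  try
    // 4) source side of the pump: the single ranged detail query
    Src.IB_Connection := SrcDb;
    Src.IB_Transaction := Tx;  // everything hangs off the one transaction
    Src.SQL.Text :=
      'select * from iblm$columnlog ' +
      'where operationlog_id between :start_id and :end_id ' +
      'order by id asc';
    Src.Prepare;
    Src.ParamByName('start_id').AsInteger := StartId;
    Src.ParamByName('end_id').AsInteger := EndId;

    // destination side: hand-written INSERT, deliberately no RETURNING
    Dst.IB_Connection := DstDb;  // second database, same transaction
    Dst.IB_Transaction := Tx;
    Dst.SQL.Text :=
      'insert into iblm$columnlog (id, operationlog_id, column_name, ' +
      'old_value, new_value, old_value_blob, new_value_blob) values ' +
      '(:id, :operationlog_id, :column_name, ' +
      ':old_value, :new_value, :old_value_blob, :new_value_blob)';

    Pump.SrcStatement := Src;  // property names from memory, verify
    Pump.DstStatement := Dst;
    Pump.Execute;

    // 5) purge the source; the ON DELETE CASCADE FK removes the
    //    iblm$columnlog rows together with their masters
    Purge.IB_Connection := SrcDb;
    Purge.IB_Transaction := Tx;
    Purge.SQL.Text :=
      'delete from iblm$operationlog ' +
      'where id between :start_id and :end_id';
    Purge.Prepare;
    Purge.ParamByName('start_id').AsInteger := StartId;
    Purge.ParamByName('end_id').AsInteger := EndId;
    Purge.Execute;

    Tx.Commit;  // 6) two-phase commit across both databases
  finally
    Purge.Free; Pump.Free; Dst.Free; Src.Free;
  end;
end;

Tx itself is created once up front with Isolation := tiConcurrency,
and statements from both connections are attached to it; as far as I
understand IBO, that is what turns it into the distributed transaction
of 1).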
I see the following in the Trace API output:
a)
INSERT INTO IBLM$COLUMNLOG
( ID
, OPERATIONLOG_ID
, COLUMN_NAME
, OLD_VALUE
, NEW_VALUE
, OLD_VALUE_BLOB
, NEW_VALUE_BLOB )
VALUES
( ? /* ID */
, ? /* OPERATIONLOG_ID */
, ? /* COLUMN_NAME */
, ? /* OLD_VALUE */
, ? /* NEW_VALUE */
, ? /* OLD_VALUE_BLOB */
, ? /* NEW_VALUE_BLOB */ )
RETURNING ID, OPERATIONLOG_ID, COLUMN_NAME, OLD_VALUE, NEW_VALUE,
OLD_VALUE_BLOB, NEW_VALUE_BLOB
Is the RETURNING part something TIB_DataPump really needs? I don't
think I need it, especially as there can be BLOB columns involved as
well, so I'd like this to be as light-weight as possible. ;-)
b)
I also see recurring execution of the detail records SELECT statement
with the same statement handle and transaction id:
select * from IBLM$COLUMNLOG
where operationlog_id between ? /* olid_start */ and ? /* olid_end */
ORDER BY id ASC
Not sure if I'm doing something wrong here, or if this is a sign that
IBO is fetching additional records from the TIB_Cursor result set?
Perhaps a FetchAll before executing the pump makes sense?
* Do you see any room for improvement in my usage of TIB_DataPump here?
* How would you implement incremental commits (sketched below)? As
everything is bound to the single transaction, even the SELECT
statements, I guess I have to call CommitRetaining, because a hard
COMMIT would close my SELECT cursors as well?
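Something like the following is what I have in mind, just to make the
question concrete (the batch size is made up, and I've deliberately
left out the event name; the counter would be bumped from whichever
TIB_DataPump event fires once per pumped row):

const
  COMMIT_INTERVAL = 1000;  // made-up batch size

var
  RowsSinceCommit: Integer = 0;

// called once per pumped row from a per-row TIB_DataPump event
procedure MaybeCommitRetaining(Tx: TIB_Transaction);
begin
  Inc(RowsSinceCommit);
  if RowsSinceCommit >= COMMIT_INTERVAL then
  begin
    // makes the work so far durable, but keeps the transaction context
    // (and with it the open SELECT cursors) alive, unlike a hard
    // Commit, which would close them
    Tx.CommitRetaining;
    RowsSinceCommit := 0;
  end;
end;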
Any comments or insights are appreciated. ;-)
Thanks!
--
With regards,
Thomas Steinmaurer (^TS^)
Firebird Technology Evangelist
http://www.upscene.com/
Do you care about the future of Firebird? Join the Firebird Foundation:
http://www.firebirdsql.org/en/firebird-foundation/