| Subject | RE: [firebird-support] RTRIM Overhead |
|---|---|
| Author | Hardy Sherwood |
| Post date | 2007-10-16T00:30Z |
Helen,
I've embedded answers to your questions below, as well as some
questions of my own.
> If there are only one or two input fields that need such a large
> container, make some more declarations, e.g., RTRIM30, etc., so that
> you reduce the number of huge padded strings being pushed around in memory.
When the RTRIM function is invoked, doesn't the input parameter's size
dictate how much data is being pushed around? If RTRIM is called with,
say, a CHAR(10) field, why would it push around more than 11 bytes,
including the null delimiter?
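If I follow your suggestion correctly, the extra declarations would look
something like the sketch below (my guess at the syntax, modelled on the
stock ib_udf script; the names RTRIM30 and RTRIM255 are made up, and
whether FREE_IT belongs there depends on the ib_udf build in use):

```sql
-- Sketch only: declarations of the same ib_udf 'rtrim' entry point with
-- different CSTRING container sizes.

-- Small container for the many short CHAR inputs:
DECLARE EXTERNAL FUNCTION RTRIM30
    CSTRING(30)
    RETURNS CSTRING(30) FREE_IT
    ENTRY_POINT 'rtrim' MODULE_NAME 'ib_udf';

-- Larger container only for the one or two wide fields that need it:
DECLARE EXTERNAL FUNCTION RTRIM255
    CSTRING(255)
    RETURNS CSTRING(255) FREE_IT
    ENTRY_POINT 'rtrim' MODULE_NAME 'ib_udf';
```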
> What was the problem with the autocast in the initial run, that
> caused you to resort to trimming?
The problem was that trailing blanks were not being removed when the
data was inserted into the internal tables. I assume that the autocast
converts a CHAR from the external table into a VARCHAR in the internal
table. I'd appreciate it if you'd let me know under what circumstances
the autocast will either handle null-delimited strings or remove
trailing spaces. I would much rather use the autocast feature than
RTRIM if it accomplishes the objective of eliminating unneeded data
from the end of the character strings that are inserted into VARCHAR columns.
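To illustrate the kind of check I have in mind (a quick sketch, not
taken from the actual load, and it assumes Firebird 2.0 or later for
CHAR_LENGTH): if the blank padding of a CHAR survives the cast to
VARCHAR, that would explain what I saw in the internal tables.

```sql
-- Sketch: does the blank padding from the CHAR source survive the cast?
SELECT CHAR_LENGTH(CAST(CAST('abc' AS CHAR(10)) AS VARCHAR(10))) AS len
FROM RDB$DATABASE;
-- If len comes back as 10 rather than 3, the trailing spaces were kept,
-- which is what the behaviour of the load suggests.
```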
I've also considered defining the columns in the external table as
VARCHAR. The problem with this approach is that alignment (also called
slack) bytes are added so that the two-byte size field that precedes
the character data in a VARCHAR is aligned on an even byte boundary.
This would be fine, except that I've experienced inconsistencies in
FB's generation of the slack bytes in the past and have therefore
stayed away from placing binary data into external tables.
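For context, the external tables are declared along these lines (a
sketch only: the file path, table and column names below are made up,
not the real schema), with fixed-width CHAR columns plus a trailing
CHAR column for the record terminator, so the file stays plain text
with no VARCHAR length words and therefore no slack bytes.

```sql
-- Sketch only: path, table and column names are hypothetical.
CREATE TABLE ext_load EXTERNAL FILE '/data/load/batch_0001.dat' (
    cust_id    CHAR(12),
    cust_name  CHAR(40),
    city       CHAR(30),
    eol        CHAR(1)   -- record terminator (LF); use CHAR(2) for CR/LF files
);
```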
> You said it was a large input file (as it must be, to take so long) -
> how frequently are you committing during the run? Are there indexes
> in the destination file?
Currently, up to about 2 GB of data is written to a sequential file.
When a sequential file reaches this size (or the end of the data is
encountered), the sequential file is inserted into an internal table
with a single SQL statement. Please let me know whether the threshold
for the size of the sequential files should be reduced from 2 GB and,
if so, to what size.
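The load step itself is essentially one statement per file, something
like the sketch below (the table and column names are placeholders
carried over from the sketch above, not the real schema; the RTRIM
calls are the part whose overhead started this thread):

```sql
-- Sketch of the per-file load; names are placeholders.
INSERT INTO cust_internal (cust_id, cust_name, city)
SELECT RTRIM(cust_id),       -- ib_udf rtrim, or a sized variant such as RTRIM30
       RTRIM(cust_name),
       RTRIM(city)
FROM ext_load;
COMMIT;  -- in effect, one commit per ~2 GB file in this scheme
```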
In one of the tables, there is one column defined as a primary key. In
all the other tables (about 140), the primary key is also defined as a
foreign key. There are no other indexes.
Thanks for your help.
Hardy