Subject | INSERT Speed Optimization Ideas
---|---
Author | robert_difalco
Post date | 2005-06-05T15:40:02Z
I have a distributed architecture where my server handles all
interaction with the database server. Clients (or agents) post data
(let's call it metrics) to the server, and the server then inserts this
data into the database. There are about three related tables; one holds
attribute data and contains a long KEY and a blob VALUE. The
attributes table uses a blob because the attributes can be
unpredictable and are usually serialized objects. There can be many
agents posting a lot of data, so the inserts need to be fast.
One thing I discovered, which is probably obvious, is that inserting
with blobs can be slow, maybe particularly with JDBC: one has to
create a blob handle, stream to it, and close it. Writing my attribute
data to a VARCHAR OCTETS field seems to be much faster; however, the
size of this data is not consistent and could be larger than a
VARCHAR. It is more likely that the data will fit into a VARCHAR, but
there are a small number of cases where it might not.
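For concreteness, here is a rough sketch of the two JDBC write paths I'm comparing. The table and column names (ATTRIBUTES, ATTR_KEY, ATTR_BLOB, ATTR_OCTETS) are just placeholders, not my actual schema:

```java
import java.io.ByteArrayInputStream;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class AttributeWriter {

    // BLOB path: the driver creates a blob handle and streams the bytes to it.
    static void insertAsBlob(Connection con, long key, byte[] data) throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(
                "INSERT INTO ATTRIBUTES (ATTR_KEY, ATTR_BLOB) VALUES (?, ?)")) {
            ps.setLong(1, key);
            ps.setBinaryStream(2, new ByteArrayInputStream(data), data.length);
            ps.executeUpdate();
        }
    }

    // VARCHAR OCTETS path: the bytes travel inline with the row data.
    static void insertAsOctets(Connection con, long key, byte[] data) throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(
                "INSERT INTO ATTRIBUTES (ATTR_KEY, ATTR_OCTETS) VALUES (?, ?)")) {
            ps.setLong(1, key);
            ps.setBytes(2, data); // only works if data.length fits the declared VARCHAR size
            ps.executeUpdate();
        }
    }
}
```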
I've come up with a solution that I wanted to get feedback on. I've
added an extra VARCHAR AS OCTETS field to the attributes table. When
inserting metrics, I get the size of the serialized attributes: if
they are larger than 1024 bytes, I write them to the BLOB field; if not,
I write the bytes to the VARCHAR field. On reads, I just check the
VARCHAR field; if it is NULL, then I know the data is in the BLOB field.
This seems to have increased INSERT speed quite a bit -- probably by
20% or more. Note that I never use this data in QUERIES (it just comes
back in result sets based on other QUERIES).
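A minimal sketch of what the write and read sides look like, again with placeholder column names and the 1024-byte cutoff (the exact java.sql.Types codes used for setNull may vary with the driver's type mapping):

```java
import java.io.ByteArrayInputStream;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Types;

public class AttributeStore {

    private static final int OCTETS_LIMIT = 1024;

    // Write: small payloads go inline into the VARCHAR OCTETS column,
    // large ones fall back to the BLOB column; the other column stays NULL.
    static void insert(Connection con, long key, byte[] data) throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(
                "INSERT INTO ATTRIBUTES (ATTR_KEY, ATTR_OCTETS, ATTR_BLOB) VALUES (?, ?, ?)")) {
            ps.setLong(1, key);
            if (data.length <= OCTETS_LIMIT) {
                ps.setBytes(2, data);
                ps.setNull(3, Types.BLOB);
            } else {
                ps.setNull(2, Types.VARCHAR);
                ps.setBinaryStream(3, new ByteArrayInputStream(data), data.length);
            }
            ps.executeUpdate();
        }
    }

    // Read: check the VARCHAR column first; if it is NULL, the payload is in the BLOB.
    static byte[] read(Connection con, long key) throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(
                "SELECT ATTR_OCTETS, ATTR_BLOB FROM ATTRIBUTES WHERE ATTR_KEY = ?")) {
            ps.setLong(1, key);
            try (ResultSet rs = ps.executeQuery()) {
                if (!rs.next()) {
                    return null;
                }
                byte[] inline = rs.getBytes(1);
                return inline != null ? inline : rs.getBytes(2);
            }
        }
    }
}
```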
My question is (a) does this sound like a good solution for speeding
up my inserts and (b) are there hidden gotchas in this solution that I
am not considering?
This table doesn't really have indices besides its primary key, so
that approach really won't impact performance, and even if it did I
don't think the insert performance change would be as dramatic as with
this solution.
TIA,
Robert