Subject Re: [Firebird-Java] Optimizing ResultSet.getBytes() - try #3
Author Roman Rokytskyy
Hi,

> I think that sending a 32k buffer requires a bit more time than sending
> the few bytes of the blob length. So most probably it will be about 1.1
> times slower. But this coefficient includes only the network time, and
> does not include the effect of more optimized memory usage, which is the
> factor I want to have optimized.

So far the network has been the bottleneck, not the garbage collector. But
ok, I accept this goal.

>> Most likely no, unless you prove significant performance gain with this
>> patch.
>
> Do you want me to publish a benchmark application that repeats blob
> fetches for different blob sizes?

Yup. I guess we should use 1k blobs together with, let's say, 1M blobs.
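For reference, a minimal sketch of such a benchmark could look like the
following. The JDBC URL, credentials and the table/column names (BLOB_TEST,
DATA, BLOB_SIZE) are made up for illustration, and the table is assumed to
be populated with blobs of the respective sizes beforehand:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class BlobFetchBenchmark {

    public static void main(String[] args) throws Exception {
        Class.forName("org.firebirdsql.jdbc.FBDriver"); // load the Jaybird driver
        try (Connection connection = DriverManager.getConnection(
                "jdbc:firebirdsql://localhost:3050/test.fdb",
                "SYSDBA", "masterkey")) {
            // compare small (1k) and large (1M) blobs, as discussed
            for (int size : new int[] { 1024, 1024 * 1024 }) {
                long bytesRead = 0;
                long start = System.nanoTime();
                for (int run = 0; run < 100; run++) {
                    try (PreparedStatement ps = connection.prepareStatement(
                            "SELECT DATA FROM BLOB_TEST WHERE BLOB_SIZE = ?")) {
                        ps.setInt(1, size);
                        try (ResultSet rs = ps.executeQuery()) {
                            while (rs.next()) {
                                // the call under discussion
                                bytesRead += rs.getBytes(1).length;
                            }
                        }
                    }
                }
                long elapsedMs = (System.nanoTime() - start) / 1_000_000;
                System.out.println("blob size " + size + ": 100 runs in "
                        + elapsedMs + " ms, " + bytesRead + " bytes read");
            }
        }
    }
}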

>> This patch looks like an attempt to optimize your particular task at
>> the expense of other cases. I still do not understand why you don't use
>> the getBinaryStream() method, which should be the most efficient way to
>> fetch a blob.
>
> 1) BLOB = binary large object. I want the code to be optimized for
> fetching LARGE blobs. I also want it optimized for memory usage, so that
> fetching MULTIPLE LARGE blobs will not produce a lot of garbage and slow
> down the whole system.

Sorry, I'm confused.

If we are going to optimize fetching large blobs (i.e. the number of
isc_get_segment calls is, let's say, greater than 10), then your current
method of fetching the length first, then allocating the buffer and releasing
that big chunk of memory afterwards is ok. This patch is already in CVS. I
still believe that getBinaryStream is better, simply because we do not need
to allocate big chunks of memory, but that is the problem of those using the
getBytes method.
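For the record, the streaming pattern I mean looks roughly like this - a
fixed, reusable buffer instead of one array of the full blob length (the
helper method, its name and the 32k buffer size are just an illustration):

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.sql.ResultSet;
import java.sql.SQLException;

public final class BlobStreaming {

    // Copies a blob column to an OutputStream through a fixed 32k buffer,
    // so no byte[] of the full blob length is ever allocated.
    public static void copyBlob(ResultSet rs, int column, OutputStream out)
            throws SQLException, IOException {
        InputStream in = rs.getBinaryStream(column);
        try {
            byte[] buffer = new byte[32 * 1024];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        } finally {
            in.close();
        }
    }
}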

First of all I want to apologize - for small blobs there is one packet fewer
than in your previous case. I had missed the blob.isEof() line. Why didn't
you add it after fetching the second segment too?

Now, if you want to optimize the memory usage pattern, I see only one
possibility there - adding a new GDS.iscGetSegment method with a signature
that accepts a preallocated memory buffer. Check AbstractJavaGDSImpl:1755 -
starting from that place the code allocates a byte array and copies data
into it from the "statically" allocated buffer in the db handle (the 64k
buffer). The old signature must be deprecated, not removed, and does not
have to be optimal at all. The new method should copy data from the buffer
returned by db.getResp_data() into the specified place.
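To make this concrete, I mean something along these lines (the parameter
names and the exact return convention are only a suggestion, and the
signature of the existing method is reproduced here from memory):

// Sketch of the proposed addition to the GDS interface; all other methods
// of the interface are omitted, and the names are illustrative only.
public interface GDS {

    // existing method: allocates and returns a fresh byte[] per segment
    // (to be deprecated, not removed)
    byte[] iscGetSegment(IscBlobHandle blob, int maxRead) throws GDSException;

    // proposed overload: copies the segment data from the response buffer
    // (db.getResp_data()) directly into the caller-supplied array and
    // returns the number of bytes actually written
    int iscGetSegment(IscBlobHandle blob, byte[] target, int offset,
            int maxRead) throws GDSException;
}

getBytes() could then fill a single preallocated array, roughly like this
(a hypothetical fragment, variable names invented for the example):

// one array for the whole blob, filled segment by segment, without copies
// through intermediate per-segment arrays
byte[] data = new byte[(int) blobLength];
int pos = 0;
while (!blob.isEof()) {
    pos += gds.iscGetSegment(blob, data, pos, data.length - pos);
}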

If you do this, please make sure to provide the diff against HEAD, not
against the 2.0.1 release code base.

> 2) There are 3rd party libraries like Hibernate, which already use
> getBytes() to fetch blob fields.

Ok, this is a good reason.

> Can we compromise by adding a parameter to the connection string, or do
> you want to benchmark it first?

No, there will be no additional connection string parameter. Either we agree
on the new patch, or the old one remains. So a benchmark would be quite
useful.

Roman