Subject Re: [Firebird-Java] Possible memory leak in EncodingFactory ?
Author Roman Rokytskyy
> Roman, where I can download the AS3AP benchmark test you used?

From our CVS - the module name is Benchmarks, and it contains
implementations for C# and Java.

> And lets continue about caching. I want to describe how it can be
> implemented and if we agree on the way - I will make the patch.

> 1) Always use sharedBufferC in Encoding_OneByte

Ok.

> 2) Cache Encoding objects in GDSHelper, so all fields of the
> connection use the same object.
>
> 3) Remove the
>
> protected String iscEncoding = null;
> protected String javaEncoding = null;
> protected String mappingPath = null;
>
> from each FBField - they are obtained from GDSHelper and are needed
> only for encoding/decoding the strings.

Ok.

> 4) Move methods encodeString/decodeString from XSQLVAR to FBField

Probably, but that will break the grouping of the encoding/decoding methods
in XSQLVAR, so unless you present a really good reason, I'd prefer not to do
this.

> 5) Implement in GDSHelper the caching method getEncoding() it would
> call non-caching implementaion in EncodingFactory and save obtained
> object for reuse.

See below.

> The problems:

> 1) In the current implementation every field uses default database
> encoding - the field encoding is not used. Is it intentional
> optimization ?

That is not an optimization, but the only possible correct implementation.
When we tell Firebird our connection encoding (the isc_dpb_lc_ctype
property), all strings should be sent in that encoding. If we start sending
each field in its own encoding, we will corrupt the contents of the strings.
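
To illustrate the corruption, here is a minimal sketch in plain Java. It does
not use any Jaybird classes; the class and method names are hypothetical, and
ISO-8859-1 and UTF-8 are only stand-ins for "connection encoding" and "field
encoding". The receiving side decodes the bytes with the connection encoding,
so bytes produced with any other encoding come back changed.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the driver.
final class ConnectionEncodingSketch {

    /**
     * Encodes the value with senderCharset, then decodes the bytes with
     * connectionCharset, the way a receiver that trusts the declared
     * connection encoding would.
     */
    static String roundTrip(String value, Charset senderCharset, Charset connectionCharset) {
        byte[] wire = value.getBytes(senderCharset);
        return new String(wire, connectionCharset);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "cafe" with an acute accent on the e
        Charset connection = StandardCharsets.ISO_8859_1;

        // Sender honours the connection encoding: the string survives.
        System.out.println(original.equals(roundTrip(original, connection, connection)));

        // Sender uses a different, per-field encoding: the string is corrupted.
        System.out.println(original.equals(roundTrip(original, StandardCharsets.UTF_8, connection)));
    }
}
```

The first call round-trips cleanly; in the second the accented character is
turned into two unrelated characters.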

> 2) Is there any chance somebody will be using different objects
> belonging to the same connection from different threads?

Oh yes! That is one of the requirements of the JDBC specification and people
use it.

> The code in decodeFromCharset is not thread safe intentionally, so if
> it is the common use case to work with connection from different
> threads (I personally always use each connection on the single thread)
> the Encoding caching should be done per Statement/ResultSet objects.

But that is what happens right now - each XSQLVAR (the lowest abstraction
for the database field) contains its own Encoding instance with its
internal buffers (sharedBufferB and sharedBufferC) allocated to the
appropriate size. When the PreparedStatement or ResultSet accesses the
XSQLVAR object, only the XSQLVAR.sqldata field is updated; the remaining
fields stay the same.
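
A rough sketch of that arrangement, for illustration only: the classes below
are hypothetical stand-ins, not the real XSQLVAR or Encoding code, and
ISO-8859-1 is assumed as the one-byte character set. Each field object owns
its decoder and the decoder's reusable buffer, so two fields never share a
buffer, and only the data changes between rows.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical one-byte decoder that reuses an internal char buffer between calls. */
final class BufferedDecoderSketch {
    private final char[] byteToChar = new char[256];
    // Reused scratch space; sharing one instance between threads would be unsafe.
    private char[] bufferC;

    BufferedDecoderSketch(int initialSize) {
        byte[] allBytes = new byte[256];
        for (int i = 0; i < 256; i++) {
            allBytes[i] = (byte) i;
        }
        // ISO-8859-1 maps every byte to exactly one char, so this yields 256 chars.
        new String(allBytes, StandardCharsets.ISO_8859_1).getChars(0, 256, byteToChar, 0);
        bufferC = new char[initialSize];
    }

    String decode(byte[] data) {
        if (data.length > bufferC.length) {
            bufferC = new char[data.length]; // grow the buffer when a value does not fit
        }
        for (int i = 0; i < data.length; i++) {
            bufferC[i] = byteToChar[data[i] & 0xFF];
        }
        return new String(bufferC, 0, data.length);
    }
}

/** Hypothetical field descriptor: one decoder per field, only sqldata changes. */
final class FieldSketch {
    private final BufferedDecoderSketch encoding;
    byte[] sqldata;

    FieldSketch(int declaredLength) {
        this.encoding = new BufferedDecoderSketch(declaredLength);
    }

    String getString() {
        return encoding.decode(sqldata);
    }
}

final class PerFieldEncodingDemo {
    public static void main(String[] args) {
        FieldSketch name = new FieldSketch(32);
        // Fetching a new row replaces only the data; the decoder is kept.
        name.sqldata = "first row".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(name.getString());
        name.sqldata = "second row".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(name.getString());
    }
}
```

The cost of this layout is one decoder and one buffer per field for as long
as the owning statement or result set is reachable.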

The per-XSQLVAR buffers were reported as a problem, since the XSQLVAR objects
remain in memory until the Statement or ResultSet objects are garbage
collected, and each XSQLVAR holds its allocated buffers. Personally I do not
see this as a problem, since the allocation happens once per field per
statement/result set and in total that should be a few kilobytes of memory.
But that is the theory; I did not check it in a profiler.

Rick proposed a change that removes the need for sharedBufferB and
sharedBufferC, while the Encoding instances remain cached at the XSQLVAR
level.

So, considering everything said above, do you still think we need another
level of caching?

Roman