Subject [IBO] Re: D2007
Author ibobjects
Stefan,

> did you receive my last (private) mail? (in response to
> your mail from Tue, 10 Jul 2007 00:19:04 -0700)

Yes, your help was very much appreciated.

I have finished up support for UTF8 transliteration for all CHAR and
VARCHAR columns. This includes the lengthy ones that TDataset
insists on handling through a TMemoField, which works with streams
rather than raw strings.
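
For illustration, here is a minimal sketch (the column name is
hypothetical, DataSet is assumed to be some open dataset, and the
Classes and DB units are needed) of how one of these lengthy columns
surfaces through the stream interface rather than as a plain string:

var
  Stream: TMemoryStream;
  Raw: AnsiString;
begin
  Stream := TMemoryStream.Create;
  try
    // TMemoField descends from TBlobField, so the value comes
    // through a stream rather than through AsString.
    (DataSet.FieldByName('LONG_VARCHAR_COL') as TBlobField)
      .SaveToStream(Stream);
    SetLength(Raw, Stream.Size);
    Stream.Position := 0;
    if Stream.Size > 0 then
      Stream.ReadBuffer(Raw[1], Stream.Size);
    // Raw now holds the raw bytes to transliterate.
  finally
    Stream.Free;
  end;
end;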

I have yet to accomplish this with BLOB MEMO columns. I'm wondering
whether this shouldn't be handled using a blob filter instead. My
hunch is that this could be very problematic with UTF8 encoding,
because a single effective character that maps to multiple bytes
could get severed at a blob segment boundary. Thus, the data of a
segment shouldn't be transliterated unless you know for sure that
the character sequence is complete. I suspect this is going to be a
real pain to deal with in the engine in terms of character length
vs. byte length issues.
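
To make the boundary problem concrete, here is a quick sketch (my
illustration only, not engine or IBO code) of the check that would
be needed at every segment boundary: count how many trailing bytes
of a segment belong to an unfinished UTF8 sequence and must be held
back until the next segment arrives:

function IncompleteUTF8Tail(const Buf: array of Byte): Integer;
var
  i, Expected: Integer;
begin
  Result := 0;
  i := High(Buf);
  // Step back over UTF8 continuation bytes (10xxxxxx).
  while (i >= 0) and ((Buf[i] and $C0) = $80) do
    Dec(i);
  if i < 0 then
  begin
    Result := Length(Buf); // nothing but continuation bytes
    Exit;
  end;
  // How many bytes does the lead byte announce?
  if (Buf[i] and $80) = $00 then
    Expected := 1                  // 0xxxxxxx: single byte
  else if (Buf[i] and $E0) = $C0 then
    Expected := 2                  // 110xxxxx
  else if (Buf[i] and $F0) = $E0 then
    Expected := 3                  // 1110xxxx
  else if (Buf[i] and $F8) = $F0 then
    Expected := 4                  // 11110xxx
  else
    Expected := 1;                 // invalid lead byte: pass through
  // Fewer bytes present than announced means an incomplete tail.
  if Length(Buf) - i < Expected then
    Result := Length(Buf) - i;
end;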

What I plan to do is always pull the blob as a whole prior to
transliteration and buffer only the transliterated version. I'll
need to add a flag to the blob buffer to tell whether or not the
contents have been transliterated. That way, when it is time to post
to the server, it will post back a re-transliterated version of the
data.
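
Roughly, and with purely illustrative names (FetchWholeBlob,
WriteWholeBlob and the Transliterate* routines below are
hypothetical helpers, not actual IBO internals), the flow I have in
mind is:

type
  TBlobBufferSketch = record
    Data: TMemoryStream;     // the whole blob, pulled in one piece
    Transliterated: Boolean; // True once Data holds the UTF8 form
  end;

procedure LoadBlob(var Buf: TBlobBufferSketch);
begin
  // Pulling the whole blob before transliterating means no
  // multi-byte character can be severed at a segment boundary.
  FetchWholeBlob(Buf.Data);          // hypothetical helper
  TransliterateToClient(Buf.Data);   // hypothetical helper
  Buf.Transliterated := True;
end;

procedure PostBlob(var Buf: TBlobBufferSketch);
begin
  // The flag tells the post-time code whether the contents must be
  // transliterated back before they go to the server.
  if Buf.Transliterated then
    TransliterateToServer(Buf.Data); // hypothetical helper
  WriteWholeBlob(Buf.Data);          // hypothetical helper
end;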

Let me know if you think this strategy sounds good.

Thanks,
Jason Wharton
www.ibobjects.com