Subject RE: [IBO] UTF-8 handling
Author Jason Wharton
> Jason,
>
> > If the CharSet property is set to UTF-8 then I anticipate that
> > Firebird is transliterating whatever character set the data is
> > stored in the database to UTF-8 format which then necessitates that
> > the client take that UTF-8 character data and run it through the
> > routine that transliterates the UTF-8 character format to whatever
> > the local character set is.
>
> The local character set here in western Europe is Windows 1252. If you
> assume that, I would be unable to access all the Unicode characters
> outside the Windows 1252 area.
>
> > I don't understand why you would want to turn off the
> > transliteration from raw UTF-8 to your local character set.
>
> I want to store and retrieve all possible Unicode characters in my
> database (for mixed-language applications). That's why I use UTF8 as
> the Character Set for my database columns and that's why I use UTF8 as
> the client character set. All other Client character sets would
> restrict me to the characters in ISO8859_1 or WIN1252 or whatever.
>
> > Are you saying that the built-in routine in Delphi that does the
> > transliteration is defective and you want to control it entirely
> > yourself?
>
> I am not saying that the transliteration routine is defective, I just
> want to have control over the characters when I want/need it. For
> databases that are known to be only used for German text, I will use
> ISO8859_1 for the database and the client.
>
> So if I want to have characters in, say, ISO8859_1 I specify that as
> the Client character set and everything gets transliterated to that.
> Same for WIN1252. But when I specify UTF8 as the Client character set,
> I would like to have the "real thing".
>
> As far as I understand things, the Client DLL (fbclient.dll) does all
> the transliteration, so you just have to hand over what you get from
> there?

If I don't transliterate the UTF-8 that comes back from the client library,
then nothing converts the data to the local character set before it is
presented to the user. So, by default, I invoke the Delphi routine that
transliterates from UTF-8 to the local character set automatically.
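
To make the difference concrete, here is a rough sketch of the two paths
under discussion. It is written in Python purely for illustration and is not
IBO or Delphi code; the function names to_local and keep_unicode are made up,
and Windows-1252 is assumed as the local character set, as in the quoted
message.

```python
# Illustration only: Python stands in for the conversion step on the client.
# to_local and keep_unicode are hypothetical names, not IBO or Delphi routines.

def to_local(raw_utf8: bytes, local_charset: str = "cp1252") -> str:
    """Transliterate UTF-8 bytes to the local character set.

    Characters that do not exist in the local character set are
    replaced with '?', so information is lost for them.
    """
    text = raw_utf8.decode("utf-8")
    local_bytes = text.encode(local_charset, errors="replace")
    return local_bytes.decode(local_charset)


def keep_unicode(raw_utf8: bytes) -> str:
    """Decode the UTF-8 bytes without narrowing to a local character set."""
    return raw_utf8.decode("utf-8")


# "Grüße" fits in Windows-1252; the Greek capital delta (U+0394) does not.
raw = "Gr\u00fc\u00dfe \u0394".encode("utf-8")

print(to_local(raw))      # the delta is replaced
print(keep_unicode(raw))  # every character is preserved
```

The first path is what the default transliteration amounts to; the second is
roughly what handing over the raw UTF-8 untouched would allow.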

How is this a problem for you? Please help me get a better grip on exactly
what you propose to do with the raw UTF-8 character data.

Jason