Subject Re: [Firebird-Java] PS about client-to-FBcore protocols regarding charset, #5.
Author Roman Rokytskyy
> Oh, I can even add a 5th proposal.
> Extend the internal protocol to allow charset specification per column, per BLOB and per statement, like one can do in XML.

This is already done for text BLOBs (they misuse some field), but AFAIK
there is no place for it in CHAR/VARCHAR columns. Adding a new field is
not possible for compatibility reasons (the C/C++ part, i.e.
fbclient.dll and all C++ and Delphi apps; in Java and .NET we are free).

> This way you would have the most efficient network throughput.

That is not the philosophy behind FB. The philosophy is that the client
is always dumber than the server, so it has to tell the server what it
is capable of, and the server will provide data that can be consumed
directly. And I believe this is right.
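
As a rough illustration of "the client tells the server what it is
capable of": in Jaybird the connection character set is declared once,
up front, through the "encoding" connection property. The sketch below
only builds the properties and does not connect; the class name, the
helper name, the credentials and the URL in the comment are placeholders
made up for this example.

```java
import java.util.Properties;

// Hypothetical helper: collects the connection properties a Jaybird
// client would pass to DriverManager.getConnection(url, props).
final class ConnectionCharset {

    // firebirdCharset is a Firebird character set name such as
    // "WIN1251" or "UTF8"; the server then delivers text in that set.
    static Properties propsFor(String user, String password, String firebirdCharset) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        props.setProperty("encoding", firebirdCharset);
        return props;
    }

    public static void main(String[] args) {
        Properties props = propsFor("SYSDBA", "masterkey", "WIN1251");
        System.out.println("declared charset: " + props.getProperty("encoding"));
        // With Jaybird on the classpath and a server running, one would
        // continue with something like (placeholder URL):
        // DriverManager.getConnection("jdbc:firebirdsql://localhost:3050/employee", props);
    }
}
```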

> You are concerned that a UTF8 payload might take 4 times the space of an SBCS payload? Convert to SBCS, tag it accordingly and pass it in compact form.
>
> You made a WIN1251 connection but then found data with umlauts? Tag the payload as WIN1252 and pass it undamaged.
>
> You made a WIN1251 connection and the data is WIN1251? Nothing to override, so just send the payload untagged, saving the sizeof(charset-tag).
>
> That would be the most effective solution. And arguably the most elegant.
>
> But it would also require the most major change in both clients and server, so I do not expect it to happen anytime soon, if at all.
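
For reference, the tagging scheme quoted above could look roughly like
this on the sending side. This is only a sketch of the idea using plain
Java charsets; nothing like it exists in the Firebird wire protocol, and
the class, field and method names are invented for the example. A null
tag stands for "untagged, use the connection charset".

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical per-value payload: bytes plus an optional charset tag.
final class TaggedPayload {
    final Charset tag;   // null = untagged, receiver assumes connection charset
    final byte[] bytes;

    private TaggedPayload(Charset tag, byte[] bytes) {
        this.tag = tag;
        this.bytes = bytes;
    }

    // Pick the first charset that can represent the value without loss:
    // the connection charset (untagged), then the given single-byte
    // fallbacks (tagged), and UTF-8 as the last resort (tagged).
    static TaggedPayload encode(String value, Charset connection, List<Charset> fallbacks) {
        if (connection.newEncoder().canEncode(value)) {
            return new TaggedPayload(null, value.getBytes(connection));
        }
        for (Charset candidate : fallbacks) {
            if (candidate.newEncoder().canEncode(value)) {
                return new TaggedPayload(candidate, value.getBytes(candidate));
            }
        }
        return new TaggedPayload(StandardCharsets.UTF_8, value.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        Charset win1251 = Charset.forName("windows-1251");
        Charset win1252 = Charset.forName("windows-1252");
        // "M\u00fcller" contains u-umlaut, which windows-1251 cannot represent.
        TaggedPayload p = encode("M\u00fcller", win1251, List.of(win1252));
        System.out.println("tag=" + p.tag + ", bytes=" + p.bytes.length);
    }
}
```

Note that the receiving side then needs a converter for every charset
that may appear in a tag, which is where the dependency mentioned below
comes from.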

It would require your Delphi installation to depend on the ICU
libraries; the Firebird client install would become something like FB
embedded.

So, I believe the rule should be either:

- All clients are able to consume data in UTF-8. Period. All Delphi
users will cry when they start to convert Unicode strings to SBCS and back.

- All clients get what they ask for. Those asking for UTF-8 get UTF-8,
those asking for WIN1251 get WIN1251, and when umlauts are stored, they
get the error "Cannot transliterate character between character sets".

Roman