Subject Re: UTF-8 vs UTF-16
Author adem
Hi Peter,

> > It seems you all agree that some form of
> > Unicode is a good idea for the server and
> > the charset issue is best left with the
> > client to worry about.
> Count me as the disagreeing one. I'd like
> better UNICODE support (and I'm trying to
> contribute to it), but I'd prefer the
> other charsets not to vanish.

Nor do I, actually. Not for the foreseeable future,
at least.

> > Now, this might be too obvious, but to me it
> > seems only a natural extension to let/require
> > the client to upload whatever collation order
> > it desires to the server.
> >
> > That way, this collation order headache would
> > be completely removed off the server developers
> > and would give the code developers quite a bit
> > of freedom to pick and choose their own collation
> > orders.
> This is a sexy idea, but it's not that much work
> removed from the server.

I wasn't trying to lighten the load on the server,
but on the developers of the server, that is, many
of you guys :-)

> Nowadays the server gets the collation info from a
> DLL, in your proposal it would read some table in
> the database.

And a DLL is always something extra. The way FB is
going it is likely to get very popular (this is not
me being hopeful, it is almost tangible), and as the
numbers increase so will the numbers of people asking
for different collation orders and therefore DLLs.

Not all of them can be expected to compile their
own DLLs for the platform the server is running on,
can they?

Plus, I simply cannot see the reason why something
that could be solved (I am assuming this) by using
FB's internal/system tables should be hardwired into
some DLL and have to be installed separately.

> The actual use of the collation info to calculate
> keys and compare strings would be the same.

I should have put this disclaimer in my previous post:
I am nowhere near architect calibre, so all I am
doing is using my intuition. Hence, I presume the
above is naturally right.
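
To make the point about "calculating keys and comparing strings"
concrete, here is a minimal sketch, assuming the collation is nothing
more than a per-character weight table. The names COLLATION, sort_key
and compare are hypothetical and the table contents are made up for
illustration; this is not how FB actually stores or applies collations.
The key-building code is the same whether the table came from a DLL or
from a database table.

```python
# Hypothetical collation table: character -> sequence number (weight).
# In the proposal it would be read from a system table; today the
# equivalent data sits inside a DLL. The ordering below is illustrative.
COLLATION = {ch: i for i, ch in enumerate("aäbcdefghijklmnoöpqrsßtuüvwxyz")}


def sort_key(text, table=COLLATION):
    """Build a byte-string key whose binary order follows the table."""
    unknown = len(table)  # characters not in the table sort last
    return bytes(table.get(ch, unknown) for ch in text.lower())


def compare(a, b, table=COLLATION):
    """Return -1, 0 or 1 by comparing the two keys byte by byte."""
    ka, kb = sort_key(a, table), sort_key(b, table)
    return (ka > kb) - (ka < kb)


words = ["zebra", "äpfel", "apfel", "bar"]
ordered = sorted(words, key=sort_key)
```

With this table "äpfel" sorts between "apfel" and "bar", whereas a
plain code-point sort would put it after "zebra".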

> Also in the current schema the collation designer can
> carefully handcraft the collation code to use as few bits
> per character in the key as possible. If general
> collation info is given at runtime, either the keys will
> get longer or some very clever collation compiler
> would have to be included in the server.
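
The "few bits per character" remark can be illustrated with a rough
sketch. It assumes a hypothetical 26-letter table (TABLE) whose weights
fit in 5 bits, and a made-up packing function (packed_key); the real
key format used by the server is not shown in this thread and may
differ. Weight 0 is reserved so that zero padding keeps a shorter
string ahead of any longer string that starts with it.

```python
from string import ascii_lowercase

# Hypothetical table: weights start at 1 so that 0 can serve as padding.
TABLE = {ch: i + 1 for i, ch in enumerate(ascii_lowercase)}

# Smallest bit width that holds the largest weight (5 bits for 26).
BITS = max(TABLE.values()).bit_length()


def packed_key(text, table=TABLE, bits=BITS):
    """Pack each character's weight into `bits` bits, first char first."""
    acc = 0
    for ch in text:
        acc = (acc << bits) | table[ch]
    total = bits * len(text)
    pad = -total % 8  # zero bits needed to fill the last byte
    return (acc << pad).to_bytes((total + pad) // 8, "big")


def byte_key(text, table=TABLE):
    """The naive alternative: one whole byte per character."""
    return bytes(table[ch] for ch in text)


word = "collation"
short_len = len(packed_key(word))  # 9 chars * 5 bits = 45 bits -> 6 bytes
long_len = len(byte_key(word))     # 9 chars * 8 bits -> 9 bytes
```

A table supplied at runtime would need either a generic (wider) weight
per character or something that works out the narrowest width, as the
one-line BITS calculation does here for this trivial case.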

Again, please forgive my ignorance, but deep down,
isn't a collation order some form of an array with the
character code on one side and its sequence number on
the other? If so, what do you mean by collation