firebird-architect - Re: [Firebird-Architect] Re: UTF-8 (various)

Subject	Re: [Firebird-Architect] Re: UTF-8 (various)
Author	Daniel Rail
Post date	2005-03-04T12:16:18Z

Hello Alexandre,

Friday, March 4, 2005, 2:27:25 AM, you wrote:

> Of course a slow right answer is better than a fast wrong one, but I think
> that having every query do a natural scan of the tables to perform the
> collation analysis at run time will not work either.

> If I understand correctly, the problem will be how to create indices if we
> have a common charset inside the engine, and how those indices will be
> evaluated with the different charsets/collations defined by sessions
> (queries). Given this, wouldn't it be enough for the CREATE INDEX statement
> to have another clause that specifies the collation order?

> Something like

> Create index MyIndex on MyTable(MyColumnA) collate PT_BR

> When a query is executed using charset ISO8859_1 and the PT_BR
> collation, the index could be used; if another session (query?) executes
> the same query but uses a different charset/collation, the index could not
> be used.

> Or is this nonsense ?

Why not create indices for all the collations that will be used, and
let the optimizer choose the appropriate one? (Here I simply mean that
the developer will have to create the multiple indices that their
applications will be using.) The optimizer will then be able to
determine which index to use by looking up which collation is
specified (implicitly or explicitly). The collations could probably be
applied to UTF-8, so you'd have a common set of collations for the
different character sets, although some collations might not make
sense for some character sets. That could be dealt with by
matching up character sets with collations, as in RDB$COLLATIONS, while
the core character set would always be UTF-8. Looking through
RDB$COLLATIONS, I do see that all collation names are distinct across all
of the character sets, not just within one character set.
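To make the idea concrete, here is a minimal sketch in Python of how an optimizer might pick among several indices on the same column by collation. This is only an illustration, not how the Firebird optimizer actually works: the `Index` record, the `choose_index` function, and the index names are all hypothetical, and the collation names are used purely as labels.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Index:
    """Hypothetical catalog entry: one index built under one collation."""
    name: str
    table: str
    column: str
    collation: str


def choose_index(indices: Iterable[Index], table: str, column: str,
                 collation: str) -> Optional[Index]:
    """Return the first index on (table, column) built with the requested
    collation, or None if the developer did not create one."""
    for idx in indices:
        if (idx.table, idx.column, idx.collation) == (table, column, collation):
            return idx
    return None  # no usable index; the caller falls back to a natural scan


# The developer creates one index per collation the applications will use.
indices = [
    Index("MYINDEX_PT_BR", "MYTABLE", "MYCOLUMNA", "PT_BR"),
    Index("MYINDEX_FR_CA", "MYTABLE", "MYCOLUMNA", "FR_CA"),
]

pt_br_choice = choose_index(indices, "MYTABLE", "MYCOLUMNA", "PT_BR")
de_de_choice = choose_index(indices, "MYTABLE", "MYCOLUMNA", "DE_DE")
```

In this sketch a query running under PT_BR finds its matching index, while a query under a collation with no index gets `None` and would have to be answered by a natural scan.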

Also, I think CREATE DATABASE should be extended so that it can specify
a default collation for the database. Then, if no collation is
explicitly specified when creating a field, the field would use the
default collation specified for the database instead of the
binary collation of the character set.
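The fallback order proposed here could be sketched like this in Python. It is only an illustration of the rule, assuming a hypothetical `effective_collation` helper; the collation names passed in are placeholders, and none of this reflects existing engine code.

```python
from typing import Optional


def effective_collation(field_collation: Optional[str],
                        database_default: Optional[str],
                        charset_binary: str) -> str:
    """Resolve the collation for a new field (hypothetical helper).

    Order of precedence in this sketch:
      1. the collation given explicitly on the field,
      2. the proposed database-wide default collation,
      3. the binary collation of the character set.
    """
    if field_collation is not None:
        return field_collation
    if database_default is not None:
        return database_default
    return charset_binary


# Placeholder names: "PT_BR" as the database default, "UTF8" standing in
# for the binary collation of the character set.
explicit = effective_collation("FR_CA", "PT_BR", "UTF8")
from_database = effective_collation(None, "PT_BR", "UTF8")
from_charset = effective_collation(None, None, "UTF8")
```

With no database default set, the result is the same as today's behaviour: the field ends up with the character set's binary collation.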

--
Best regards,
Daniel Rail
Senior Software Developer
ACCRA Group Inc. (www.accra.ca)
ACCRA Med Software Inc. (www.filopto.com)