Subject | RE: [IB-Architect] Syntax for case insensitive sort |
---|---|
Author | David Schnepper |
Post date | 2000-03-29T22:22:37Z |
Boy, now I wish I had read through all the messages before spouting off
replies so quickly!
Olivier -- this is precisely what I was trying to discuss in another
message, but you do it much more effectively.
Most of the existing InterBase drivers perform 4-pass dictionary sorting for
their respective languages:
Pass 1 - A is different from B
Pass 2 - A is different from A-grave
Pass 3 - A is different from a
Pass 4 - Punctuation marks are considered
(The exceptions to this are the dBase & Pdox drivers, which match the
Paradox & dBase collation orders.)
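To make the four passes concrete, here is a minimal Python sketch of a
multi-level sort key. It is an illustration only, not the InterBase driver
code: the name dictionary_sort_key, the choice to sort lowercase before
uppercase, and the use of Unicode decomposition to detect accents are all
assumptions made for the example.

```python
import unicodedata

def dictionary_sort_key(text):
    """Return a 4-level key; each level only breaks ties left by the previous one."""
    base, accents, case, punct = [], [], [], []
    # Normalize to NFC first so a letter and its combining accent are one character.
    for pos, ch in enumerate(unicodedata.normalize("NFC", text)):
        if ch.isalnum():
            decomposed = unicodedata.normalize("NFD", ch)
            base.append(decomposed[0].lower())   # pass 1: A vs B
            accents.append(decomposed[1:])       # pass 2: A vs A-grave
            case.append(ch.isupper())            # pass 3: A vs a (assumed: lowercase first)
        else:
            punct.append((pos, ch))              # pass 4: punctuation and spaces
    return (tuple(base), tuple(accents), tuple(case), tuple(punct))

words = ["b", "à", "A", "a-", "a"]
print(sorted(words, key=dictionary_sort_key))
```

With this key the list sorts as a, a-, A, à, b: the accent difference outranks
the case difference, and punctuation only matters when everything else ties.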
For French, the rule is to uppercase accented characters to the
non-accented version, EXCEPT when an accent is needed to differentiate
from a different word that has the same uppercase. (Clearly no database
implements this context-sensitive nature of uppercase.)
For French-Canadian, the rule is to uppercase to the accented version of
the character.
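The two uppercasing rules can be sketched in Python as follows. This is a
rough illustration under stated assumptions: the function names are
hypothetical, the accent-dropping variant removes every combining mark
(including the cedilla), and the context-sensitive French exception is not
attempted at all.

```python
import unicodedata

def upper_keep_accents(text):
    """French-Canadian style: uppercase to the accented capital."""
    return text.upper()

def upper_drop_accents(text):
    """French style (simplified): uppercase, then drop all combining marks."""
    decomposed = unicodedata.normalize("NFD", text.upper())
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)

print(upper_keep_accents("école"))  # ÉCOLE
print(upper_drop_accents("école"))  # ECOLE
```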
Dave
-----Original Message-----
From: Olivier Mascia [mailto:om@...]
Sent: Wednesday, March 29, 2000 10:08 AM
To: IB-Architect@onelist.com
Subject: Re: [IB-Architect] Syntax for case insensitive sort
Still, the issue of non US-ASCII characters will need to be correctly
investigated before programming any case-insensitive thing...
Take the French language for instance, but similar or more complex issues
will be raised in other languages.
We have very common accented characters like é (e with an "accent aigu",
acute accent I guess in English). Most often when people write a word
like "école" (school) in uppercase, they write it "ECOLE". So the accent
is dropped from the "E". This is the way most people will think about it
in French. But Windows and ISO character sets also include an "É" (which
is that same E with an "accent aigu"). [--- By the way I do not know if
my email client will do things correctly so that you can display those
characters on your workstations. ---]
For a case-insensitive collation order, é, e, E, and É should be
considered equal. But for an operator that converts "école" to uppercase,
not everybody will like it to become "ÉCOLE". This is the behaviour of
the Win32 API. But for most uses people expect "ECOLE" to come out of the
'uppercasing' operator or function.
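As a small sketch of the comparison side of this, the following Python
treats é, e, E and É as equal by comparing a folded form of each string.
The helper names fold_key and equal_ignoring_case_and_accents are
hypothetical, and stripping every combining mark is an assumption that
suits this French example but not necessarily other languages.

```python
import unicodedata

def fold_key(text):
    """Strip combining accents and case so only the base letters remain."""
    decomposed = unicodedata.normalize("NFD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return no_accents.casefold()

def equal_ignoring_case_and_accents(a, b):
    return fold_key(a) == fold_key(b)

print(equal_ignoring_case_and_accents("école", "ECOLE"))  # True
print(equal_ignoring_case_and_accents("é", "f"))          # False
```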
If a choice must be made between the two, my very personal preference is
for "école" to be uppercased as "ECOLE" (so, without the accent).
I'd like to read what other people whose native language is not English
think or have to say on this subject.
---------------------------------------------------------------------
Olivier Mascia T.I.P. Group SA
om@... www.tipgroup.com
Director, Chief Software Architect +32 65 401111