Subject | Re: thai charset support? |
---|---|
Author | peter_jacobi.rm |
Post date | 2003-11-03T07:31:37Z |
Hi somphong,
--- In firebird-support@yahoogroups.com, "sp2gui" <sp2gui@y...> wrote:
> i appreciate to be a tester.
> and i will ask for other people to test it too.
> thank you for your kindly and quickly response.
May I ask you to have an actual look into
http://www.nectec.or.th/it-standards/thaistd.pdf,
especially the chapter about sorting?
The doc describes a carefully laid out algorithm for
dictionary sorting, using a full Unicode four-level
sort (if this is all Greek to you, perhaps you can
contact a local expert).
From experience with other locales, I can report that
'normal' users often find the sorting designed by
'experts' somewhat unintuitive and would like another
sort order. If you have any specific comments, please
bring them forward.
Another implementation decision to be made is the
definition of 'character'. What makes up one character
will obviously influence the meaning of field lengths
(as in 'CHAR(4)'), what can be matched against a single
character wildcard in LIKE, the operation of the SUBSTR
operator and some other minor points.
Perhaps this decision is so obvious to native speakers of
Thai that it is not addressed in thaistd.pdf? The easiest,
pragmatic solution is of course one character = one byte,
which will also give the best performance. Another natural
choice seems to be the 'cell' as mentioned in '3 Input Methods',
i.e. base character + upper or lower vowel + diacritic or tone marker,
the same unit which will be deleted as a whole when pressing the
"Delete" key.
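To make the two choices concrete, here is a rough Python sketch (not Firebird code, and not taken from thaistd.pdf) that counts the same word both ways. The helper name `thai_cells` is made up for this illustration, and it assumes a 'cell' can be approximated as one base character followed by any Unicode nonspacing marks (general category Mn), which covers the Thai upper/lower vowels and tone marks. The real definition in the standard may well differ.

```python
import unicodedata

def thai_cells(text):
    """Split text into approximate display cells.

    Assumption for this sketch: a cell is one base character plus any
    following nonspacing marks (Unicode category 'Mn').
    """
    cells = []
    for ch in text:
        if cells and unicodedata.category(ch) == "Mn":
            # Upper/lower vowel or tone mark: attach to the current cell.
            cells[-1] += ch
        else:
            # Base character (or a mark with nothing before it): new cell.
            cells.append(ch)
    return cells

# THO THAHAN + SARA II + MAI EK: one cell on screen, three code points.
word = "\u0e17\u0e35\u0e48"

byte_length = len(word.encode("tis_620"))  # one character = one byte
cell_length = len(thai_cells(word))        # one character = one cell

print(byte_length, cell_length)
```

Under the byte definition this word already fills three of the four positions of a CHAR(4); under the cell definition it uses only one, and a single-character wildcard in LIKE would match the whole stack.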
Regards,
Peter Jacobi