Subject | UNICODE_FSS limitations |
---|---|
Author | Steve_Miller@sil.org |
Post date | 2003-12-12T22:38:20Z |
Hi Peter
>Please be aware of some UNICODE_FSS limitations:
>- no checking for illegal or non-shortest form
> sequences
>- it's effectively CESU-8, not UTF-8
>- field length limit isn't correctly enforced
>(you can store 12 ASCII chars in CHAR (3) CHARACTER SET
>UNICODE_FSS).
>
>But I assume your app has plenty of charset cleverness
>built in and doesn't rely on the DBMS for this.

Thanks for mentioning this. I didn't know about any of these points. I'm
concerned in particular about CESU-8. (For anyone interested, CESU-8 is
described at http://www.unicode.org/reports/tr26/) We have a sister department
that programs fonts. They do this for languages that are not yet
computerized. I'm going to have to check whether the restriction against
4-byte UTF-8 sequences for supplementary characters (CESU-8 stores them as
surrogate pairs instead) will be a problem. Does anyone know about this?
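To make the CESU-8 concern concrete, here is a minimal Python sketch of how the two encodings differ for a supplementary character. The helper name `cesu8_encode` is my own, not part of Firebird or of any library, and the sketch follows the description in UTR #26 only; I have not checked it against what UNICODE_FSS actually stores.

```python
def cesu8_encode(text: str) -> bytes:
    """Encode text as CESU-8, following the description in UTR #26.

    BMP characters are encoded exactly as in UTF-8. A supplementary
    character (above U+FFFF) is first split into a UTF-16 surrogate
    pair, and each surrogate is then written as its own 3-byte sequence.
    Hypothetical helper for illustration only.
    """
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp < 0x10000:
            # BMP: identical to UTF-8 (1 to 3 bytes)
            out += ch.encode("utf-8", "surrogatepass")
        else:
            # Supplementary: build the surrogate pair, 3 bytes each
            v = cp - 0x10000
            high = 0xD800 + (v >> 10)
            low = 0xDC00 + (v & 0x3FF)
            for unit in (high, low):
                out += chr(unit).encode("utf-8", "surrogatepass")
    return bytes(out)


deseret = "\U00010400"  # a supplementary-plane character
print(deseret.encode("utf-8").hex(" "))  # f0 90 90 80
print(cesu8_encode(deseret).hex(" "))    # ed a0 81 ed b0 80

# A strict UTF-8 decoder rejects the CESU-8 form of this character.
try:
    cesu8_encode(deseret).decode("utf-8")
except UnicodeDecodeError as exc:
    print("strict UTF-8 decode failed:", exc.reason)
```

So text that stays within the BMP comes out byte-for-byte the same either way, and the difference only shows up for supplementary characters: 4 bytes in UTF-8 versus 6 bytes in CESU-8, which a strict UTF-8 reader will refuse.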
I posted a note a day or so ago about custom collations on UNICODE_FSS,
but didn't get any response. Has anyone had any experience with custom
collations?
Steve Miller
Language Software Development
SIL International