Subject UNICODE_FSS limitations
Author Steve_Miller@sil.org
Hi Peter

>Please be aware of some UNICODE_FSS limitations:
>- no checking for illegal or non-shortest form
> sequences
>- it's effectively CESU-8, not UTF-8
>- field length limit isn't correctly enforced
>(you can store 12 ASCII chars in CHAR (3) CHARACTER SET
>UNICODE_FSS).
>
>But I assume your app has plenty of charset cleverness
>built in and doesn't rely on the DBMS for this.
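
For anyone wondering what such application-side checking might look like,
here is a minimal sketch in Python. It assumes the application validates
raw bytes before they reach the database; the function name
validate_for_column and the max_chars parameter are hypothetical and not
part of any Firebird API. It relies on Python's strict UTF-8 decoder, which
rejects illegal sequences, non-shortest forms, and encoded surrogates.

```python
def validate_for_column(raw: bytes, max_chars: int) -> str:
    """Hypothetical pre-insert check done by the application, not the DBMS."""
    # Strict decoding raises UnicodeDecodeError on illegal bytes,
    # non-shortest (overlong) forms, and CESU-8 style encoded surrogates.
    text = raw.decode("utf-8")
    # Enforce the declared column length in characters (code points),
    # since the note above says the server does not enforce it correctly.
    if len(text) > max_chars:
        raise ValueError(f"{len(text)} characters exceeds limit of {max_chars}")
    return text
```

With this sketch, b"\xc0\xaf" (a non-shortest form of "/") is rejected at
the decode step, and a 12-character ASCII string is rejected for a
3-character column.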

Thanks for mentioning this. I didn't know about any of these points. I'm
concerned in particular about CESU-8. (For anyone interested, CESU-8 is
described at http://www.unicode.org/reports/tr26/) We have a sister department
that programs fonts. They do this for languages that are not yet
computerized. I'm going to have to check whether the way CESU-8 handles
supplementary characters (as a pair of 3-byte encoded surrogates instead
of a single 4-byte UTF-8 sequence) will be a problem. Does anyone know
about this?
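
To make the difference concrete, here is a small Python sketch that follows
the CESU-8 definition in TR26. The helper cesu8_encode is written only for
illustration (Python has no built-in CESU-8 codec), and the sample
character U+10400 is an arbitrary supplementary character chosen for the
example.

```python
def cesu8_encode(text: str) -> bytes:
    """Illustrative CESU-8 encoder: supplementary characters become
    two 3-byte sequences, one per UTF-16 surrogate."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp >= 0x10000:
            # Split into a UTF-16 surrogate pair.
            cp -= 0x10000
            high = 0xD800 | (cp >> 10)
            low = 0xDC00 | (cp & 0x3FF)
            for unit in (high, low):
                # "surrogatepass" lets Python emit the 3-byte form
                # of a surrogate, which strict UTF-8 forbids.
                out += chr(unit).encode("utf-8", "surrogatepass")
        else:
            # BMP characters are encoded identically in UTF-8 and CESU-8.
            out += ch.encode("utf-8")
    return bytes(out)


sample = "\U00010400"            # a supplementary (non-BMP) character
utf8 = sample.encode("utf-8")    # 4 bytes: f0 90 90 80
cesu8 = cesu8_encode(sample)     # 6 bytes: ed a0 81 ed b0 80
```

Text that stays within the Basic Multilingual Plane comes out the same
either way; only supplementary characters differ, and a strict UTF-8
decoder will reject the 6-byte CESU-8 form.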

I posted a note a day or so ago about custom collations on UNICODE_FSS,
but didn't get any response. Has anyone had any experience with custom
collations?

Steve Miller
Language Software Development
SIL International
