Subject | Re: [firebird-support] UTF-8 |
---|---|
Author | Stefan Heymann |
Post date | 2006-10-17T07:20:50Z |
Nigel,
> This is possibly a stupid question, but I'm building a system that needs to
> support any/all known languages
It's not stupid, because you're not alone :-)
> Does Firebird fully support UTF-8 in varchars and blobs?
UTF-8 support is built into Firebird 2. Firebird up until 1.5 only has
a somewhat weak implementation of UTF-8.
> Is this the best charset for all languages?
Yes, UTF-8 is the best charset for all languages. All other character
sets are unable to store characters for all languages at the same
time.
An example:
When you use ISO8859_1 or WIN1252, you will be able to store text for
probably all western European languages (like English, French, German,
Italian, etc.). But you will get problems with eastern European
languages like Czech or Polish. Their special characters will get
modified. Asian language text like Japanese will not be storable at
all.
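To make the example above concrete, here is a minimal sketch in plain Python. It does not touch Firebird at all; it only uses Python's own codecs (named iso8859_1, cp1252 and utf_8 there) to show which sample texts fit into which character set. The sample strings and the helper name can_store are my own assumptions for illustration.

```python
# Sample texts; the strings themselves are illustrative assumptions.
samples = {
    "German": "Grüße",
    "Czech": "Příliš žluťoučký kůň",
    "Japanese": "日本語",
}


def can_store(text: str, charset: str) -> bool:
    """Return True if every character of text exists in the given charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# Print a small matrix: which language fits into which character set.
for language, text in samples.items():
    for charset in ("iso8859_1", "cp1252", "utf_8"):
        print(f"{language:9} {charset:10} {can_store(text, charset)}")

# If the encoder is told to substitute instead of failing, the characters
# that are missing from the charset are replaced by "?", so the text is
# silently modified.
print("Příliš".encode("iso8859_1", errors="replace"))
```

The German sample fits into all three character sets, while the Czech and Japanese samples only fit into UTF-8. How a real database reacts to such characters (error or substitution) depends on the server and driver, so treat the last line only as an illustration of what "modified" can look like.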
Only UTF-8 will give you the possibility to store all languages, and
all in one string (varchar or text blob). Your client software must
however be written with Unicode in mind.
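What "written with Unicode in mind" can mean on the client side is sketched below, again in plain Python without any database access. It assumes the client receives the raw UTF-8 bytes of a string; the variable names are hypothetical. The point is that the client has to decode those bytes as UTF-8, and that it must not confuse the number of characters with the number of bytes.

```python
# One string mixing Czech and Japanese, as UTF-8 allows.
text = "Příliš 日本語"

# The bytes a UTF-8 connection would hand to the client (assumption: the
# client sees the raw encoded bytes).
wire_bytes = text.encode("utf-8")

# A Unicode-aware client decodes with the same charset and gets the text back.
correct = wire_bytes.decode("utf-8")

# A client that assumes a single-byte charset turns every multi-byte
# character into several wrong characters.
garbled = wire_bytes.decode("cp1252", errors="replace")

print(correct)
print(garbled)

# Characters and bytes are different counts in UTF-8.
print(len(text), "characters,", len(wire_bytes), "bytes")
```

A client that treats one byte as one character will both display the wrong text and miscalculate string lengths, which is why the client side matters as much as the database character set.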
You can get more information on my Firebird Character Set page at
http://www.destructor.de/firebird/charsets.htm
and you could take a look at the slides of my talk at last year's
Firebird Conference about "What Developers Should Know about Character
Sets, Unicode etc.":
http://www.destructor.de/talks/fb2005-charsets.zip
If you plan to come to this year's conference, this talk might
interest you:
http://www.ibphoenix.com/main.nfs?a=ibphoenix&page=fb_conf_speakers_2006#heymann
Best Regards
Stefan
--
Stefan Heymann
www.destructor.de/firebird