Subject | Re: [firebird-support] How to change charset of a DB ? |
---|---|
Author | Helen Borrie |
Post date | 2010-10-22T20:47:29Z |
At 02:22 AM 23/10/2010, you wrote:
If you have a need to store certain columns in a different charset, that is incompatible with the client environment, then it gets more tricky - use of casting or introducer syntax in the client application may become necessary. It would normally make more sense to store the data in UTF-8 and apply a specific collation to that column.
Ask in firebird-tools about data pumping utilities you might experiment with.
./heLen
>Hello all,..a crocodile waiting for its chance to bite..
>
>After a successful upgrade from 2.1 to 2.5 I realized that I was not too
>carefull about the character settings so far.
>I want now to ensure that UTF-8 is in place in the whole DB.A database has the default character set NONE unless you create the database with the desired default character set - see the language docs and updates at the DocIndex page.
>1) adapt db settings: alter character set UTF-8;This would not do anything to existing data or metadata.
>2) Check all char/vchar fields: most of them have CHARACTER SET NONEYou can discover this from a metadata extraction tool - isql has one; most third-party toolsets have better ones.
>COLLATE NONE or even no character set definition on field level at allThe default collation is always the binary one. Charset NONE has no other collation available. It knows only how to uppercase the 7-bit characters; everything beyond dec 128 is "lost souls".
>3) Check the ODBC settings to use UTF-8 as wellAsk about that on the ODBC-devel list. Each client interface has its own way to pass connection parameters.
>Questions:Yes. Basically, get a metadata extract, edit it to reflect the new requirements and then pump the data across from the existing database.
>a) In order ensure that all existing char/varchar fields are storing the
>data in UTF-8 do need to export the whole DB and import structure and
>data again?
>b) Do I need to adapt a char/varchar fields and set them to empty, NONENo. If you intend for all character fields to be UTF-8 then your datapump will take care of it. If you require a particular collation then put it into the column definition when you edit the metadata script.
>or UTF-8 ?
>c) What happens if a client does not define a charset or defines aStrange things can happen, especially in the lower sub-releases of v.2.1.x...although the Intl machine from 2.1.3 onward is pretty good at telling you about mismatches and blocking them with exceptions. Set your client to the charset that is most compatible with the input data and Intl will do the right thing regarding storage and retrieval.
>different one than UTF-8 ?
If you have a need to store certain columns in a different charset, that is incompatible with the client environment, then it gets more tricky - use of casting or introducer syntax in the client application may become necessary. It would normally make more sense to store the data in UTF-8 and apply a specific collation to that column.
Ask in firebird-tools about data pumping utilities you might experiment with.
./heLen