Subject Re: [firebird-support] How to convert big FIREBIRD DB to one charset (win1251) to UTF8 ?
Author Ann Harrison
On Tue, Jun 3, 2014 at 3:20 AM, marcus marcus@... [firebird-support] <firebird-support@yahoogroups.com> wrote:
loris.luise@... [firebird-support] wrote:
>
> I have a large Firebird 2.5 DB (a couple of tables have few millions
> records) to convert from WIN1251 charset to UTF8 charset.
>
> Is there a way to do efficiently this kind of conversion? (preferably
> not using a intermediate text sql dump, that would not applicable
> because the size of the DB).


maybe this one is of help:
http://www.ibphoenix.com/resources/documents/design/doc_150


Err, probably not.  Unfortunately, if my memory serves, changing the character set of a column has one of two results.  Either it's an error, or it just changes the declared type of the column without changing the data in the column, leading to massive confusion.  

I've been thinking about this problem for a while.  There are some solutions that might work on a very simple database without a lot of dependencies.  An easy one is for each column XXX, create a new column XXX_U  
which is defined as UTF8, update the table setting each XXX_U to equal XXX.  Then drop the XXX column and recreate it as UTF8 and move the data back.  Of course, if you have dependencies (and who doesn't), dropping the original columns - or even deleting their contents - will be a mess.  

Something similar which can be done a table at a time rather than a column at a time is to create an external table that corresponds to each table with columns that need conversions, except that the text columns are declared as UTF8.  Dump the internal tables to external tables.  Then create a new database without indexes or any of the fancy stuff (views, constraints, triggers...) but all the internal tables (UTF8) and their corresponding external tables and copy the data back.  Then, carefully, reintroduce all the complexity.

You might lobby for a proper DDL type conversion clause.  That won't happen quickly because, in general, it doesn't work.  So lobby for a DDL clause that converts anything to UTF8.  

Good luck,

Ann