Subject | Re: Converting of an ISO8859_1 database to UTF8 |
---|---|
Author | patrick_marten |
Post date | 2012-11-05T08:01:42Z |
--- In firebird-support@yahoogroups.com, "Pierre Y." <pierre.y@...> wrote:
thank you, that's an interesting alternative!
Although I would need to install the Unified Interbase components for this.
I've tried to just execute the compiled exe, but somehow it doesn't work. Isn't FBClone made for upgrades from FB 1.5 to 2.x only?
>Hi Pierre,
> Not sure but FBClone (via
> http://code.google.com/p/fbclone/wiki/ConvertBadEncodedFirebirdDatabases)
> could help.
>
> I embedded its code into my application to provide automagic ISO8859_1 to
> UTF8 conversion for my application (including BLOB SUBTYPE 1 containg XML
> data)
thank you, that's an interesting alternative!
Although I would need to install the Unified Interbase components for this.
I've tried to just execute the compiled exe, but somehow it doesn't work. Isn't FBClone made for upgrades from FB 1.5 to 2.x only?
> On Thu, Nov 1, 2012 at 8:43 AM, patrick_marten <patrick_marten@...>wrote:
>
> > **
> >
> >
> > Hello,
> >
> > I'm working on a new release of my application, in which some bigger
> > changes are supposed to happen. I need some assistance please :)
> >
> > Sorry, it's a bit long, but it's about a sensitive matter, so I wanted to
> > provide as many details as possible.
> >
> > 1. The current version
> > -----------------------
> > 1.1 It's using Firebird 2.1.3.18185 to access a database with ISO8859_1
> > character set.
> > 1.2 Several tables of the database have fields declared as follows:
> > FIELDX BLOB SUB_TYPE 1 SEGMENT SIZE 80 CHARACTER SET ISO8859_5,
> > FIELDY VARCHAR( 100) CHARACTER SET ISO8859_5,
> > Those fields weren't used so far, because IBO didn't support unicode when
> > this version was created, but they do play a role below, so I'm mentioning
> > them.
> > 1.3 Several tables contain BLOB SUB_TYPE 1 fields, which store RTF-text in
> > them
> >
> > 2. The new version
> > -------------------
> > 2.1 Although it's not neccessary, I was planning to use the latest
> > Firebird version - currently 2.5.1.26351
> > 2.2 Since IBO supports unicode now, my application is supposed to support
> > it as well, so that the "new" database will have UTF8 as default character
> > set
> > 2.3 Fields mentioned in 1.2 will no longer have to have a special
> > character set - everything UTF8
> > 2.4 Fields mentioned in 1.3 have to become BLOB SUB_TYPE 0 fields, as the
> > content will be stored in a different way in the new version. The content
> > must not get lost of course
> > 2.5 A lot of structure changes will come with the new version too (new
> > tables, fields etc.)
> >
> > While all the changes are not a problem for new customers, they pretty
> > much are a big issue for existing customers, as I will have to "convert"
> > their existing databases, once they install the new version. So the
> > requirements from point 2 have to happen somehow on their machines.
> >
> > 3. My ideas / concerns so far
> > ------------------------------
> > 3.1 Using the script "convert2utf8.sql" from IBExpert site (
> > http://www.ibexpert.com/download/character_set_conversion/)
> > I've tried it, but am not very happy with the result:
> > 3.1.1 Problem #1 is: the fields mentioned in 1.2 remain as they are, i.e.
> > ISO8859_5 in that case. The script probably has to be executed several
> > times - each time with a different source_characterset, i.e.
> > source_characterset=ISO8859_1, source_characterset=ISO8859_5, ...
> > 3.1.2 Problem #2 is: although all other columns seem to have UTF8 as
> > character set, the database itself still seems to have ISO8859_1 as
> > character set (showed in IBExpert)
> > 3.1.3 Problem #3 is: after my attempt to convert an existing database it
> > seems to miss 5 tables, 6 views, 58 indicies and to have 3 triggers more
> > than in the source database...
> > 3.1.4 Structure changes (2.4 and 2.5) would have to happen additionally
> > afterwards
> > All in one I'm not very convinced by this option...
> >
> > 3.2 Using a component like TIB_DataPump
> > I don't have any experience with it and don't know, if that's an option to
> > accomplish what I need. The thoughts so far:
> > 3.2.1 I probably would have to make an UTF8 database first, which has the
> > same structure as the source database - just different character sets
> > 3.2.2 One would be accessed by a connection with ISO8859_1 as character
> > set, one by a connection with UTF8 as character set
> > 3.2.3 However the component would pump the data, it would get converted to
> > UTF8 that way (I hope)
> > 3.2.4 Structure changes (2.4 and 2.5) would have to happen additionally
> > afterwards
> >
> > 3.3 Doing everything manually
> > Means probably the most work, but appears to be the safest way at the
> > moment.
> > 3.3.1 I could start with the final UTF8 database, i.e. with all the
> > structure changes etc. included
> > 3.3.2 The source database would be accessed by a connection with ISO8859_1
> > as character set, the UTF8 one by a connection with UTF8 as character set
> > 3.3.3 Going through all the tables and "pump" the data manually into the
> > new database, whereby the conversion to UTF8 would happen automatically (I
> > hope). Good thing would be, that I could do the steps like 2.4 at the same
> > time
> >
> > 3.4 Some other / better way(s) to accomplish what I need, you could think
> > of
> >
> > The new version will be installed into a new folder, so that there will be
> > files of two different firebird versions on the machine, until the old
> > version gets uninstalled. So I also would need to know when and how the
> > jump to Firebird 2.5.1 should be done, especially considering the note I
> > found in one of the PDF files: "IMPORTANT: Before upgrading please pay
> > attention to the files in the /misc/upgrade/metadata directory of your
> > Firebird 2.1 installation as an essential step for preparing your databases
> > for use under Firebird 2.1. If your databases contain metadata fields (like
> > object names, comments/descriptions, computed field definitions, PSQL, view
> > or constraint sources) storing non-ASCII data then the backup/restore
> > upgrade cycle may not be enough to make them usable." This one sounds as if
> > it's not relevant for me, as my databases are already used under Firebird
> > 2.1, but I also believe having read, that about the same applies to blob
> > fields and a change from Firebird 2.1 to Firebird 2.5.1. Maybe I mix up
> > things here though...
> >
> > Regarding 3.1.4 and 3.2.4: I've found this example on how to change the
> > blob type:
> > UPDATE RDB$FIELDS SET RDB$FIELD_SUB_TYPE = 0
> > WHERE RDB$FIELD_NAME=(SELECT RDB$FIELD_SOURCE FROM RDB$RELATION_FIELDS
> > WHERE RDB$FIELD_NAME='SOME_TEXT_BLOB_FIELD' AND
> > RDB$RELATION_NAME='SOME_TABLE');
> > Seems to work without the content getting lost. Sure, it's in the wrong
> > format that way, but it seems to be possible to obtaint the content and
> > then to save it in the new format. But depending on the choice of the way
> > to go, this could be irrelevant.
> >
> > All in one I would like to ask you for some assistance and an as detailed
> > instruction as possible on how to proceed to accomplish the requirements
> > from point 2, as this is a sensitive step and must not make any mistakes :)
> >
> > Many thanks and kind regards,
> > Patrick
> >
> >
> >
>
>
> [Non-text portions of this message have been removed]
>