Subject Re: [firebird-support] External tables Special characters query
Author Helen Borrie
First, please don't post to this list by
forwarding from your email client or news
client. We can do without all that garbage in
the list mail, thanks, and it messes up threading
for people who want to archive the list mail in an orderly fashion.

The other thing is that you should disable your
company adverts and disclaimers when posting to our lists.

At 03:07 PM 14/02/2006, Mahendra Nepali wrote:

>I am trying to load data into Firebird 1.5
>server via external files. One of columns
>contain text like iBoxx £ non-gilt ex sov/
>sub-sov . This gets inserted into my text file
>via an application. But when I try to retrieve
>rows from Firebird DB I get the following
>instead iBoxx £ non-gilt ex sov/ sub-so. I am
>totally lost why this occurs and due to which
>the application crashes. The application is
>using external files since I have to load huge
>volume of data into the DB. Please help.

You probably don't have any character set
specified for the database; or you do and the
character set of the incoming data is different.

The default charset for a database, if not
specified in the CREATE DATABASE statement, is
NONE. That means the engine will store your data
exactly as it is. Charset NONE can only match
your expectations for ascii characters with a
decimal value below 128. When you retrieve it,
it spits out exactly what is there. The fact
that you thought your input data was £, when
actually it was two bytes, one of which was £,
indicates that not all of your incoming data is
low-range ascii; it's possibly variable-length Unicode, perhaps UTF-8.

If you want the stored text to look the same on
retrieval as it looked when you examined it in a
text editor, you will need to do as follows:

1. Establish the character set of the original text data.
2. Define the destination column in the database
to have the same (or a compatible) character set.
3. If you decide that you need to reconstruct
your database using this particular charset as
the default for the database then you can (and
should) set the lc_ctype of the client
connections to the same. (How you do that
depends on your programming interface.)
4. If you have to handle this data as a special
case, you'll need some mechanism on the client
side to interpret the strings according to this
special case, e.g. by using a CAST expression to
convert the bytes to a varchar of the same
character length as the stored data and
specifying the character set you want for the retrieval.

For a varchar(40) named DESCRIPTION, that needs
to be output as UNICODE_FSS, for example,
CAST(DESCRIPTION AS VARCHAR(40) CHARACTER SET UNICODE_FSS) AS DESCR_U

4 is the hard one though. We don't have enough
information here to recommend something particularly useful.

./heLen