firebird-java - Re: [Firebird-Java] GDSException

Subject	Re: [Firebird-Java] GDSException
Author	urswagner@bluemail.ch
Post date	2004-01-31T18:58:51Z

Hi Rick

I am using a ISO8859_1 database. Can I put this in url connection string?
Please tell me the correct specification.

Thanks for help.

Urs

>-- Original-Nachricht --
>To: <Firebird-Java@yahoogroups.com>
>From: "Rick Fincher" <rnf@...>
>Date: Fri, 30 Jan 2004 19:43:23 -0500
>Subject: Re: [Firebird-Java] GDSException
>Reply-To: Firebird-Java@yahoogroups.com
>
>
>Hi Urs,
>
>You need to open the connection to the database with ctype set to the same
>characterset that the database was created with.
>
>The following is a new entry in the faq that hasn't been released yet.
>
>Hope this helps,
>
>Rick
>
>46- Can you explain how character sets work?
>
>return to top
>
>Character handling with JayBird can be confusing because the Java VM,
>Firebird database, browser, and JayBird Connection all have a charset
>associated with them.
>
>Also, the Firebird server attempts to transliterate the internal charset
>of
>a database to the charset specified in the connection. JayBird also attempts
>to translate the JVM charset to the Firebird server charsets specfied in
>the
>connection.
>
>With Firebird the character encoding of the text data stored in the database
>is set when the database is created. That applies to the char and varchar
>columns, and type 1 blobs (text blobs). You can override the default charset
>for columns with the appropriate SQL commands when the columns are created.
>Be careful if you do this or you may end up with two columns in the same
>table that you can't read with the same connection.
>
>The only situation when this problem can happen is when you have a table
>with columns that have the "NONE" character set and some other character
>set
>("UNICODE_FSS", "WIN1252", etc). The server tries to convert characters

from

>the encoding specified for the column into the encoding specified for
>connection. The "NONE" character set allows only one-way conversion: from
><any> to "NONE". In this case server simply returns you bytes written in
>the
>database. So if you have table:
>
>CREATE TABLE charset_table(
>col1 VARCHAR(10) CHARACTER SET WIN1252,
>col2 VARCHAR(10) CHARACTER SET NONE
>)
>
>you will not be able to modify both columns in the same SQL statement, and
>it does not matter whether you use "NONE", "WIN1252" or " UNICODE_FSS" for
>the connection.
>
>The only possible way to solve this problem is to use character set
>"OCTETS". This is some kind of artificial character set, similar to "NONE"
>(data are written and read as byte arrays), however there exist
>bi-directional translation rules between any character set (incl."NONE")
>and
>"OCTETS". You can specify "OCTETS" for connection and then decode byte
>arrays you receive from the server yourself, the driver will do
>byte-array-to-string conversion incorrectly, since it does not get a hint
>about the character set from the server.
>
>Let's say your program or web app prompts the user to type a string. If

the

>user types "abc" into a Java text box in your app, or into a text box in
>a
>browser for a web app, Java creates a string for that 3 character string
>in
>Unicode. So the string would actually have 9 bytes in it- three Unicode
>characters of three bytes each.
>
>INSERTing this in a text column in the database would result in nine bytes
>being inserted without translation.
>
>Note: You have to pass correct unicode strings to the driver. What is
>"correct unicode" string? It is easier to explain what is not a correct
>unicode string.
>
>Let's assume you have normal text file in WIN1251 encoding. In this case
>cyrillic characters from the unicode table (values between 0-65535) are
>mapped into the characters with values 0-255. However, your regional
>settings say that you're in Germany. This means that file.encoding will

>set to Cp1252 on JVM start. If you now open the file and construct a reader
>without specifying that character encoding is Cp1251, Java will read the
>file and construct your strings. However all cyrillic characters will be
>replaced by characters from the Cp1252 encoding that have the same number
>representation as the cyrillic ones.
>
>These strings are valid unicode strings, however their content is not the
>content you read from the file. Interestingly enough, if you write such
>strings back into the file, and open it in some text editor saying that

this

>is WIN1251 text, you will see correct text.
>
>If you open your connection to the database without specifying a character
>set (lc_ctype) it defaults to NONE and no translation is done. So when you
>SELECT the previously inserted data from the database and display it in

your

>program you get the same string you entered, right? Well, not necessarily.
>
>You will get a string with the same nine bytes in it that were stored, but
>if the user getting that string from the database has a different default
>charset in his Java VM those bytes will display differently.
>
>The JVM usually picks up its locale dependent character encoding from the
>underlying operating system, but it can also be set when you invoke the

JVM

>by using -Dfile.encoding=Cp1252, for example. If you attempt to display
>characters that aren't in your default JVM encoding they apear as '?'.
>
>The only way to insure you always get back what you put in is to create

the

>database with a charset and set lc_ctype to the same charset when you open
>the connection to that database.
>
>If want to use charsets other than NONE and you have lots of data in
>databases with a charset of NONE, you may have to set up a new database

with

>a different charset and use a data pump to transfer data over, or write

>small program to do the transfer.
>
>Using UNICODE_FSS works well nearly everywhere but may increase the size
>of
>your databases if you have lots of text because Unicode uses characters

>to 3 bytes long.
>
>There are some limitations regarding UNICODE_FSS character set: there's

only

>one collation, where strings are sorted by the natural order, and not
>collation rules for different languages; there are some issues when
>converting them to upper case, etc. More information on these anomalies

can

>be found in the Firebird-Support group.
>
>Again, the default charset for Firebird is NONE. The Firebird server does
>no
>translation with this charset.
>
>If your database has a charset of NONE and you set a charset type on the
>connection (lc_ctype) that is not NONE, you can write to the database but
>you can't read from it without getting the "Cannot transliterate between
>character sets" exception.
>
>Let's follow a string as it gets inserted into the database and later
>selected from the database. For this example we will set the database
>charset to NONE.
>
>See the freely available Interbase 6 PDF manuals "Data Definition Guide"
>for
>a list of charsets available.
>
>The WIN125X or ISO8859_1 charsets may be a good choice for you if you need
>the non-English characters but want the compactness of the 1 byte
>characters. With these char sets you can specify many different national
>language collation orders in the ORDER BY clause of the SELECT statement.
>
>Let's look at the same example above, but this time we will insert into

>database that has been created with a charset of WIN1251.
>
>When you open the connection to the database you set the lc_ctype=WIN1251.
>Then insert the string 'abc' into the appropriate column. JayBird has to
>take the Unicode encoded Java String "abc"and convert it to WIN1251 format
>and send it to the database server for insertion. Since the database is
>already in WIN1251 format, the server does not have to translate. When the
>string is read back from the database it is converted back to the Java VM
>format by JayBird.
>
>It is also possible to set an lc_ctype in a connection that is different
>from the charset of the database. This lets the database server do the
>translating from one charset to another. This is a feature of the Firebird
>server that lets programming languages or programs that require specific
>character formats to connect to the database without requiring the data

>be stored in that format.
>
>You can also avoid problems by using java.sql.PreparedStatement instead

>java.sql.Statement and not building SQL strings out of concatenated Java
>strings. For example:
>
>String sqlString, firstName="John", lastName="O'Neal";
>
>sqlString = "INSERT INTO nameTable (LNAME, FNAME)
>VALUES('"+lastName+"','"+firstName+"')";
>
>Statement stmt = connection.createStatement();
>int insertedRows = stmt.executeUpdate(sqlString);
>
>The problem here is that if the user types in data for these strings you
>might end up with illegal characters, or the translation might not be
>correct.
>
>In the example above, the following illegal SQL string would be generated
>and cause an exception to be thrown because of the apostrophe in O'Neil:
>
>INSERT INTO nameTable (LNAME, FNAME) VALUES('O'Neal', 'John')
>
>To avoid this, use a prepared statement like in the example below.
>
>PreparedStatement stmt = connection.prepareStatement("INSERT INTO
>nameTable(LNAME,FNAME) VALUES(?, ?)");
>
>stmt.setString(1, lastName);
>stmt.setString(2, firstName);
>
>int insertedRows = stmt.executeUpdate();
>
>if (insertedRows != 1)
>throw new MyInsertFailedException("Could not insert data");
>
>
>----- Original Message -----
>
>>
>> Hello
>>
>> I get the following exception:
>>
>> org.firebirdsql.gds.GDSException: arithmetic exception, numeric overflow,
>> or string truncation
>>
>> Who can help me? The reason is I make a JDBC insert of 'hürdli'. The
>problem
>> is the ü.
>> What must I change?
>>
>> Thank You very much.
>>
>> Regards
>>
>> Urs Wagner
>
>
>
>
>
>Yahoo! Groups Links
>
>To visit your group on the web, go to:
> http://groups.yahoo.com/group/Firebird-Java/
>
>To unsubscribe from this group, send an email to:
> Firebird-Java-unsubscribe@yahoogroups.com
>
>Your use of Yahoo! Groups is subject to:
> http://docs.yahoo.com/info/terms/
>
>