Subject Re: [firebird-support] Character sets. (Again.)
Author Mark Rotteveel
On 30-3-2015 13:01, Tim Ward tdw@... [firebird-support] wrote:
> Sorry about this, but I really can't work out what's going on here, and
> could do with some clues.
>
> The basic problem is that I'm getting string values of columns into PHP
> padded on the right to four times their correct length.
>
> I've tried various combinations of things and got various weird results.
> Here's one example:
>
> (1) I believe the database default character set is UTF8 (although I
> don't know in detail what that means or what it's used for)
>
> SQL> select rdb$character_set_name from rdb$database;
> RDB$CHARACTER_SET_NAME
> UTF8
>
> (2) I've got a column in a table defined as ASCII (not that that makes
> an awful lot of difference, I get very similar results if it's UTF8):
>
> METERID CHAR( 8) CHARACTER SET ASCII
> COLLATE ASCII,
>
> (3) I connect ISQL without a -ch command line parameter and get the
> field arriving in ISQL as eight characters long:
>
> SQL> select meterid from tblmeterchange;
>
> METERID
> ========
> 0001A6BN
> 0001A6BN
>
> (4) I connect ISQL with -ch UTF8 and get the field arriving in ISQL as
> 32 characters long:
>
> SQL> select meterid from tblmeterchange;
>
> METERID
> ================================
> 0001A6BN
> 0001A6BN
>
> (5) I connect via PHP using ibase_connect with "UTF8" as the character
> set parameter and get this eight character field returned as the eight
> wanted characters padded on the right with 24 spaces.
>
> (6) If I change the definition of the METERID column to UTF8 I similarly
> get 32 characters in PHP.
>
> So what should I be doing to get an eight character value of an eight
> character field turning up as eight characters in PHP? (I don't think I
> have the option of setting the connection charset to anything other than
> UTF8 because I don't know what else it might affect and there are plenty
> of other columns in the database which, unlike this one, can't be
> declared as ASCII ... but then I don't know exactly what the concept of
> "connection character set" means or does anyway.)
>
> I have read
> http://www.firebirdsql.org/file/community/ppts/fbcon11/FbCon2011-Charsets-Heymann.pdf,
> but that doesn't help - it suggests that Firebird will transliterate as
> necessary, but padding an eight character string with 24 spaces to make
> a 32 character string isn't any sort of "transliteration" I'm familiar with.
>

The problem is mainly that when the connection character set is UTF8,
and the field character set is anything other than NONE or OCTETS, the
server will return the field as UTF8. The length of a field in UTF8 is 4
* the declared length (which is the maximum number of bytes required).

It is up to the client (isql, or a driver like the ibase PHP driver,
Jaybird, etc), to correctly interpret a field of length 32 bytes in utf8
to a string of 8 characters; unfortunately not all clients/drivers do
this, so they will act as if it is string of 32 characters.

Mark
--
Mark Rotteveel