Subject RE: [Firebird-Architect] UTF-8 Everywhere
Author IBO Support List
This misconception about length could stem from an old issue between IB
Objects and Firebird 1.x, where identifiers were erroneously truncated to 7
or 10 characters. The cause was that the XSQLVAR was mistakenly reporting a
length of 31 bytes when it should have reported 93 (31 characters at 3 bytes
each, not the 63 mentioned below). This issue was resolved a long time ago.
I'm pretty sure Dalton must have been bitten by it, since he is a very
long-time user of IB Objects.
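
To make the arithmetic concrete, here is a minimal sketch of how a 31-byte
reported length would cut an identifier short. The helper name is
hypothetical and not part of IB Objects or the Firebird API, and the 4 bytes
per character case is my assumption about where the figure of 7 comes from
(a connection charset such as UTF8).

```python
# Hypothetical illustration; not IB Objects or Firebird API code.
MAX_IDENTIFIER_CHARS = 31   # identifier limit discussed in this thread
FSS_BYTES_PER_CHAR = 3      # UNICODE_FSS, per the quoted message below


def chars_that_fit(reported_bytes: int, bytes_per_char: int) -> int:
    """Whole characters that fit in a buffer of the reported byte length."""
    return reported_bytes // bytes_per_char


# Byte length needed to hold a full-length identifier.
correct_length = MAX_IDENTIFIER_CHARS * FSS_BYTES_PER_CHAR

print(correct_length)                      # 93
print(chars_that_fit(31, 3))               # 10
print(chars_that_fit(31, 4))               # 7 (assumed 4-byte charset)
print(chars_that_fit(correct_length, 3))   # 31
```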

I tend to think that adopting a UTF8-only approach is a step backward:
anyone who wants UTF8 only can already use the UTF8 charset exclusively. For
everyone else still dealing with Windows wide strings, codepages, etc., it
simply imposes a potentially major rewrite of their applications to conform
to the new requirement. Legacy support is always a factor to consider.

If we want to talk about a step forward in flexibility, I suggest you
consider adding a universal string type in which each record indicates
which charset its value is stored in. This would allow any of the
registered charsets to be used on a per-record basis. The XSQLVAR
structure already includes charset and collation specification via the
SQLScale values, so it would be possible to have this information picked
up and made use of.
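
As a rough sketch of the idea only, a per-record tagged string could look
something like the following. The TaggedString class and the CODECS mapping
are hypothetical and exist purely for illustration; nothing here reflects an
actual Firebird storage format or API.

```python
from dataclasses import dataclass

# Illustrative mapping from Firebird charset names to Python codec names.
# This is an assumption for the sketch, not a complete or official table.
CODECS = {"UTF8": "utf-8", "WIN1252": "cp1252", "ISO8859_1": "latin-1"}


@dataclass(frozen=True)
class TaggedString:
    """A value that carries its own charset alongside the raw bytes."""
    charset: str
    raw: bytes

    @classmethod
    def encode(cls, text: str, charset: str) -> "TaggedString":
        return cls(charset, text.encode(CODECS[charset]))

    def text(self) -> str:
        # Decode using the charset stored with this particular record.
        return self.raw.decode(CODECS[self.charset])


# Two records in the same column, each stored in a different charset.
rows = [
    TaggedString.encode("café", "WIN1252"),
    TaggedString.encode("café", "UTF8"),
]
for row in rows:
    print(row.charset, len(row.raw), row.text())
```

A legacy codepage application and a UTF8 application could then share one
column, with the client reading the charset tag before decoding each value.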

Jason Wharton
www.ibobjects.com

-----Original Message-----
From: Firebird-Architect@yahoogroups.com
[mailto:Firebird-Architect@yahoogroups.com] On Behalf Of Mark Rotteveel
Sent: Friday, January 17, 2014 9:15 AM
To: Firebird-Architect@yahoogroups.com
Subject: Re: [Firebird-Architect] UTF-8 Everywhere

On Fri, 17 Jan 2014 10:33:01 -0500, Dalton Calford
<dalton.calford@...> wrote:
> I would agree in regards to UTF8, but I also think the length of SQL
> identifiers needs to be adjusted - ie 31 char identifiers only give 7 to 8
> characters for some character sets.

Although I agree with an extension of the length of identifiers, your
reason is incorrect: the 31 character limit gives you 31 characters in all
languages; internally system identifiers are UNICODE_FSS, which has 3 bytes
per character, so actual storage is 63 bytes for an identifier. I wouldn't
know how you would store only 7 to 8 characters unless your connection
character set is 'wrong'.

Mark

