Subject Re: [firebird-support] Unicode size
Author Helen Borrie
At 03:49 PM 13/12/2004 -0500, you wrote:

>I have a DB that is unicode except a few fields like email, password, etc
>that I have deemed can/should be ascii.
>
>I had an email field varchar 100 that was unicode. I tried to index it, but
>it said index was too big. So I reduced it to 80. Same.

80 chars would be OK for an index on a UNICODE_FSS column as long as it was
a single-column index. Multi-segment indexes eat up extra bytes. The max.
size of an index, taking in all segments, can't exceed 252 bytes.

Think hard, too, about how beneficial any 252-byte column might be as a
candidate for an index...

>So then I tried ot
>convert it to ASCII, 100. It tehn said "Field must be at least 300 chars".

You can't reduce the size of a char or varchar column if it already
contains data that would "overflow" the requested new size. You can't
"convert" the character set of data by any direct method.

What you can do is the following:

Alter table yrtable add tempcol varchar(100) character set US_ASCII;
commit;
update yrtable set tempcol = cast(EmailCol as varchar(100) character set
US_ASCII)
where EmailCol is not null;
commit;
alter table yrtable drop EmailCol;
commit;
alter table yrtable add EmailCol varchar(100) character set US_ASCII;
commit;
update yrtable set EmailCol = TempCol;
commit;
alter table yrtable drop TempCol;
commit;


>Does FB use 3 chars for each Unicode char?

Yes, always.

>What kind of encoding is that? I
>has assumed it was using UTF 16 which would be 2 bytes, but 3?

Nope. It's UNICODE_FSS, which mangles the characters into some
uncompressed manifestation of UTF-8, see http://en.wikipedia.org/wiki/UTF-8

Search the archives of this list for postings by Peter Jacobi on the topic
of UNICODE_FSS and the way the engine currently handles it. It's enough to
put anyone off using UNICODE_FSS if there is some alternative. Adriano dos
Santos Fernandes is currently reworking the whole thing for implementation
in a future Firebird version (but not Fb 2.0, at current indications...too
many ramifications to go under the QA microscope at short notice...)

./heLen