Subject Re: [firebird-support] getByte('UTF-8').length
Author Mark Rotteveel
On 15-5-2014 17:02, Łukasz Bączek baczeklu@... [firebird-support] wrote:
> Hello
> I write in Java one thing and needs to check if the data word, eg
> "Warszawa" includes storage space encoded as UTF8

I am not sure what you mean by "data word".

> I started googling for this and found something like this
>
> String s = "Warsaw"
> s.getByte ('UTF-8'). length - returns me probably 7 bytes

Warsaw (which contains no non-ASCII characters) will be 6 bytes.
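
For reference, a minimal sketch of that check in Java. Note that the method is getBytes, not getByte. The class and helper names here are only illustrative, and StandardCharsets assumes Java 7 or later:

```java
import java.nio.charset.StandardCharsets;

class Utf8ByteLength {
    // Number of bytes the string occupies once encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // ASCII-only text: one byte per character.
        System.out.println(utf8Length("Warsaw")); // prints 6
        // "Lodz" written with Polish diacritics (as Unicode escapes):
        // three of the four letters need two bytes each in UTF-8.
        System.out.println(utf8Length("\u0141\u00f3d\u017a")); // prints 7
    }
}
```

So the byte count only differs from the character count when the text contains non-ASCII characters.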

> I have a field in a table (base Firebird)
> declared in this way
>
> "KOD" VARCHAR (6)

What is the character set of the field (or the default character set of
your database)?

> IBExpert multiplies the size of the VARCHAR * 4

Could you describe this in more detail? I don't use IBExpert, and I have
no idea what you mean by this.

> So I am also downloading this value multiplied by four what would have
> gave it (24 bytes), but my thinking here is probably incorrect and
> should not multiply this value by 4 because Firebird me screams that
> does not fit.
>
> and here a question for you which value should be taken into account
> because it multiplied like so it means that something kicked while
> getting the size in bytes of UTF8?

Again, I am not entirely certain what you mean by this. It sounds as if
you are using connection character set NONE, while the database field is
UTF8.

The storage size of a UTF8 field in Firebird is 4x the declared character
length, as that is the maximum needed (up to 4 bytes per character). Some
tools display the field size as (storage size) / (bytes per character),
but when the connection character set is NONE they don't always look at
the character set of the column itself and use (bytes per character) = 1
instead.
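
The arithmetic above can be sketched as follows. This is only an illustration: the class and method names are made up, and the fits check assumes the declared length of a UTF8 column is counted in characters (code points) rather than bytes:

```java
class Utf8ColumnSize {
    // Maximum number of bytes a single character can take in UTF-8.
    static final int MAX_BYTES_PER_UTF8_CHAR = 4;

    // Storage reserved for a UTF8 column declared as VARCHAR(declaredChars).
    static int storageBytes(int declaredChars) {
        return declaredChars * MAX_BYTES_PER_UTF8_CHAR;
    }

    // Length a tool would show when it divides by the given bytes per character.
    static int displayedLength(int storageBytes, int bytesPerChar) {
        return storageBytes / bytesPerChar;
    }

    // Assumption: the limit applies to the number of characters, not bytes.
    static boolean fits(String value, int declaredChars) {
        return value.codePointCount(0, value.length()) <= declaredChars;
    }

    public static void main(String[] args) {
        int storage = storageBytes(6);                    // VARCHAR(6) in UTF8
        System.out.println(storage);                      // prints 24
        System.out.println(displayedLength(storage, 4));  // prints 6
        System.out.println(displayedLength(storage, 1));  // prints 24
        System.out.println(fits("Warsaw", 6));            // prints true
        System.out.println(fits("Warszawa", 6));          // prints false
    }
}
```

In other words, a displayed size of 24 would still mean the column accepts at most 6 characters, so an 8-character value such as "Warszawa" would not fit.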

Mark
--
Mark Rotteveel