Subject | Re: [firebird-support] Re: Charset of OCTETS in PK causing problems? |
---|---|
Author | Ann W. Harrison |
Post date | 2004-06-16T17:38:35Z |
Let me try to shed a little light on this problem.
A character set is a mapping between binary values and
the glyphs that we normally read.
A collation is a description of the normal sorting order
of a character set. A single character set can have
several collations.
OCTETS is not really a character set. It's a data type
intended to hold arrays of bytes. In other character
sets, blanks have specific meaning. In OCTETS, blanks
are the byte value 32. If you store "ABC123" in a
char (13) octet field, you'll get an array containing
the binary code for 'A', 'B', 'C', '1', '2', '3' plus
seven bytes containing 32. Probably not what you want.
CHAR is a fixed length string. OCTETS are typically
stored in CHAR fields because byte arrays (e.g. GUID)
are typically fixed length. CHAR is also used for
values in other character sets that are inherently
fixed length (e.g. a US social security number).
VARCHAR is a varying length string, stored as a two
byte integer holding the length and the string. If
your data is inherently varying (e.g. names), use
VARCHAR and avoid spending your life stripping off
trailing blanks.
In creating an index key, Firebird strips off trailing
blanks from char and varchar fields in normal character
sets. For varchar, it drops the length word. The
keys generated by char and varchar are identical if
they have the same contents.
OCTETS are no lighter than any other character type.
The performance and storage characteristics of
chars and varchars are virtually identical. Use
the type that fits your application semantics.
Regards,
Ann
A character set is a mapping between binary values and
the glyphs that we normally read.
A collation is a description of the normal sorting order
of a character set. A single character set can have
several collations.
OCTETS is not really a character set. It's a data type
intended to hold arrays of bytes. In other character
sets, blanks have specific meaning. In OCTETS, blanks
are the byte value 32. If you store "ABC123" in a
char (13) octet field, you'll get an array containing
the binary code for 'A', 'B', 'C', '1', '2', '3' plus
seven bytes containing 32. Probably not what you want.
CHAR is a fixed length string. OCTETS are typically
stored in CHAR fields because byte arrays (e.g. GUID)
are typically fixed length. CHAR is also used for
values in other character sets that are inherently
fixed length (e.g. a US social security number).
VARCHAR is a varying length string, stored as a two
byte integer holding the length and the string. If
your data is inherently varying (e.g. names), use
VARCHAR and avoid spending your life stripping off
trailing blanks.
In creating an index key, Firebird strips off trailing
blanks from char and varchar fields in normal character
sets. For varchar, it drops the length word. The
keys generated by char and varchar are identical if
they have the same contents.
OCTETS are no lighter than any other character type.
The performance and storage characteristics of
chars and varchars are virtually identical. Use
the type that fits your application semantics.
Regards,
Ann