Subject Re: [firebird-support] Re: What are the trade-offs of CHAR vs. VARCHAR?
Author Mark Rotteveel
On 2019-11-13 17:37, blackfalconsoftware@... [firebird-support]
wrote:
> Your statement, which may be true, does not sit well with a previous
> statement that states that a VARCHAR field of 1000 characters is
> stored in the table with 1000 characters.
>
> Your statement is suggesting a null length (until the field is
> updated) with two bytes for an actual length, which is how VARCHAR
> fields work in all other databases to my knowledge. Otherwise, to
> follow the previous statement, a Firebird table could have a VARCHAR
> field for 1000 characters, be stored as such with the initial
> storage-info bytes holding the actual length, which would be 1000.
> When the field is updated to lets say 20 characters of data, the field
> would still have an actual size of 1000 characters but the
> storage-info would be 20.
>
> How does this make any sense?

A record on-disk is a compressed version of the in-memory image of the
data of a row. A VARCHAR(1000) in server memory has 2 bytes for the
actual data and 1000 bytes for the data, and 2 bytes padding to make it
a multiple of 4. Bytes beyond the actual data length are zeroed-out.
When written to disk, a RLE compression is applied, so a VARCHAR(1000)
containing 'A' will be something like (in hexadecimal):

03 01 00 41 80 00 80 00 80 00 80 00 80 00 80 00 80 00 97 00

Where
03 means: next 3 bytes literally
01 00 means: actual length 1
41 means: literally A
80 00 means: repeat 128 times 00 (x7)
97 00 means: repeat 105 times 00

These last two give a total of 1001 0x00 bytes (zero-padded value +
additional padding to make it a multiple of 4.

A CHAR(1000) with value A would be encoded as:
01 41 80 20 80 20 80 20 80 20 80 20 80 20 80 20 99 20

Where:
01 menas: next 1 bytes literally
41 means: literally A
80 20 means: repeat 128 times 20 (space) (x7)
99 20 means: repeat 103 times 20 (space)

These last two give a total of 999 0x20 bytes (space-padded value), no
additional padding because value is already a multiple of 4.

NOTE: I might have some details wrong, but this is how it basically
work. The Firebird Internal docoument, section data page also describes
this:
https://firebirdsql.org/file/documentation/reference_manuals/reference_material/html/fbint-page-5.html

Mark