Subject | Re: [firebird-support] Re: What are the trade-offs of CHAR vs. VARCHAR? |
---|---|
Author | |
Post date | 2019-11-14T17:01:14Z |
-----------------------------------------------------------------------
CHAR is a fixed-length data type. If the entered number of
characters is less than the declared length, trailing spaces will be added to the field.
Generally, the pad character does not have to be a space: it depends on the character
set, For example, the pad character for the OCTETS character set
is zero.
The full name of this data type is CHARACTER, but
there is no requirement to use full names and people rarely do so.
Fixed-length character data can be used to store codes whose length is standard
and has a definite “width” in directories. An example of such a code
is an EAN13 barcode—13 characters, all filled.
However, the PDF language manual I have for version 2.5 does in fact state the following...
"A fixed-length character data type. When its data is displayed, trailing spaces are added to the string up to the specified length. Trailing spaces are not stored in the database but are restored to match the defined
length when the column is displayed on the client side. Network traffic is reduced by not sending spaces over the LAN. If the number of characters is not specified, 1 is used by default."
But the above statement puts such descriptions at odds with each other based upon where one looks for such information.
VARCHAR DATA TYPE (VARCHAR(n), CHAR VARYING, CHARACTER VARYING)
-----------------------------------------------------------------------
VARCHAR is the basic string type for storing texts of variable length, up to a maximum of 32,765 bytes. The stored structure is equal to the actual size of the data plus 2 bytes where the length of the data is recorded.
All characters that are sent from the client application to the database are
considered meaningful, including the leading and trailing spaces. However, trailing
spaces are not stored: they will be restored upon retrieval, up to the recorded length
of the string.
The full name of this type is CHARACTER VARYING. Another
variant of the name is written as CHAR VARYING.
<<<
This is the information I have been going by in addition to the link that sent me to the IB-Experts web page of the same.
If the CHAR data type is not being stored with the additional padded characters but only done so in memory as Ann Harrison has stated than I have so far only found this information to be at odds with other information I have found regarding this data type.
Now what is being said is that internally, a CHAR data type no longer pads its field to the defined length at the point of creation. In this case then, the field would then always be expanded to the length required of updated data to the maximum defined at field definition time. VARCHAR data on the other hand will act in the same manner except to use its length-info bytes to base its updated storage field size on.
So when did this change? Is this a new difference between Firebird 2.5 and 3.x.x?
Here is a link to a detailed description as to how the major databases store CHAR and VARCHAR data internally. With the exception of SQL Server, which specifies that ANSI_PADDING has to be set to "ON" to store padded spaces with CHAR data types (which, to my knowledge, is how I mostly worked with this database engine), all of the listed database engines (with the exception of SQLite) say about the same thing for such storage as I have been stating here. For SQLite, this would be expected of since it is a not a strongly typed database engine.
If in fact there was a significant change to how the Firebird engine stores such CHAR than it would be mirroring with SQLite and not the other major database engines available. I have worked with MS SQL Server, Oracle, Sybase, MySQL, PostgreSQL (Ingress), and SQLite. However, I did not like using the latter engine and is why I have chosen Firebird for all my current development efforts.
Steve Naidamast
Sr. Software Engineer