Subject | RE: [firebird-support] Data type for small binary data |
---|---|
Author | Dunbar, Norman |
Post date | 2010-04-30T13:26:29Z |
Hi Steffen,
meaningful to me to use BINARY.
I've never heard of VARCHAR losing spaces, leading or trailing.
The BINARY/OCTETS stores data in such a way as to render spaces as 0x00
rather than 0x20 - so that might prevent the 'loss'. FB Book supplement,
page 58 says this: "Two special character sets, NONE and OCTETS, can be
used in declarations. However, OCTETS cannot be used as a connection
character set. The two sets are similar, except that the space character
of NONE is ASCII 0x20, whereas the space character OCTETS is 0x00. NONE
and OCTETS follow different rules to those that other charsets do
regarding conversions."
However, I created a table with two columns, one BINARY and one OCTETS
and put '...123...' into each (a dot represents a space). In isql, they
come back as hex values:
SQL> create table norm (
CON> a varchar(10) character set octets,
CON> b varchar(10) character set binary
CON> );
SQL> commit;
SQL> insert into norm values (' 123 ',' 123 ');
SQL> commit;
SQL> select * from norm;
A B
==================== ====================
202020313233202020 202020313233202020
SQL> commit;
The FB Book supplement says, on page 206: "One trick to note is that
CHAR and VARCHAR types defined in character set OCTETS
(alias BINARY) now display in hexadecimal format. Currently, this
feature cannot be toggled off."
Looking at the above table using IBExpert, shows the data to (still)
have the correct number of leading and trailing spaces.
Looking at how the data are actually stored internally:
SQL> select rdb$relation_id
CON> from rdb$relations
CON> where rdb$relation_name = 'NORM';
RDB$RELATION_ID
===============
132
SQL> select * from rdb$pages
CON> where rdb$relation_id = 132
CON> and rdb$page_type = 4;
RDB$PAGE_NUMBER RDB$RELATION_ID RDB$PAGE_SEQUENCE RDB$PAGE_TYPE
=============== =============== ================= =============
167 132 0 4
SQL> shell;
tux> ./fbdump ../blank.fdb -p 167
DATABASE PAGE DETAILS
=====================
...
Page[0000]: 169
tux> ./fbdump ../blank.fdb -p 169
DATABASE PAGE DETAILS
=====================
...
Data[0000].hex: 01 fc fd 00 02 09 00 fd 20 03 31 32 33 fd 20
03
00 09 00 fd 20 03 31 32 33 fd 20 01 00
Data[0000].ASCII: . . . . . . . . . 1 2 3 .
.
. . . . . 1 2 3 . . .
...
The above is in internal compressed format, as Firebird stores data in
this way. The following bytes represent the column data, when
uncompressed again:
fc 00 00 00 = record header.
09 00 = data length = 9.
20 20 20 31 32 33 20 20 20 00 = three spaces, 123, three spaces,
0x00 padding to varchar(10).
09 00 = data length = 9.
20 20 20 31 32 33 20 20 20 00 = three spaces, 123, three spaces,
0x00 padding to varchar(10).
So it appears as if the spaces are NOT being converted to 0x00 in an
OCTET or BINARY character set.
SQL> show table norm;
A VARCHAR(10) CHARACTER SET OCTETS
Nullable
B VARCHAR(10) CHARACTER SET OCTETS
Nullable
SQL> commit;
You can also see that even though I said BINARY, it does actually create
an OCTETS character set for the column data.
So, I suspect that your binary data is safe enough.
Cheers,
Norman.
Information in this message may be confidential and may be legally privileged. If you have received this message by mistake, please notify the sender immediately, delete it and do not copy it to anyone else. We have checked this email and its attachments for viruses. But you should still check any attachment before opening it. We may have to make this message and any reply to it public if asked to under the Freedom of Information Act, Data Protection Act or for litigation. Email messages and attachments sent to or from any Environment Agency address may also be accessed by someone other than the sender or recipient, for business purposes. If we have sent you information and you wish to use it please read our terms and conditions which you can get by calling us on 08708 506 506. Find out more about the Environment Agency at www.environment-agency.gov.uk
Information in this message may be confidential and may be legally privileged. If you have received this message by mistake, please notify the sender immediately, delete it and do not copy it to anyone else.
We have checked this email and its attachments for viruses. But you should still check any attachment before opening it.
We may have to make this message and any reply to it public if asked to under the Freedom of Information Act, Data Protection Act or for litigation. Email messages and attachments sent to or from any Environment Agency address may also be accessed by someone other than the sender or recipient, for business purposes.
If we have sent you information and you wish to use it please read our terms and conditions which you can get by calling us on 08708 506 506. Find out more about the Environment Agency at www.environment-agency.gov.uk
>> > VARCHAR(...) CHARACTER SET OCTETSyou can also use BINARY for OCTETS if you prefer - it seems more
meaningful to me to use BINARY.
>> Yet I remembered that varchars would loose leading anddoesn't it?
>> trailing spaces? That would mean that any (byte)32 could get lost,
I've never heard of VARCHAR losing spaces, leading or trailing.
The BINARY/OCTETS stores data in such a way as to render spaces as 0x00
rather than 0x20 - so that might prevent the 'loss'. FB Book supplement,
page 58 says this: "Two special character sets, NONE and OCTETS, can be
used in declarations. However, OCTETS cannot be used as a connection
character set. The two sets are similar, except that the space character
of NONE is ASCII 0x20, whereas the space character OCTETS is 0x00. NONE
and OCTETS follow different rules to those that other charsets do
regarding conversions."
However, I created a table with two columns, one BINARY and one OCTETS
and put '...123...' into each (a dot represents a space). In isql, they
come back as hex values:
SQL> create table norm (
CON> a varchar(10) character set octets,
CON> b varchar(10) character set binary
CON> );
SQL> commit;
SQL> insert into norm values (' 123 ',' 123 ');
SQL> commit;
SQL> select * from norm;
A B
==================== ====================
202020313233202020 202020313233202020
SQL> commit;
The FB Book supplement says, on page 206: "One trick to note is that
CHAR and VARCHAR types defined in character set OCTETS
(alias BINARY) now display in hexadecimal format. Currently, this
feature cannot be toggled off."
Looking at the above table using IBExpert, shows the data to (still)
have the correct number of leading and trailing spaces.
Looking at how the data are actually stored internally:
SQL> select rdb$relation_id
CON> from rdb$relations
CON> where rdb$relation_name = 'NORM';
RDB$RELATION_ID
===============
132
SQL> select * from rdb$pages
CON> where rdb$relation_id = 132
CON> and rdb$page_type = 4;
RDB$PAGE_NUMBER RDB$RELATION_ID RDB$PAGE_SEQUENCE RDB$PAGE_TYPE
=============== =============== ================= =============
167 132 0 4
SQL> shell;
tux> ./fbdump ../blank.fdb -p 167
DATABASE PAGE DETAILS
=====================
...
Page[0000]: 169
tux> ./fbdump ../blank.fdb -p 169
DATABASE PAGE DETAILS
=====================
...
Data[0000].hex: 01 fc fd 00 02 09 00 fd 20 03 31 32 33 fd 20
03
00 09 00 fd 20 03 31 32 33 fd 20 01 00
Data[0000].ASCII: . . . . . . . . . 1 2 3 .
.
. . . . . 1 2 3 . . .
...
The above is in internal compressed format, as Firebird stores data in
this way. The following bytes represent the column data, when
uncompressed again:
fc 00 00 00 = record header.
09 00 = data length = 9.
20 20 20 31 32 33 20 20 20 00 = three spaces, 123, three spaces,
0x00 padding to varchar(10).
09 00 = data length = 9.
20 20 20 31 32 33 20 20 20 00 = three spaces, 123, three spaces,
0x00 padding to varchar(10).
So it appears as if the spaces are NOT being converted to 0x00 in an
OCTET or BINARY character set.
SQL> show table norm;
A VARCHAR(10) CHARACTER SET OCTETS
Nullable
B VARCHAR(10) CHARACTER SET OCTETS
Nullable
SQL> commit;
You can also see that even though I said BINARY, it does actually create
an OCTETS character set for the column data.
So, I suspect that your binary data is safe enough.
Cheers,
Norman.
Information in this message may be confidential and may be legally privileged. If you have received this message by mistake, please notify the sender immediately, delete it and do not copy it to anyone else. We have checked this email and its attachments for viruses. But you should still check any attachment before opening it. We may have to make this message and any reply to it public if asked to under the Freedom of Information Act, Data Protection Act or for litigation. Email messages and attachments sent to or from any Environment Agency address may also be accessed by someone other than the sender or recipient, for business purposes. If we have sent you information and you wish to use it please read our terms and conditions which you can get by calling us on 08708 506 506. Find out more about the Environment Agency at www.environment-agency.gov.uk
Information in this message may be confidential and may be legally privileged. If you have received this message by mistake, please notify the sender immediately, delete it and do not copy it to anyone else.
We have checked this email and its attachments for viruses. But you should still check any attachment before opening it.
We may have to make this message and any reply to it public if asked to under the Freedom of Information Act, Data Protection Act or for litigation. Email messages and attachments sent to or from any Environment Agency address may also be accessed by someone other than the sender or recipient, for business purposes.
If we have sent you information and you wish to use it please read our terms and conditions which you can get by calling us on 08708 506 506. Find out more about the Environment Agency at www.environment-agency.gov.uk