| Subject | Re: [firebird-support] Calcurate row size |
|---|---|
| Author | Ann W. Harrison |
| Post date | 2004-07-13T21:51:06Z |
At 01:13 AM 7/13/2004, Listeny wrote:
>Hi,
>
> Would someone tell me resources/documents of 'how to calculate number of
>bytes on each row' for FB1.5 ?

Believe it or not, that's a hard question. You can start with the easy part.
integer                               4 bytes
short                                 2 bytes
int64                                 8 bytes
numeric (1,x) to numeric (4,x)        2 bytes
numeric (5,x) to numeric (9,x)        4 bytes
numeric (10,x) to numeric (18,x)      8 bytes
numeric > (18,x)                      8 bytes
decimal (1,x) to decimal (4,x)        2 bytes
decimal (5,x) to decimal (9,x)        4 bytes
decimal (10,x) to decimal (18,x)      8 bytes
decimal > (18,x)                      8 bytes
float                                 4 bytes
double                                8 bytes
char(n)                               n bytes
varchar(n)                            n+2 bytes
blob                                  8 bytes
array                                 8 bytes
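As a quick sketch, that table translates into a lookup like this (the helper names are mine, not anything from Firebird):

```python
# Fixed per-field sizes in bytes, per the table above.
SIMPLE_SIZES = {
    "integer": 4, "short": 2, "int64": 8,
    "float": 4, "double": 8,
    "blob": 8, "array": 8,   # only the 8-byte id lives in the record
}

def numeric_size(precision):
    """numeric/decimal field size from its declared precision."""
    if precision <= 4:
        return 2
    if precision <= 9:
        return 4
    return 8   # precision 10..18, and anything declared above 18

def string_size(n, varying=False):
    """char(n) takes n bytes; varchar(n) adds a 2-byte length prefix."""
    return n + 2 if varying else n
```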
Then it gets harder. Every record has a 13-byte header and a 4-byte index
entry on its page. Fragmented records have an additional 4 bytes of fragment
header. For every field there's a bit that indicates null status, so a
record with n fields carries (n + 7)/8 bytes of null flags (integer
division, i.e. one bit per field rounded up to whole bytes). The page
itself has 28 bytes of overhead, and 13 bytes of free space are reserved
for each record stored on a page.
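Putting those constants together, a rough per-record estimate might look like this (a sketch with my own names, not a Firebird API):

```python
RECORD_HEADER = 13        # per-record header
FRAGMENT_HEADER = 4       # extra header, fragmented records only
INDEX_ENTRY = 4           # per-record index entry on the page
PAGE_OVERHEAD = 28        # fixed overhead per page
RESERVED_PER_RECORD = 13  # free space reserved per stored record

def null_flag_bytes(n_fields):
    # One null bit per field, rounded up to whole bytes.
    return (n_fields + 7) // 8

def record_cost(data_bytes, n_fields, fragmented=False):
    """Rough on-page cost of one record, before compression,
    alignment padding, and index overhead."""
    header = RECORD_HEADER + (FRAGMENT_HEADER if fragmented else 0)
    return (header + INDEX_ENTRY + RESERVED_PER_RECORD
            + null_flag_bytes(n_fields) + data_bytes)
```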
Then you ought to consider the actual blob/array sizes. Although a blob or
array field takes up only 8 bytes in the record, the data itself is stored
without compression, so however big it is, it takes that much space
somewhere, plus a certain amount of overhead. In the case of blobs and
arrays, the overhead is negligible.
Then it gets a bit harder still. Depending on the hardware and operating
system architectures, the record includes some number of pad bytes so that
data ends up aligned correctly on word, long, or quad boundaries.
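The rounding involved is the usual alignment arithmetic; which boundaries apply is platform-dependent, so this is just the generic form:

```python
def align_up(offset, boundary):
    # Round offset up to the next multiple of boundary (e.g. 2, 4, or 8).
    return (offset + boundary - 1) // boundary * boundary

# Example: a double following a field that ends at offset 7 gets one
# pad byte on a platform that aligns doubles to 8-byte boundaries.
assert align_up(7, 8) == 8
```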
Then it gets data-dependent. The entire record excluding the header (13 or
17 bytes, depending on fragmentation) is subject to run-length compression,
using a single byte to indicate run length. For a record that has no
duplicate bytes, you must add one byte for every 128 bytes of data
length. For a run of more than two adjacent bytes with the same value, add
two bytes and subtract the length of the run of identical values. The
compression does not recognize field boundaries, so four int64 values all
containing zero will compress from 32 bytes to 2.
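As a sketch of that arithmetic (the exact control-byte format is Firebird-internal; this only follows the accounting described above):

```python
def rle_size(data: bytes) -> int:
    """Estimate compressed size under the scheme described above:
    runs longer than two identical bytes cost 2 bytes (count + value),
    literal stretches cost their length plus 1 control byte per 128 bytes.
    """
    size, i = 0, 0
    while i < len(data):
        j = i
        while j < len(data) and data[j] == data[i]:
            j += 1
        if j - i > 2:
            # A compressible run; very long runs need one (count, value)
            # pair per 128 bytes, since the count is a single byte.
            size += 2 * ((j - i + 127) // 128)
            i = j
        else:
            # Literal stretch: advance until the next compressible run.
            start = i
            while i < len(data):
                j = i
                while j < len(data) and data[j] == data[i]:
                    j += 1
                if j - i > 2:
                    break
                i = j
            lit = i - start
            size += lit + (lit + 127) // 128
    return size

# Four int64 zeros are 32 identical bytes: one count byte plus one value.
assert rle_size(bytes(32)) == 2
```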
Then you might want to consider the index overhead... or maybe not.
Cheers,
Ann
P.S. A dbkey is eight bytes, of which the first 4 currently contain the
relation id and the second 4 express the record number within the
database. In the relatively near future, the portion of the dbkey allotted
to the record number will increase to allow more records per table.
At the moment, a database is limited to 2**32 pages - probably actually
2**31 because of coding errors. A blob or array id concatenates four bytes
of page number and four bytes of offset on page.
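For illustration, splitting those 8-byte ids into their two 4-byte halves could look like this (the byte order shown is an assumption; the real layout is internal to the engine):

```python
import struct

def split_dbkey(dbkey: bytes):
    """An 8-byte dbkey: 4 bytes of relation id, then 4 bytes of
    record number (big-endian assumed here for illustration)."""
    relation_id, record_number = struct.unpack(">II", dbkey)
    return relation_id, record_number

def split_blob_id(blob_id: bytes):
    """An 8-byte blob/array id: 4 bytes of page number, then 4 bytes
    of offset on that page (byte order again assumed)."""
    page_number, offset = struct.unpack(">II", blob_id)
    return page_number, offset
```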