Subject | Re: [Firebird-Architect] blobs causes table fragmentation. |
---|---|
Author | Jim Starkey |
Post date | 2004-10-05T14:09:35Z |
Let me take a few minutes and sketch some of the problems the various
proposals face.
The current implementation represents a blob in a record with the 32-bit
record number of the blob. On access, the record number is decomposed
into a pointer page sequence number, a pointer page slot (which gives the
data page number), and a record index on the data page. If blobs are moved
out of record number space, then either the size of the blob pointer
must be increased or we get stuck with an architectural limitation
of one blob per page. It may have to be increased to accommodate 40-bit
record numbers, but I don't know whether anyone has gone there yet.
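For concreteness, here is a minimal sketch of that decomposition in C++. The geometry constants are assumptions for illustration only; the real engine derives records-per-page and data-pages-per-pointer-page from the page size:

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical per-database geometry (illustrative values, not the
// engine's; the real numbers are derived from the page size).
constexpr uint32_t maxRecordsPerPage = 100;
constexpr uint32_t dataPagesPerPointerPage = 1000;

struct DecodedRecordNumber {
    uint32_t ppSequence; // which pointer page in the relation's chain
    uint32_t ppSlot;     // slot on that pointer page -> data page number
    uint32_t line;       // record index within the data page
};

// Decompose a 32-bit record number into its three coordinates.
DecodedRecordNumber decompose(uint32_t recordNumber) {
    const uint32_t recordsPerPointerPage =
        maxRecordsPerPage * dataPagesPerPointerPage;
    DecodedRecordNumber d;
    d.ppSequence = recordNumber / recordsPerPointerPage;
    d.ppSlot = (recordNumber % recordsPerPointerPage) / maxRecordsPerPage;
    d.line = recordNumber % maxRecordsPerPage;
    return d;
}

int main() {
    DecodedRecordNumber d = decompose(1234567);
    std::printf("pp=%u slot=%u line=%u\n", d.ppSequence, d.ppSlot, d.line);
    return 0;
}
```

The point to notice is that every addressable object, blob or record, consumes one of these dense coordinates, which is why pulling blobs out of record number space forces the pointer-size question.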
Personally, I think the idea behind 40-bit record numbers is deeply
flawed -- the problem isn't the size of the record number but the
fact that the numbering isn't anywhere near dense. I would rather have
seen either a change in the record number calculation or an overhaul of
the data storage mechanism, but that's a different issue.
An architectural limit of a single blob per page would be a performance
and operational disaster, wasting a cache slot and a 4096-byte data
page to hold a one-byte blob. It is so unthinkably bad as to preclude
serious discussion.
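To put numbers on it: a million one-byte blobs would claim a million 4096-byte pages -- roughly 4 GB of disk and cache traffic to hold one megabyte of actual data, a 4096-to-1 overhead.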
Sometimes blobs are used for big things, and sometimes blobs are used
when the worst-case size of a character field isn't known. Street
addresses, for example, are often stored as blobs for no reason other
than there really isn't a maximum length. For these types of blobs,
either intermingling with records or storage in a separate colinear data
space makes sense.
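For the small-blob case, a minimal sketch of what intermingling with records might look like, assuming a hypothetical field descriptor -- the names, the union layout, and the 64-byte threshold are illustrative, not the engine's:

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical record-level blob field: a small blob is stored inline
// with the record; a large one lives elsewhere and the record keeps
// only a reference. Threshold and layout are assumptions.
constexpr size_t kInlineThreshold = 64;

struct BlobField {
    bool inlineData;   // true: payload is stored with the record itself
    uint32_t length;   // blob length in bytes
    union {
        uint8_t smallData[kInlineThreshold]; // inline payload
        uint64_t blobId; // reference into a separate blob space
    };
};

// Build the field for a payload, choosing inline vs. referenced storage.
BlobField storeBlob(const std::vector<uint8_t>& payload, uint64_t nextBlobId) {
    BlobField f{};
    f.length = static_cast<uint32_t>(payload.size());
    if (payload.size() <= kInlineThreshold) {
        f.inlineData = true;
        if (!payload.empty())
            std::memcpy(f.smallData, payload.data(), payload.size());
    } else {
        f.inlineData = false;
        f.blobId = nextBlobId; // caller writes the payload to the blob space
    }
    return f;
}
```

Under a scheme like this, a street address never costs more than a few bytes in the record, while genuinely large blobs still get their own storage.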