| Subject | RE: [Firebird-Architect] Blob levels |
|---|---|
| Author | Steve Summers |
| Post date | 2005-10-13T17:08:30Z |
Dmitry wrote:
All,
Now we have three blob levels (0, 1, 2), which define how the blob is physically stored. Level 0 means on-page data, level 1
contains pointers to data pages, and level 2 contains pointers to pointers to data pages. Hence the maximum blob size is
somewhere near:
(page_size / 4) * (page_size / 4) * page_size
which means ~64MB for 1K pages and ~4GB for 4K pages.
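To make the arithmetic concrete, here is a small standalone sketch that works the formula through for 1K, 4K and 16K pages, assuming 4-byte page numbers as in the formula above (illustrative only, not the engine's actual code):

```cpp
// Sketch: level-2 blob capacity, assuming each pointer page holds
// page_size / 4 four-byte page numbers (illustrative, not engine code).
#include <cstdint>
#include <cstdio>

static std::uint64_t max_level2_blob(std::uint64_t page_size)
{
    const std::uint64_t ptrs = page_size / 4;  // page numbers per pointer page
    // up to ptrs pointer pages, each addressing ptrs data pages of page_size bytes
    return ptrs * ptrs * page_size;
}

int main()
{
    for (std::uint64_t ps : {1024, 4096, 16384})
        std::printf("%6llu-byte pages -> max blob %llu bytes\n",
                    static_cast<unsigned long long>(ps),
                    static_cast<unsigned long long>(max_level2_blob(ps)));
    // 1K pages  ->      67,108,864 bytes (~64 MB)
    // 4K pages  ->   4,294,967,296 bytes (~4 GB)
    // 16K pages -> 274,877,906,944 bytes (~256 GB)
    return 0;
}
```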
I don't know whether this restriction is documented anywhere. But the real problem is that the code never checks whether
a level 2 blob is being overfilled. Just try to load a 65MB blob into a database with a 1K page size and the engine bugchecks.
I've just added an overflow check to v2.0 to report a proper error message instead. But I have a question: do we consider this
limitation okay? Or could we extend the current scheme to level 3, level 4, etc. blobs to address bigger data? Of course,
this talk is not about v2.0.
Opinions?
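A minimal sketch of the kind of check described here, with purely illustrative names (not the actual v2.0 change):

```cpp
// Hypothetical sketch of a pre-write overflow check: refuse to grow a blob
// past the level-2 limit and raise a normal error instead of letting the
// engine bugcheck. Names are illustrative only, not Firebird's internal API.
#include <cstdint>
#include <stdexcept>

void check_blob_size(std::uint64_t new_size, std::uint64_t page_size)
{
    const std::uint64_t ptrs = page_size / 4;        // 4-byte page numbers per pointer page
    const std::uint64_t max_size = ptrs * ptrs * page_size;
    if (new_size > max_size)
        throw std::runtime_error("blob too big for this page size");
}
```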
----
Isn't the page size limit now 16K? If so, the absolute limit with the current scheme is something like 256 gigabytes, right?
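(Working the formula above through for 16K pages: (16384 / 4) * (16384 / 4) * 16384 = 4096 * 4096 * 16384 = 2^38 bytes, i.e. 256 GB.)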
I hate to disagree with the architect, but as a developer I have a difficult time imagining a real-world use case for
saving an individual attachment several times larger than the entire Star Wars series recorded in high definition as one
giant file. Even with the fastest hardware available today, the engine would take at least several minutes just to extract that
much data from the drive, pretty much locking the system up from doing anything else the whole time. That doesn't sound like a
particularly good design to me. I'd break it up into separate movies, at least.
Now, I do see real-world cases where 64MB isn't enough, but if I have an application that stores attachments that can get
anywhere near 64MB, let alone beyond, I'd be stupid to select the smallest page size and maximize the page-management overhead,
wouldn't I? Or am I missing something?
I agree that arbitrary limits are bad, and that limits which force applications using the database engine into more complex
designs just to work around them are really bad. But being "forced" to set the page size to more than 4K if I'm storing a
DVD collection and 4GB might sometimes not be enough doesn't seem like a "more complex design" for my application; it seems like
a sensible design decision.
Wouldn't the time required to extend the three-level scheme to four or more levels, just so that stupid design decisions
(e.g., setting the page size to 1K for a database full of huge attachments) still work, be better spent on features that are
actually useful?