Subject RE: Re[2]: [Firebird-Architect] blobs causes table fragmentation.
Author Leyne, Sean
Dmitry,

> >>As I understand current behavior, blobs smaller than 256 bytes are
> >>stored on the same data page, in the record.
>
> AWH> That's not correct. Any blob that will fit on a data page is
> AWH> stored on a data page, preferably the page with the record,
> AWH> assuming it has space. There's no hard coded limit.
>
> This is bad, and I think I'm right in suggesting this feature.

Without empirical evidence to back up your request, the request is no
better than the current implementation -- a bad idea.

You can appreciate that "gut intuition" is not enough to have a feature
added to the engine; proof/analysis is required.


> AWH> Storing small blobs on page with the record reduces I/O if the
> AWH> blob data is referenced -- and small blobs are more likely to be
> AWH> referenced than large blobs.
>
> Client components don't distinguish between large and small blobs.
> Remember the BDE: using SELECT * FROM TABLE (it's the way to hell,
> of course) loads all blobs in that table into the cache.
> So it depends on the client components/driver: they either load all
> blobs into the cache or don't load any blobs at all.

But what does the client operation have to do with how the data is
physically stored in the database file?

Based on this observation, I would argue that the current approach of
fitting data into the row data page would be preferred, since the blob
would be resolved without needing an additional disk read.
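
To make the trade-off concrete, here is a toy model, not a measurement.
The page size, record size, blob size and row count are all made-up
numbers, and the model ignores page headers, record overhead,
compression and caching. It only counts page reads for two cases: a full
scan that never touches the blob column, and a single-record lookup that
does fetch the blob.

```python
import math

# All figures below are hypothetical, chosen only for illustration.
PAGE_SIZE = 4096   # bytes per page
ROW_SIZE = 100     # bytes per record, excluding the blob
BLOB_SIZE = 200    # bytes per (small) blob
N_ROWS = 10_000


def data_pages(n_rows, bytes_per_row, page_size=PAGE_SIZE):
    """Pages needed to hold n_rows when each row occupies bytes_per_row."""
    rows_per_page = page_size // bytes_per_row
    return math.ceil(n_rows / rows_per_page)


def scan_reads(inline):
    """Page reads for a full scan that ignores the blob column."""
    # With inline blobs, every record drags its blob onto the data page.
    per_row = ROW_SIZE + (BLOB_SIZE if inline else 0)
    return data_pages(N_ROWS, per_row)


def lookup_reads(inline):
    """Page reads to fetch one record plus its blob, with a cold cache."""
    # Inline: the blob sits on the page already read for the record.
    # Separate: one more read is needed for the blob page.
    return 1 if inline else 2


for inline in (True, False):
    label = "inline" if inline else "separate"
    print(label, scan_reads(inline), lookup_reads(inline))
# prints:
# inline 770 1
# separate 250 2
```

In this model the inline layout wins on lookups that use the blob and
loses on scans that don't, so which layout is better depends entirely
on the query mix. That is why a real test case is needed.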


> >>There are some systems that store blobs of different sizes,
> >>so blob data is spread between data and blob pages.
> >>This makes the table more "fragmented" and seriously slows down
> >>record retrieval if the 'select' does not select blob fields.
>
> AWH> Do you have any evidence of that? Other than a hypothesis
> AWH> based on a (questionable) understanding of the implementation?
>
> I had some when this idea came up during Yaffil tests. Right now I
> don't have a test case, sorry.
>
> AWH> based on a (questionable) understanding of the implementation?
>
> Why not? SQZ_block in Yaffil was made on the same basis. It was
> made (very few code changes), tested, and produced good results.

I'm not familiar with the Yaffil implementation, but I'm sure we'd all be
interested in the details of the implementation as well as the testing
methodology and metrics/results.


Sean