Subject Re: [Firebird-Architect] blobs causes table fragmentation.
Author Alexandre Kozlov
>
> At 02:44 AM 10/4/2004, Dmitry Kuzmenko wrote:
>
> >As I understand current behavior, blobs less than 256 bytes are
> >stored at the same data page, in record.
>
> That's not correct. Any blob that will fit on a data page is
> stored on a data page, preferably the page with the record,
> assuming it has space. There's no hard coded limit.
>
>
> Storing small blobs on page with the record reduces I/O if the
> blob data is referenced -- and small blobs are more likely to be
> referenced than large blobs.

And this can be resolved by placing the BLOB data in a different table.

>
> > If blob is bigger it
> >is stored at blob page, not data page.
>
> As Pavel explained, there are three levels of blob storage and
> each of them stores something on page.
>
> >There are some systems that store blobs of different sizes,
> >so blob data is spread between data and blob pages.
> >This makes the table more "fragmented" and seriously slows down
> >record retrieval if the 'select' does not select blob fields.

Absolutely. This is quite clear.
Suppose:
1) you have just started the database (so the cache is still empty);
2) your database is very large, mostly due to BLOBs, which occupy, say, 95%
of the space on disk;
3) a simple "SELECT COUNT(*) FROM LargeTable" may then take hours (of
course, once the cache has warmed up it will take only about 1 minute).
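
To make the scenario above concrete, here is a rough back-of-envelope sketch in Python. It is only an illustration: the page count and the per-page read cost are made-up assumptions, not measurements, and the helper names are hypothetical. It models the worst case, in which all blob data sits on pages the scan has to walk through, and it deliberately ignores the different levels of blob storage mentioned in the quoted text.

```python
# Back-of-envelope model of a cold-cache full table scan.
# Every number below is an illustrative assumption, not a measurement.

def pages_to_scan(total_pages: int, blob_fraction: float, blobs_inline: bool) -> int:
    """Number of pages a full scan reads to visit every record.

    blobs_inline=True  -> worst case: blob data is interleaved with the
                          records, so the scan walks through all pages.
    blobs_inline=False -> record data is packed on its own pages and the
                          blob pages are never touched by the scan.
    """
    if blobs_inline:
        return total_pages
    return round(total_pages * (1.0 - blob_fraction))


def cold_scan_seconds(pages: int, seconds_per_page: float) -> float:
    """Scan time when every page read misses the cache."""
    return pages * seconds_per_page


TOTAL_PAGES = 10_000_000   # hypothetical database size, in pages
BLOB_FRACTION = 0.95       # the "95% of space" figure from the scenario
SECONDS_PER_PAGE = 0.005   # assumed cost of one uncached page read

mixed = pages_to_scan(TOTAL_PAGES, BLOB_FRACTION, blobs_inline=True)
split = pages_to_scan(TOTAL_PAGES, BLOB_FRACTION, blobs_inline=False)

print("pages read, blobs mixed with records:", mixed)
print("pages read, blobs kept separately:   ", split)
print("cold scan, mixed: %.1f hours" % (cold_scan_seconds(mixed, SECONDS_PER_PAGE) / 3600))
print("cold scan, split: %.1f minutes" % (cold_scan_seconds(split, SECONDS_PER_PAGE) / 60))
```

Under these assumptions the scan reads 20 times fewer pages when the blob data is kept apart from the records. The model counts pages only; it says nothing about seek patterns or read-ahead, which can change the real difference considerably.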

>
> Do you have any evidence of that? Other than a hypothesis
> based on a (questionable) understanding of the implementation?
> It's been some years, but when last I compared our blob
> performance with other systems, it was pretty good.

Ann, this is not only a theory. I have had, and still have, such a situation. I
simulated separating the searchable fields from the BLOBs (by putting the BLOBs
into a different table and then backing up and restoring the database onto an
empty disk). The result: just after starting the database, searching became
almost 1000 times faster than with fragmentation. Of course, I am talking only
about performance "just after the database starts", when the cache is still
empty, but that period can be too long, depending on the database size. And I
believe that, even leaving aside the slow work right after startup, the ability
to keep searchable data and BLOBs separately will give you not only an
advantage in maintenance but also some improvement in performance.

Alexander

>
> >The suggestion is - to add some header page flag that
> >will disable blob storing at data page.
>
> Let's be sure we have a problem before rushing to solutions.
>
>
>
> Regards,
>
> Ann
> www.ibphoenix.com
> We have answers.