Subject: Re: [Firebird-Architect] Compression
Author: Thomas Steinmaurer
> 27.09.2011 10:18, Thomas Steinmaurer wrote:
>>> Records are already compressed as delta + RLE. Page level compression is pointless
>>> because a page of variable size is a headache.
>>
>> The page size on disk would still be fixed, with compressed data inside.
>
> And this is exactly why page compression is pointless: you are writing the full page
> size to disk anyway, so there is no difference between a page with uncompressed data
> and a page with compressed data + some filling zeros.

Ehm, doesn't more compressed data fit into a page than uncompressed
data does?
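
To put a number on it, here is a naive run-length-encoding sketch in
Python (just an illustration of the idea, not Firebird's actual
delta + RLE record format):

    def rle_compress(data: bytes) -> bytes:
        """Naive run-length encoding: (count, byte) pairs, run capped at 255."""
        out = bytearray()
        i = 0
        while i < len(data):
            run = 1
            while i + run < len(data) and data[i + run] == data[i] and run < 255:
                run += 1
            out += bytes((run, data[i]))
            i += run
        return bytes(out)

    # A record with 200 bytes of trailing zeros (think NULL/padded columns).
    record = b"Steinmaurer" + b"\x00" * 200
    packed = rle_compress(record)
    print(len(record), "->", len(packed))   # 211 -> 24 bytes

    # On a fixed 8K page, smaller records simply mean more records per page.
    PAGE = 8192
    print(PAGE // len(record), "vs", PAGE // len(packed), "records per page")

So with the page size fixed, the win is not a smaller page but fewer
pages for the same data.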

>>> I have no idea what "table level" compression is.
>>
>> Compression of a particular table instead of the entire database/pages.
>
> It sounds like record compression turned on for particular tables only. Nice for Oracle,
> which didn't have compression before, but strange for FB, which has always compressed data.

Yes, but possibly there is room for improvement. The following is very
simplified: imagine I have a database with one table, currently 560 MB
in size. Compressed with a file compressor, it is cut down to 160 MB.
So, a very naive observation is that a full-table scan for a
SELECT COUNT(*) would have to read much less from disk than with the
uncompressed version, and disk I/O is still the bottleneck of a lot of
systems out there (talking about mechanical drives here, not SSDs).
This could be a net win, because CPU/SMP power keeps increasing
steadily. If the compression/decompression algorithm can utilize an SMP
environment, even better. And if compression applies not only to disk
but also to network bandwidth, better still.
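
A back-of-envelope calculation (the disk and decompression throughput
figures below are assumptions, not measurements):

    # All figures are assumptions for illustration:
    DISK_MB_S   = 100   # mechanical-drive sequential read
    DECOMP_MB_S = 400   # single-core decompression throughput

    plain_scan = 560 / DISK_MB_S                            # 5.6 s
    comp_scan  = 160 / DISK_MB_S + 560 / DECOMP_MB_S        # 1.6 + 1.4 = 3.0 s
    comp_smp   = 160 / DISK_MB_S + 560 / (DECOMP_MB_S * 4)  # 4 cores: ~1.95 s

    print(f"uncompressed: {plain_scan:.1f} s, compressed: {comp_scan:.1f} s, "
          f"compressed + 4 cores: {comp_smp:.2f} s")

Even after paying the CPU cost for decompression, the scan could come
out ahead, and the gap widens as cores get cheaper relative to disks.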

In the NoSQL world, where TBs/PBs of data are stored in distributed
storage systems like Hadoop/HBase, compression is ALWAYS recommended
for the sake of disk usage and bandwidth usage. Why not investigate
that principle for single-node RDBMS installations?
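
A toy illustration of the principle with zlib (the sample data is
artificial; real ratios depend entirely on the data): compress once on
write, and the same bytes are saved both on disk and on the wire:

    import zlib

    # Repetitive sample row data, standing in for a table's pages.
    rows = b"2011-09-27;Firebird;compression;0.00;" * 1000

    packed = zlib.compress(rows, 6)
    print(f"{len(rows)} -> {len(packed)} bytes "
          f"({len(packed) / len(rows):.0%}) on disk AND on the wire")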

Regards,
Thomas