Subject Re: [Firebird-Architect] Compression
Author Thomas Steinmaurer
> On Tue, 27/09/2011 at 10:45 +0200, Dimitry Sibiryakov wrote:
>> 27.09.2011 10:33, Alexander Peshkov wrote:
>>> Must say that, for example, NTFS supports writing compressed pages to
>>> disk and does not pad them with zeros; instead it puts more data on the page.
>> Well, when you say that, I think that we don't need to duplicate this functionality,
>> and everybody who wants the database file to be compressed can just turn NTFS compression on.
> This does not work as expected - with an NTFS-compressed file we can't
> read a physical block from disk.
>>> We could probably do something like this: when adding a new
>>> record/version to the page, compress it, and when the result does not
>>> fit, use another page for that record/version.
>> Isn't that exactly the way the engine already works? AFAIK, if a compressed record
>> doesn't fit into the free space on the primary page, it is chained to another page.
> No. Compressing the whole page can be more efficient than compressing record by record.
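To illustrate the point about whole-page compression: here is a toy sketch, using zlib from the Python standard library as a stand-in codec (not whatever the engine would actually use), showing why compressing a page of similar records together tends to beat compressing each record on its own - the codec can exploit redundancy shared across records.

```python
import zlib

# Hypothetical records with lots of cross-record redundancy,
# as is typical for rows on one data page.
records = [f"customer-{i:04d};status=active;region=EU".encode()
           for i in range(100)]

# Record-by-record: each record is compressed in isolation,
# so the shared structure is paid for 100 times (plus per-stream overhead).
per_record = sum(len(zlib.compress(r)) for r in records)

# Whole-page: one compression stream over the concatenated records
# lets back-references span record boundaries.
whole_page = len(zlib.compress(b"".join(records)))

print(f"per-record total: {per_record} bytes, whole page: {whole_page} bytes")
```

With data like this the whole-page figure comes out far smaller than the per-record total; the trade-off, as discussed above, is that a modified record may no longer fit once the page is recompressed.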
>> Well, we are currently investigating various compression options for an
>> Oracle installation, and a whitepaper discusses that the CPU overhead for
>> compression/decompression is minimal etc ...
> BTW, can someone provide a link to that whitepaper? As far as I knew
> before, use of any decompression algorithm except RLE is slower than
> reading data from disk. This is the primary reason why we still use RLE.

I'm by far no expert on compression algorithms, but I'm currently involved
in a Hadoop/HBase cluster project storing TB/PB of measurement values.

While until now LZO was the preferred algorithm (even despite its
license/deployment restrictions), Google has released their "Snappy"
algorithm, used in their BigTable implementation. It offers pretty much
the same compression ratio as LZO, but outperforms LZO performance-wise.

See here: