firebird-architect - Re: [Firebird-Architect] Compression

Subject	Re: [Firebird-Architect] Compression
Author	Thomas Steinmaurer
Post date	2011-09-27T09:16:01Z

> On Tue, 27/09/2011 at 10:45 +0200, Dimitry Sibiryakov wrote:
>> 27.09.2011 10:33, Alexander Peshkov wrote:
>>> Must say that for example NTFS supports writing compressed pages to disk
>>> and does not fill it with 0, instead puts more data to the page.
>>
>> Well, when you put it that way, I think that we don't need to duplicate this functionality
>> and everybody who wants the database file to be compressed can just turn NTFS compression on.
>>
>
> This does not work as expected - with an NTFS-compressed file we can't read
> a physical block from disk.
>
>>> We
>>> can probably do something like this - when adding a new record/version to
>>> the page, compress it, and when the result does not fit, use another page for
>>> that record/version.
>>
>> Isn't that exactly what the engine already does? AFAIK, if a compressed record doesn't
>> fit into the free space on the primary page, it is chained to another page.
>>
>
> No. Compressing a whole page can be more efficient than compressing record by record.
>
>
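The whole-page versus record-by-record point quoted above can be illustrated with a small sketch. It is only an illustration under loud assumptions: zlib stands in for whatever algorithm an engine would actually use, and the record layout, `make_records`, `per_record_size` and `whole_page_size` are hypothetical names invented for this example, not anything from the Firebird code base.

```python
import zlib


def make_records(count=50):
    # Hypothetical fixed-layout records: an id, a repeated status string,
    # a low-cardinality name and some blank padding.
    return [
        f"{i:08d}|ACTIVE|customer-{i % 5}|".encode() + b" " * 20
        for i in range(count)
    ]


def per_record_size(records):
    # Each record is compressed on its own, so redundancy shared
    # between records cannot be exploited.
    return sum(len(zlib.compress(r)) for r in records)


def whole_page_size(records):
    # All records are compressed as one buffer, so repeated field
    # values across records are encoded once and then referenced.
    return len(zlib.compress(b"".join(records)))


records = make_records()
raw_size = sum(len(r) for r in records)
print("raw bytes:        ", raw_size)
print("per-record bytes: ", per_record_size(records))
print("whole-page bytes: ", whole_page_size(records))
```

On data like this the whole-page figure should come out clearly smaller, because most of the redundancy is between records rather than inside any single one. The trade-off, not shown here, is that a whole page has to be decompressed to read one record.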
>> Well, we are currently investigating various compression options for
>> an Oracle installation and a whitepaper discusses that CPU overhead for
>> compression/decompression is minimal etc ...
>
> BTW, can someone provide a link to that white paper? As far as I
> know, any decompression algorithm except RLE is slower than
> reading data from disk. This is the primary reason why we still use RLE.

I'm by no means an expert on compression algorithms, but I'm currently involved
in a Hadoop/HBase cluster project storing TB/PB of measurement values.

Until now LZO was the preferred algorithm (even with its
license/deployment restrictions), but Google has released their "Snappy"
algorithm, used in their BigTable implementation. It offers pretty much
the same compression ratio as LZO, but it outperforms LZO speed-wise.

See here:
http://ofps.oreilly.com/titles/9781449396107/performance.html#tbl_compressioncomp

and

http://agaoglu.tumblr.com/post/4605524309/lzo-vs-snappy-vs-lzf-vs-zlib-a-comparison-of
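
For anyone who wants to run this kind of comparison on their own data, here is a minimal sketch of a measurement harness. Everything in it is an assumption made for illustration: the RLE below is a generic (count, byte) scheme and not Firebird's actual record format, zlib stands in for LZO/Snappy (which need third-party bindings), and `measure` and the sample data are hypothetical. Pure-Python timings say nothing about native implementations; the sketch only shows the shape of the measurement.

```python
import time
import zlib


def rle_encode(data: bytes) -> bytes:
    # Generic run-length encoding: each run becomes (count, byte),
    # with runs capped at 255 so the count fits in one byte.
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and run < 255 and data[i + run] == data[i]:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)


def rle_decode(blob: bytes) -> bytes:
    # Expand each (count, byte) pair back into a run.
    out = bytearray()
    for j in range(0, len(blob), 2):
        out += bytes([blob[j + 1]]) * blob[j]
    return bytes(out)


def measure(name, compress, decompress, data, rounds=20):
    # Report compression ratio and decompression throughput for one codec.
    blob = compress(data)
    start = time.perf_counter()
    for _ in range(rounds):
        restored = decompress(blob)
    elapsed = max(time.perf_counter() - start, 1e-9)
    assert restored == data  # round trip must be lossless
    return {
        "name": name,
        "ratio": len(data) / len(blob),
        "mb_per_s": len(data) * rounds / elapsed / 1e6,
    }


# Hypothetical sample: a short key, a zero-padded field and a value, repeated.
sample = (b"sensor-01;" + b"\x00" * 40 + b"23.5;") * 2000

results = [
    measure("rle", rle_encode, rle_decode, sample),
    measure("zlib", zlib.compress, zlib.decompress, sample),
]
for r in results:
    print(f"{r['name']:5s} ratio={r['ratio']:.2f} decompress={r['mb_per_s']:.1f} MB/s")
```

To judge the "slower than reading from disk" question, the decompression throughput would have to be compared against the measured read throughput of the storage in question, using a native implementation of each algorithm.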

Regards,
Thomas