| Subject | Re: [firebird-support] Varchar size overhead |
| --- | --- |
| Author | Ann W. Harrison |
| Post date | 2009-01-21T21:36:36Z |
Douglas Tosi wrote:
>> 2.) 127 -> 2 (from a friend)
>> 3.) 128 -> 2 (Ivan)
>> 4.) 256 -> 2 (Ann Harrison, message below),
>
Good luck following the compression code - it's used a lot,
so it's very tight - and not very commented.
Here's the algorithm; you do the counting on your fingers.
When a record is stored, it is first buffered at its full
declared length - meaning that a varchar(1000) uses 1002
bytes (the 1000 data bytes plus a two-byte length word), a
char(1000) uses 1000 bytes, a double uses 8 bytes, etc.
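As a sketch of that buffering step - an illustration of where
the extra two bytes come from, not Firebird's actual source:

```c
/* Minimal sketch of the full-declared-length buffering described
 * above; the length-word layout is an assumption drawn from the
 * 1002-vs-1000 arithmetic, not a copy of Firebird's internals. */
#include <stdint.h>
#include <string.h>

/* varchar(n) buffers to n data bytes plus a 2-byte length word. */
size_t buffered_varchar_size(size_t declared)
{
    return sizeof(uint16_t) + declared;      /* 1000 -> 1002 */
}

/* Lay a value into its full-length slot; the unused tail is filled
 * with zeros here (what Firebird actually writes there is its own
 * business). */
void buffer_varchar(uint8_t *slot, uint16_t declared,
                    const char *val, uint16_t actual)
{
    memcpy(slot, &actual, sizeof actual);    /* 2-byte length word */
    memcpy(slot + 2, val, actual);           /* actual data */
    memset(slot + 2 + actual, 0, (size_t)(declared - actual));
}
```

With that layout, a short value in a wide varchar column leaves a
long tail of identical padding bytes, which is exactly the kind of
run the compression step below collapses.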
Then Firebird does run-length compression on that buffer.
The first byte is a count. If it's positive <n>, it means
that the next <n> bytes are part of the record. If it's
negative, it means that the next byte is to be replicated
<n * -1> times.
Thus the string "abcddddef" becomes <3>abc<-4>d<2>ef.
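Here is a minimal sketch of that scheme - assuming a 127 cap on
counts and a three-byte minimum before a repeat pays off, neither
of which the post pins down; it is the algorithm as described, not
Firebird's actual code:

```c
#include <stdio.h>
#include <string.h>

/* Encode `len` bytes from `in` into `out` using a signed count byte:
 * positive <n> prefixes n literal bytes, negative <n> means the next
 * byte repeats -n times.  Returns the encoded length.  `out` must be
 * large enough (worst case roughly len + len/127 + 1 bytes). */
size_t rle_encode(const unsigned char *in, size_t len, signed char *out)
{
    size_t i = 0, o = 0;
    while (i < len) {
        /* Measure the run of identical bytes starting at i. */
        size_t run = 1;
        while (i + run < len && in[i + run] == in[i] && run < 127)
            run++;
        if (run >= 3) {                     /* repeat: <-run><byte> */
            out[o++] = (signed char)-(int)run;
            out[o++] = (signed char)in[i];
            i += run;
        } else {                            /* literal stretch */
            size_t start = i, lit = 0;
            while (i < len && lit < 127) {
                /* stop when a 3+ run begins; it gets its own count */
                if (i + 2 < len &&
                    in[i] == in[i + 1] && in[i] == in[i + 2])
                    break;
                i++;
                lit++;
            }
            out[o++] = (signed char)lit;
            memcpy(out + o, in + start, lit);
            o += lit;
        }
    }
    return o;
}

int main(void)
{
    const char *s = "abcddddef";
    signed char out[32];
    size_t n = rle_encode((const unsigned char *)s, strlen(s), out);

    /* Print the result in the <count> notation used above. */
    for (size_t i = 0; i < n; ) {
        int c = out[i++];
        printf("<%d>", c);
        if (c < 0)
            putchar(out[i++]);              /* the repeated byte */
        else
            while (c--)
                putchar(out[i++]);          /* literal bytes */
    }
    putchar('\n');
    return 0;
}
```

Compiled and run, it reproduces the <3>abc<-4>d<2>ef encoding of
"abcddddef" given above.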
Best,
Ann