Subject Re: [Firebird-Architect] Re: Record Encoding
Author Dmitry Yemanov
"Jim Starkey" <jas@...> wrote:
>
> 1. Blob compression is a performance optimization that trades CPU
> cycles to decompress to reduce the number of page reads.
> 2. Blob compression should be automatic and invisible to users

Agreed.

> 3. Blob compression is problematic with blob seek, and an acceptable
> compromise must be found, though the compromise may require a
> per-blob compromise

Let's discuss the possible compromises.

> 4. Client side compression and decompression has the double benefit
> of reducing the server CPU load while reducing the number of byte
> and packets transmitted over the wire

First, this means that no pluggable algorithms are possible, as you already
mentioned. Personally, I can live with this. Second, it raises issues for
single-PC usage, including the embedded case. Why do I need good
over-the-wire compression if I have no wire at all? Did you also consider
e.g. Pocket PC usage (with much slower CPUs)?

> 5. Even with client side compression and decompression, the engine
> must be able to decompress blobs to execute operators and UDFs.

Agreed. Should temporary blobs also be compressed when sent to the
client?

> 6. Provision must be made for future rolling upgrade to better
> compression schemes

Of course.

> The client choice of compression does not dictate what the server must
> do. The server must have the ability to compress an uncompressed blob,
> decompress a compressed blob, or to decompress and recompress with a
> different algorithm based on per field or per database settings. Also,
> the engine cannot encode a blob with a compression algorithm not
> understood by the client.

Too many built-in [de]compression algorithms complicate the client library.
Do you want them to be linked in or loaded dynamically (i.e. as some
predefined set of plugins)?
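
To make the constraint in the quoted paragraph concrete, here is a minimal
sketch of how the server might pick the encoding for a blob it sends to a
client. Everything in it is an assumption for illustration: the function
name, the idea that the client reports a list of algorithms it understands,
and the algorithm names themselves are placeholders, not an existing
Firebird API.

```python
from typing import Optional, Sequence


def choose_wire_algorithm(stored: Optional[str],
                          client_supported: Sequence[str],
                          server_preference: Sequence[str]) -> Optional[str]:
    """Pick the encoding for a blob sent to the client.

    `stored` is the algorithm the blob is stored with (None = uncompressed).
    Returns the algorithm to send with, or None to send uncompressed.
    All names here are hypothetical.
    """
    # Cheapest case: the blob is already in a form the client understands,
    # so the server can pass it through untouched.
    if stored is not None and stored in client_supported:
        return stored
    # Otherwise the server has to (de/re)compress: take the first algorithm
    # in its own preference order that the client also understands.
    for algo in server_preference:
        if algo in client_supported:
            return algo
    # Nothing in common: the only safe choice is to send uncompressed.
    return None
```

Note that every algorithm listed in client_supported has to live in the
client library one way or another, which is exactly the linking question
above.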

> The requirements of end to end compression make user supplied
> compression libraries infeasible.

Agreed.

> It also makes compromises to support
> blob seek problematic.

And this is what we must solve.

I see only one solution: add a new blob type (independent of
streamed/segmented). I mean a new BPB option: compressed/not compressed.
The default behaviour (when no option is specified) is either hardcoded or
settable via the config file. With the default set to compressed, we get all
the benefits of Jim's proposal while keeping other usage options possible.
This solution covers all needs and costs us virtually nothing from the
implementation POV.
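
A rough sketch of the resolution order I have in mind, in Python for
brevity. The enum, the function name and the parameter names are all
hypothetical, and the sketch assumes the config file setting (if present)
overrides the hardcoded default, which is only one of the two variants
mentioned above.

```python
from enum import Enum
from typing import Optional


class BlobCompression(Enum):
    """Hypothetical values for the proposed BPB option."""
    COMPRESSED = "compressed"
    UNCOMPRESSED = "uncompressed"


def resolve_compression(
    bpb_option: Optional[BlobCompression],
    config_default: Optional[BlobCompression] = None,
    hardcoded_default: BlobCompression = BlobCompression.COMPRESSED,
) -> BlobCompression:
    """Decide whether a new blob is compressed."""
    # An explicit BPB option always wins.
    if bpb_option is not None:
        return bpb_option
    # Assumption: a config file setting overrides the built-in default.
    if config_default is not None:
        return config_default
    # Nothing specified anywhere: fall back to the hardcoded behaviour.
    return hardcoded_default
```

An application that needs blob seek would pass the "not compressed" option
explicitly; everybody else gets the default without changing any code.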

Questions:

1) What should a blob seek do for a compressed blob? Decompress in the
server's memory or return an error?
2) How should temporary blobs be transferred? Inherit the compress option
from their persistent source? Use a config setting?
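
To illustrate the two alternatives in question 1, here is a toy model. It
uses Python's zlib only as a stand-in for whatever algorithm is chosen; the
class, the exception and the decompress_on_seek flag are hypothetical names
made up for this sketch and do not describe engine code.

```python
import zlib


class BlobSeekError(Exception):
    """Raised when seek is refused on a compressed blob."""


class CompressedBlob:
    def __init__(self, data: bytes) -> None:
        self._stored = zlib.compress(data)  # what would sit on the pages
        self._plain = None                  # filled on first decompression
        self._pos = 0

    def _materialize(self) -> bytes:
        # Decompress the whole blob into server memory, once.
        if self._plain is None:
            self._plain = zlib.decompress(self._stored)
        return self._plain

    def seek(self, offset: int, decompress_on_seek: bool) -> int:
        if not decompress_on_seek:
            # Alternative B: simply refuse the operation.
            raise BlobSeekError("seek is not supported on a compressed blob")
        # Alternative A: pay the memory cost and seek in the plain data.
        plain = self._materialize()
        self._pos = max(0, min(offset, len(plain)))
        return self._pos

    def read(self, length: int) -> bytes:
        plain = self._materialize()
        chunk = plain[self._pos:self._pos + length]
        self._pos += len(chunk)
        return chunk
```

Alternative A keeps existing applications working but the whole blob ends
up in memory for the first seek; alternative B is cheap but forces such
applications to ask for an uncompressed blob.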


Dmitry