| Subject | Re: [Firebird-Architect] Re: Record Encoding |
| --- | --- |
| Author | Jim Starkey |
| Post date | 2005-05-15T12:20:18Z |
Roman Rokytskyy wrote:
>>Why client must know physical storage? It just gets a coded buffer and
>>description how it is coded (compressed), then client will just
>>decompress it. Or I am missing something?
>>
>>
>
>Maybe I was wrong. I meant that client library must know that what is
>sent from the server or the engine must be decompressed. I assumed
>that with Vulcan architecture application would load the DB engine
>library directly, but I suspect that this will go via the client
>library anyway. So, the implication is that we have to have two APIs -
>one for the application (our current Firebird API) and one between db
>engine and client library.
>

Here are the major points made so far:
1. Blob compression is a performance optimization that trades CPU
cycles spent decompressing for a reduction in the number of page reads.
2. Blob compression should be automatic and invisible to users.
3. Blob compression is problematic with blob seek, and an acceptable
compromise must be found, though the compromise may need to be made
on a per-blob basis.
4. Client side compression and decompression has the double benefit
of reducing the server CPU load while reducing the number of bytes
and packets transmitted over the wire.
5. Even with client side compression and decompression, the engine
must be able to decompress blobs to execute operators and UDFs.
6. Provision must be made for future rolling upgrade to better
compression schemes (see the encoding sketch after this list).
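Point 6 implies that every compressed blob, stored or on the wire, carries a self-describing tag identifying its compression algorithm, so new schemes can be introduced later without breaking older clients. Nothing in this message fixes a concrete format; the following is only a minimal sketch assuming a hypothetical one-byte tag, with zlib/deflate standing in for whatever algorithms are actually chosen.

```java
import java.util.Arrays;
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical tagged blob encoding: one algorithm byte, then the payload.
final class BlobCodec {
    static final byte RAW = 0;   // uncompressed
    static final byte ZLIB = 1;  // deflate; future schemes get new tags

    static byte[] encode(byte[] data, byte algorithm) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(algorithm);                     // self-describing tag first
        if (algorithm == RAW) {
            out.write(data, 0, data.length);
        } else {                                  // ZLIB
            Deflater deflater = new Deflater();
            deflater.setInput(data);
            deflater.finish();
            byte[] buf = new byte[8192];
            while (!deflater.finished())
                out.write(buf, 0, deflater.deflate(buf));
            deflater.end();
        }
        return out.toByteArray();
    }

    static byte[] decode(byte[] encoded) throws DataFormatException {
        if (encoded[0] == RAW)
            return Arrays.copyOfRange(encoded, 1, encoded.length);
        if (encoded[0] != ZLIB)
            throw new DataFormatException("unknown algorithm tag " + encoded[0]);
        Inflater inflater = new Inflater();
        inflater.setInput(encoded, 1, encoded.length - 1);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        while (!inflater.finished())
            out.write(buf, 0, inflater.inflate(buf));
        inflater.end();
        return out.toByteArray();
    }
}
```

An unrecognized tag is simply an error to the reader, which is what allows old and new algorithms to coexist during a rolling upgrade.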
Blob filters have not proven successful for compression, are not user
transparent, and do not provide for client side compression and
decompression. Furthermore, a "compressed" blob subtype is incompatible
with any other subtype specification, making compression and use of
other blob filters mutually exclusive.
Something to be considered is that the client doesn't readily know the
ultimate target field when a user does a setBlob or setBytes on a
prepared statement, so the client doesn't really have the information
necessary to apply differential compression. The data encoding
mechanism has even less information.
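To illustrate with a JDBC-style call: at the moment setBytes is executed, the driver sees only a parameter index and a byte array, not the column the value will eventually land in, so it cannot consult any per-field compression setting. The table and column names below are made up purely for illustration.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class ClientSideView {
    static void store(Connection conn, byte[] document) throws SQLException {
        // The client library only knows "parameter 1 of this statement";
        // whether the target column carries a per-field compression policy
        // is known to the engine, not to the client, at this point.
        try (PreparedStatement stmt =
                 conn.prepareStatement("INSERT INTO documents (body) VALUES (?)")) {
            stmt.setBytes(1, document);
            stmt.executeUpdate();
        }
    }
}
```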
The new API, unless I get terminally discouraged and give up the whole
thing, is structured in layers. The upper layers are programmer
friendly -- the existing DSQL API and a JDBC clone, though others are
possible. The lower layer is strictly message based. Compression, if
it takes place on the client, must be implemented in the data
encoding/decoding performed by the upper API layers and reflected in the
encoding itself. This, in turn, dictates that the engine and client
must have a common repertoire of compression algorithms or that each
dynamically restricts itself to a common subset.
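The last sentence implies a handshake: either both sides share a fixed repertoire, or they dynamically agree on the intersection of what each understands. A minimal sketch of the "common subset" idea follows; the enum values and the notion of exchanging them at attach time are illustrative assumptions, not the actual protocol.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical negotiation of a shared compression repertoire.
enum CompressionAlgorithm { NONE, ZLIB, FUTURE_SCHEME }

class CompressionNegotiation {
    /** The usable repertoire is simply the intersection of both sides' sets. */
    static Set<CompressionAlgorithm> commonSubset(Set<CompressionAlgorithm> client,
                                                  Set<CompressionAlgorithm> server) {
        EnumSet<CompressionAlgorithm> shared = EnumSet.copyOf(client);
        shared.retainAll(server);
        shared.add(CompressionAlgorithm.NONE);   // uncompressed always works
        return shared;
    }

    public static void main(String[] args) {
        Set<CompressionAlgorithm> client = EnumSet.of(CompressionAlgorithm.NONE,
                                                      CompressionAlgorithm.ZLIB);
        Set<CompressionAlgorithm> server = EnumSet.allOf(CompressionAlgorithm.class);
        // Either side may only encode blobs with an algorithm in this set.
        System.out.println(commonSubset(client, server));
    }
}
```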
The client choice of compression does not dictate what the server must
do. The server must have the ability to compress an uncompressed blob,
decompress a compressed blob, or decompress and recompress with a
different algorithm based on per-field or per-database settings. Also,
the engine cannot encode a blob with a compression algorithm not
understood by the client.
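Put as pseudocode, the server's choice for each blob it ships might look like the sketch below. The policy model (a single preferred algorithm per field) and all names are assumptions made for illustration, not engine internals.

```java
import java.util.Set;

// Hypothetical server-side decision: given how a blob is stored, the
// per-field/per-database policy, and what the client can decode, pick the
// encoding to send on the wire.
class ServerBlobPolicy {
    enum Algorithm { NONE, ZLIB }

    static Algorithm chooseWireEncoding(Algorithm stored,
                                        Algorithm fieldPolicy,
                                        Set<Algorithm> clientUnderstands) {
        // Preferred choice is whatever the field or database policy asks for...
        Algorithm wanted = fieldPolicy;
        // ...but the engine may never send an algorithm the client cannot decode.
        if (!clientUnderstands.contains(wanted))
            wanted = Algorithm.NONE;
        // If the stored form differs from the wire form, the engine transcodes:
        // decompress 'stored' and recompress as 'wanted'; otherwise send as is.
        return wanted;
    }
}
```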
The requirements of end-to-end compression make user-supplied
compression libraries infeasible. They also make compromises to support
blob seek problematic.
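The seek problem (point 3 above) comes down to this: a plain compressed stream has no random access, so positioning within the blob means inflating everything up to the target offset. A small sketch of that cost, using zlib only as a stand-in; chunked or indexed schemes can restore random access but give up some compression.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.zip.InflaterInputStream;

// "Seeking" in a deflate stream degenerates into sequential decompression.
class CompressedSeek {
    static int byteAt(byte[] zlibCompressedBlob, long offset) throws IOException {
        try (InflaterInputStream in =
                 new InflaterInputStream(new ByteArrayInputStream(zlibCompressedBlob))) {
            long remaining = offset;
            while (remaining > 0) {
                long skipped = in.skip(remaining);   // still inflates the skipped bytes
                if (skipped <= 0)
                    throw new IOException("offset past end of blob");
                remaining -= skipped;
            }
            return in.read();                        // the byte at 'offset'
        }
    }
}
```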