Subject "Wire protocol" for long content
Author paulruizendaal
The exchange below between Roman and Sean occurred in the thread about
blobs causing table fragmentation. I repeat it here because I think
it touches on a subject that was discussed last summer on FB-Arch.

>> Current behavior is even more flawed when we consider the API.
>> When we do want to improve the performance for small BLOBs that
>> are already loaded into memory, why do we require additional
>> network call to access BLOB's content? Why don't we send its
>> content right away to the client? Currently we fetch records that
>> contain BLOB ids, and then for each ID we have to load BLOB
>> segments (in this case that is usually one segment)...
> I agree that the current situation is less than ideal.
> But I need to point out: The problem you describe is with the
> API/wire protocol! Not with the manner in which the data is stored
> in the database!!
> If a new wire-protocol allowed for the return of Blobs with the row
> data (perhaps to a defined size) then there would be no problem
> with the current storage approach.

Last summer we discussed the 'problem' of having to declare a string
length for UDFs returning a string. This length did not matter much,
other than putting an upper limit on the length of the string if it is
used in the select list of a query. The way I understand it, this is
because the length is used to define the size of the message port /
wire message. At the time I suggested that one way to solve it was to
send strings of unknown length just like blobs -- put an id in the
message port and fetch segments as needed -- perhaps using a better
API than the current one.

The exchange above suggests that for short blobs, it would be better to
send them inline, like varchars. So, perhaps the future is to send all
short values of "long" columns (say up to 4K -- whatever number
optimizes network performance) inline and larger values using ids +
"segments". Where requested, the client lib could transparently
reassemble the segments to give the appearance of an inline message, or
vice versa. So, for example, apps using the current API get the current
(potentially simulated) behaviour and apps using a new API can specify
how they want to read/write long columns.
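
To make the idea a bit more concrete, here is a minimal Python sketch of
the inline-versus-id split and of the client lib hiding the difference.
It is only an illustration, not the actual wire protocol or API: the
names (Server, Inline, ById, encode_column, get_segment, read_column)
and both the 4096-byte cut-off and segment size are assumptions made up
for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Union

INLINE_LIMIT = 4096  # hypothetical cut-off ("say up to 4K" in the text)
SEGMENT_SIZE = 4096  # hypothetical segment size, chosen for the example


@dataclass
class Inline:
    """Row field that carries the value itself, like a varchar."""
    data: bytes


@dataclass
class ById:
    """Row field that carries only an id; content is fetched later."""
    blob_id: int
    length: int


class Server:
    """Toy stand-in for the server side; not a real database API."""

    def __init__(self) -> None:
        self._store: Dict[int, bytes] = {}
        self._next_id = 1
        self.segment_calls = 0  # counts simulated extra round trips

    def put(self, data: bytes) -> int:
        blob_id = self._next_id
        self._next_id += 1
        self._store[blob_id] = data
        return blob_id

    def encode_column(self, blob_id: int) -> Union[Inline, ById]:
        # Short values travel with the row; long ones go by id.
        data = self._store[blob_id]
        if len(data) <= INLINE_LIMIT:
            return Inline(data)
        return ById(blob_id, len(data))

    def get_segment(self, blob_id: int, offset: int) -> bytes:
        # One call here stands for one extra network round trip.
        self.segment_calls += 1
        return self._store[blob_id][offset:offset + SEGMENT_SIZE]


def read_column(server: Server, field: Union[Inline, ById]) -> bytes:
    """Client-lib helper: always hands the app the whole value."""
    if isinstance(field, Inline):
        return field.data
    parts: List[bytes] = []
    offset = 0
    while offset < field.length:
        segment = server.get_segment(field.blob_id, offset)
        parts.append(segment)
        offset += len(segment)
    return b"".join(parts)
```

In this sketch a 100-byte value comes back with the row and costs no
extra calls, while a 10000-byte value is fetched in segments behind the
same read_column() call. An app that wants the id and the segment calls
themselves could still be handed the ById field directly.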

Note that I need blob ids for Oracle-mode, so my vote is to keep
them available in some way.

I am not proposing that we change the wire-protocol just for this,
only that we keep it in mind whenever it is time for an upgrade.