Subject User defined storage: BYTES
Author Geoff Worboys
Hi All,

Firstly, apologies if this has been discussed before. I did
a quick search of the tracker but couldn't find anything that
was obviously the same. The idea seems familiar, so don't
know if I've seen it here or simply had the idea before.


At various times over the years I've thought it would be good
to be able to easily introduce my own data-type into Firebird.
A couple of examples would be:
. a timestamp that includes time-zone information
. a decimal floating point type
. a variable precision binary floating point type (eg MAPM)
. a UUID

UUID now exists, sort of, and apparently decimal floating
point is planned, but the point is that it would have been
convenient to have been able to define storage and processing
for these more easily, more directly, without waiting for
them to appear in the engine.

One way that has always existed is what you have chosen to do
with UUID: have the user define CHAR(16) CHARACTER SET OCTETS
as a field/domain. This works but is not ideal, for example
in a UDF or system-function the only type validation you can
do is check the length and hope (and lets face it, lengths
like 16 (128bits), and 32 (256bits) are likely to be pretty
common in user defined types).

Another way, that I have used, is to use VARCHAR(xx) ASCII,
store the data in text form and define custom collations. This
is useful for sorting but not very compact for storage, nor
very helpful with UDFs.

Enough waffle, hopefully you understand the basic requirement.

My thought was that Firebird could introduce something like a
a BLOB but for small data types. Call it BYTES. Like the blob
data type it would support a sub-type, some could be system
defined (eg: 1 = UUID fields) and the others could be for user
types.

CREATE DOMAIN BINARY_D
BYTES(20); -- default sub_type of 0

CREATE DOMAIN UUID_D
BYTES(16) SUB_TYPE 1;

CREATE DOMAIN MY_TIMESTAMP_D
BYTES(10) SUB_TYPE -1;

In most instances this new data-type would behave exactly the
same as: VARCHAR(nn) CHARACTER SET OCTETS
Indeed the default subtype of 0 would be essentially identical.

The main real difference would be the fact of the sub-type
available to UDFs and the API for identification & validation,
just as blob sub-types allow validation etc of their content.


It might have been nice to leave it as CHAR and so be able to
use collation features etc to manage sorting etc, but that
would only be feasible if the XSQLVAR stuff were being extended
so that sub-type and character-set were being carried in
separate fields. If that happened then SUB_TYPE could become
an extension of all character types... but perhaps I go too
far. Without such collation support the data-type would have
to sort and index by simple byte order, but for most of the
requirements I've had for this sort of thing I could live with
that.


Does this idea appeal to anyone else?

--
Geoff Worboys
Telesis Computing