Subject Architecture -- Why you should care
Author Jim Starkey
Two virtual aspects of computing -- standards and architecture -- allow
big pieces of software to be made out of little pieces of software
without getting the various authors together to sniff under each other's
tails. Standards are a static statement of how things should be. An
architecture is a statement of how things fit together. A good
architecture is also a standard, but very few standards qualify as
architectures. Let me show a non-database example where they can clash.

The Unix lseek function was originally defined as taking a 32 bit signed
displacement. This was later changed to a symbolic type "off_t", but
remained 32 bits. This limited seeks to files no larger than 2
gigabytes. When disk sizes blew past this, the limitation became
intolerable, and operating systems implemented 64 bit seeks. The
question became how the C runtime library would handle the problem.
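
To make the 2 gigabyte ceiling concrete, here is a minimal sketch of the
arithmetic. The function names are invented for illustration and are not
part of any runtime library.

```c
#include <stdbool.h>
#include <stdint.h>

/* Largest byte offset a signed 32 bit displacement can express:
   2^31 - 1 = 2147483647, one byte short of 2 gigabytes. */
static int64_t max_offset_32(void)
{
    return (int64_t)INT32_MAX;
}

/* True when a file position can be reached with a single 32 bit seek
   from the start of the file. */
static bool reachable_with_32bit_seek(int64_t position)
{
    return position >= 0 && position <= max_offset_32();
}
```

Any position at or beyond 2147483648 is simply not expressible in the
original call, no matter how large the underlying disk is.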

Microsoft (and presumably others) added a second call (_lseeki64 in the
Microsoft runtime, lseek64 elsewhere) that took a 64 bit displacement
and left the original lseek with a 32 bit
displacement. The reasoning is solid. Code written with the original
lseek can't take advantage of the new version without revision, and if
the code is going to be revised anyway, might as well add a new call so
existing code would continue to work unchanged. Principled, simple,
backwards compatible, and trivial to code to.
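
The shape of that approach can be sketched with a toy in-memory "file".
Everything here (toy_file, toy_seek, toy_seek64) is hypothetical and
stands in for the real runtime calls; the point is only that the old
signature is left untouched while a new entry point carries the wider
displacement.

```c
#include <stdint.h>

/* Hypothetical stand-in for an open file: just a current position. */
typedef struct {
    int64_t position;
} toy_file;

/* New entry point: absolute seek with a 64 bit displacement.
   Returns the new position, or -1 (position unchanged) on error. */
static int64_t toy_seek64(toy_file *f, int64_t offset)
{
    if (offset < 0)
        return -1;
    f->position = offset;
    return f->position;
}

/* Original entry point, signature unchanged. Existing callers keep
   compiling and keep working; it is now a thin wrapper. The result
   always fits in 32 bits because the argument did. */
static int32_t toy_seek(toy_file *f, int32_t offset)
{
    return (int32_t)toy_seek64(f, offset);
}
```

Old code never sees a 64 bit quantity, and new code opts in explicitly by
calling the new function.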

The Unix community believed that the official lseek call said nothing
about the word size of the displacement (which they considered an
implementation detail) and refused to sanction a separate 64 bit call.
Simply introducing the new size would have broken virtually all existing
code, so they made it semi-magic. If you defined a magic macro and the
system supported 64 bit seeks, "off_t" was 64 bits and the call was
effectively a 64 bit seek. If the macro wasn't defined or the system
didn't support 64 bit seeks, "off_t" was 32 bits. This meant that any
code requiring 64 bit seeks on platforms that supported them had to
include runtime code to detect and handle the two cases.
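
A rough sketch of what that detection looks like follows. It assumes a
glibc-style large file interface, where the macro is _FILE_OFFSET_BITS
and must be defined before any system header is included; other systems
may spell this differently. The helper names are invented for
illustration.

```c
/* Assumption: glibc-style large file support. Defining this before the
   first system header asks for a 64 bit off_t where one is available. */
#ifndef _FILE_OFFSET_BITS
#define _FILE_OFFSET_BITS 64
#endif

#include <limits.h>
#include <stdint.h>
#include <sys/types.h>

/* Width of off_t as this translation unit actually got it. */
static int off_t_bits(void)
{
    return (int)(sizeof(off_t) * CHAR_BIT);
}

/* The same source has to cope with both outcomes: a 64 bit off_t, or a
   32 bit one that cannot reach past 2^31 - 1. */
static int can_seek_to(int64_t position)
{
    if (position < 0)
        return 0;
    if (off_t_bits() >= 64)
        return 1;
    return position <= INT32_MAX;
}
```

The caller cannot tell from the name "lseek" alone which of the two
behaviors it will get; it has to ask.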

The C call "lseek" is an example of a standard that is next to useless
as an architecture. An architecture would say "this is what lseek does"
and live with it. If the requirements change, an architecture promises
to leave lseek alone and define something else to cope with the new
requirements.

The question at hand in Firebird is handles. To quote from the "OSRI
Reference Manual" published 28 March 1991 by the Interbase Software
Corporation, page 9:

"The call interface is: ... Handle-based. OSRI objects that use
handles are database attachments, transaction, compiled requests,
and active blobs. These objects are formally created and destroyed,
and are represented by handles during the course of their active
state. No object is global, and all state information resides in
these objects.

Note [with double boxed border] A handle is an uninterpreted 32-bit
quantity. Programmers should not make any assumptions about what it is."

Handles were 32 bit uninterpreted quantities on 32 bit systems and 32
bit uninterpreted quantities on 16 bit systems.

The Interbase/Firebird include files defined handles as pointers, which
is completely consistent with the architecture on 32 bit systems. They
could just as well have been defined as longs or unsigned longs.

User code was explicitly allowed to move handles, copy handles, and pass
handles around as function arguments. By architecture, they could be
passed by any 32 bit datatype, including datatypes explicitly
constrained to 32 bits such as the Firebird internal SLONG type.
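
As an illustration of the kind of user code this permits, here is a
minimal sketch. The types toy_handle32 and toy_slong are stand-ins
invented for this example (the second plays the role of a type
constrained to 32 bits); neither is taken from the real include files.

```c
#include <stdint.h>
#include <string.h>

typedef uint32_t toy_handle32; /* a handle: 32 uninterpreted bits */
typedef int32_t  toy_slong;    /* stand-in for a 32 bit signed type */

/* Park a handle in a 32 bit signed integer without interpreting it. */
static toy_slong stash_handle(toy_handle32 handle)
{
    toy_slong slot;
    memcpy(&slot, &handle, sizeof slot);
    return slot;
}

/* Recover the handle, bit for bit, from where it was parked. */
static toy_handle32 unstash_handle(toy_slong slot)
{
    toy_handle32 handle;
    memcpy(&handle, &slot, sizeof handle);
    return handle;
}
```

Under a 32 bit handle contract this round trip is lossless for every
possible handle value, which is exactly what "uninterpreted 32-bit
quantity" promises.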

So the question we're facing is: What takes precedence? Our stated
architecture, or an implementation artifact of 32 bit include files
extrapolated to 64 bit systems compiling in 64 bit mode?

There are two ways to resolve this. One is to analyze the user impact of
changing the handle from 32 bits to 64 bits when compiling for 64 bits.
Most user code will need a conversion anyway, and if users were careful
to use the optional handle types rather than the official architecture,
their code should work pretty much unchanged.

The other is to recognize that this is an architecture. It is a
contract we made with our
users. If we need to change something covered by the architecture, we
will extend the architecture, not break it.
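
For what it is worth, keeping a 32 bit handle on a 64 bit build is not
hard. One possible technique, sketched below, is an indirection table
that maps small integer handles to full-width pointers. This is an
illustration under my own assumptions (fixed table size, invented toy_
names, no locking), not a description of what Firebird or Vulcan
actually does.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_OBJECTS 16

/* Slot i holds the object for handle i + 1; NULL means free. */
static void *toy_table[TOY_MAX_OBJECTS];

/* Register an object; returns a nonzero 32 bit handle, or 0 when the
   object is NULL or the table is full. */
static uint32_t toy_make_handle(void *object)
{
    if (object == NULL)
        return 0;
    for (uint32_t i = 0; i < TOY_MAX_OBJECTS; ++i) {
        if (toy_table[i] == NULL) {
            toy_table[i] = object;
            return i + 1;
        }
    }
    return 0;
}

/* Translate a handle back to its object; NULL if it is not valid. */
static void *toy_resolve_handle(uint32_t handle)
{
    if (handle == 0 || handle > TOY_MAX_OBJECTS)
        return NULL;
    return toy_table[handle - 1];
}

/* Destroy the handle so the slot can be reused. */
static void toy_release_handle(uint32_t handle)
{
    if (handle != 0 && handle <= TOY_MAX_OBJECTS)
        toy_table[handle - 1] = NULL;
}
```

The pointer can be as wide as the platform likes; the quantity handed to
the user stays at 32 bits.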

This issue has been addressed dozens of times across the years. You
will find isc_compile_request and isc_compile_request2, etc., all over
the place. Almost all structures start with a version number to protect
existing code. When incompatible blr changes have been made, the
version number has been bumped.
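
The version number trick can be sketched in a few lines. The block
layout below is made up for illustration and has nothing to do with the
real blr or parameter block formats: the first byte is a version,
version 1 carries a one byte length, and version 2 widens it to two
bytes without disturbing version 1 callers.

```c
#include <stddef.h>
#include <stdint.h>

enum { TOY_VERSION_1 = 1, TOY_VERSION_2 = 2 };

/* Returns the payload length encoded in the block, or -1 when the
   block is too short or its version is unknown. */
static long toy_payload_length(const uint8_t *block, size_t size)
{
    if (size < 2)
        return -1;
    switch (block[0]) {
    case TOY_VERSION_1:
        /* Original layout: one length byte. Never changes. */
        return block[1];
    case TOY_VERSION_2:
        /* Extended layout: two length bytes, little-endian. */
        if (size < 3)
            return -1;
        return (long)block[1] | ((long)block[2] << 8);
    default:
        return -1;
    }
}
```

Blocks built years ago with version 1 keep parsing exactly as they did;
new capabilities arrive under a new version number.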

There may, in fact, be no existing user code that passes handles around
as explicit 32 bit quantities. Trying to determine whether there is,
and if so what the impact of a proposed incompatible change would be, is
the wrong answer. The right answer is "that's the way we defined it;
that's the way it will work."

Vulcan is not yet in compliance with the architecture, but will be when
it ships. If a program written against the OSRI specification and the
Firebird Vulcan API is linked and run with the proposed Firebird 1.5.1
API, it will crash and burn. If a program written against the OSRI
specification preserves only 32 bits of a 64 bit handle, it will work on
some platforms and implementations and fail on other platforms and
implementations.
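
Why "some platforms" and not others? A minimal sketch of the failure
mode, with invented function names: a pointer-sized handle survives
32 bit storage only when its upper 32 bits happen to be zero, and that
depends on where the platform and allocator place the object.

```c
#include <stdint.h>

/* What code written to the 32 bit contract does with a wide handle:
   it keeps the low 32 bits and nothing else. */
static uint32_t keep_32_bits(uint64_t wide_handle)
{
    return (uint32_t)wide_handle;
}

/* True when the wide handle comes back intact after being stored in
   32 bits, i.e. when its upper half was zero to begin with. */
static int survives_32bit_storage(uint64_t wide_handle)
{
    return (uint64_t)keep_32_bits(wide_handle) == wide_handle;
}
```

A handle value such as 0x7fff1234 makes the round trip; one such as
0x00007f0012345678 comes back as 0x12345678 and points at nothing
useful.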

Borland had a very low regard for architecture, but respected it
nevertheless. I expect more from the Firebird project.

I urge the Firebird admins and the Firebird 1.5.1 release managers not
to ship code that violates our own published architecture.

--

Jim Starkey
Netfrastructure, Inc.
978 526-1376