Subject | Transportable databases (1 of 3) |
---|---|
Author | Ann W. Harrison |
Post date | 2006-11-16T20:15:38Z |

A sponsored development project worked out fairly well,
and I would like to check the resulting changes into the
Firebird head tree and the Vulcan branch. The changes
depend on ODS-11, so backporting to Firebird 1.5.x
is not feasible.
The project was to create a database that could be accessed
(read, write, and metadata update) on a PowerPC and an Intel
system, specifically Mac OS and Windows.
My goals were:
1) no impact on the performance of native databases, i.e. databases that were created on the current platform
2) minimal code changes
3) no ODS changes at all
I present a description of the work for your questions and
discussion. This message describes the goals and high level
design. The next level is structural page level changes, which
will be described in a second message. The third level is user
data and system table changes, which will be described in a
third message.
Essentially, the differences between the on-disk representations
of Firebird databases on different platforms are in two areas:
alignment and endianness.
For ODS-11, alignment is the same on PowerPC and Intel, and, I
think, all other currently supported platforms. So the only
differences are in the order of bytes in multi-byte binary and
floating point numbers. The "mixed-endian" code handles those
differences.
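
To make the byte-order difference concrete, here is a small illustration in Python (not Firebird code; the `reverse_bytes` helper is a name made up for this example). It shows the same 16-bit value and the same double laid out in both byte orders.

```python
import struct

value = 12345  # 0x3039

little = struct.pack("<H", value)  # low byte first:  0x39 0x30
big = struct.pack(">H", value)     # high byte first: 0x30 0x39

# Reading big-endian bytes as if they were little-endian yields 0x3930.
(misread,) = struct.unpack("<H", big)


def reverse_bytes(raw: bytes) -> bytes:
    """Hypothetical helper: flip the byte order of one scalar value."""
    return raw[::-1]


# The same rule applies to floating point: only the byte order differs.
double_le = struct.pack("<d", 1.5)
double_be = struct.pack(">d", 1.5)
```

Reversing the bytes of any single scalar converts it between the two layouts, which is why no ODS change is needed when alignment already matches.
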
When Firebird opens a database, it reads the database header page,
starting with the page header. If the database was created on a
big-endian system and read on a little-endian system, reading the
page header will produce a "bad checksum" error. The error is a
hold-over from the ancient days when pages were actually checksummed;
now it just means that the value in the checksum element of the
page header is not "12345".
When a "mixed endian" Firebird encounters that error, it converts
the checksum to the other endian format. If that produces 12345,
the mixed-endian Firebird converts the whole header page and checks
that the implementation belongs to an architecture with compatible
alignment. If it does, Firebird sets a flag in the dbb saying that
this is an "other endian" database and continues to open it.
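
The detection step can be sketched roughly as follows. This is an illustration in Python under stated assumptions, not the engine code: `detect_byte_order` is a hypothetical name, the checksum is assumed to be an unsigned 16-bit field, and the alignment check on the implementation is left out.

```python
import struct

EXPECTED_CHECKSUM = 12345  # constant stored in the page header's checksum slot


def detect_byte_order(raw_checksum: bytes) -> str:
    """Classify a database from the two raw checksum bytes (hypothetical helper).

    Returns "native" or "other"; raises ValueError if neither byte order
    yields the expected constant.
    """
    # Interpret the bytes in the byte order of the machine running this code.
    (native_value,) = struct.unpack("=H", raw_checksum)
    if native_value == EXPECTED_CHECKSUM:
        return "native"

    # Swap the two bytes and try again.
    swapped = ((native_value & 0xFF) << 8) | (native_value >> 8)
    if swapped == EXPECTED_CHECKSUM:
        return "other"

    raise ValueError("bad checksum")
```

Only a database whose swapped checksum is exactly 12345 is treated as "other endian"; anything else remains a genuine bad checksum error.
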
When Firebird reads a page from an "other endian" database, it
converts all the structural information on the page to the "native
endian" format. Before Firebird writes a page from the page
cache to an "other endian" database, it copies the page to a
spare buffer and converts structural information back to "other
endian" format. Those two conversions - immediately after reading
and immediately before writing - handle all the endian problems
except those in user data.
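
A minimal sketch of the read and write paths, in Python for brevity. The page header layout (`BBHI`: type, flags, checksum, generation) is simplified and assumed for illustration, and `swap_header`, `read_page`, and `write_page` are hypothetical names, not functions in the engine.

```python
import struct
import sys

NATIVE = "<" if sys.byteorder == "little" else ">"
OTHER = ">" if NATIVE == "<" else "<"

# Assumed, simplified header: type (1 byte), flags (1 byte),
# checksum (2 bytes), generation (4 bytes).
HEADER_FIELDS = "BBHI"
HEADER_SIZE = struct.calcsize("<" + HEADER_FIELDS)


def swap_header(page: bytes, src: str, dst: str) -> bytes:
    """Return a new buffer with the header re-encoded from src to dst order."""
    fields = struct.unpack_from(src + HEADER_FIELDS, page, 0)
    return struct.pack(dst + HEADER_FIELDS, *fields) + bytes(page[HEADER_SIZE:])


def read_page(disk_bytes: bytes, other_endian: bool) -> bytearray:
    """Convert immediately after reading, so the cache holds native order."""
    if not other_endian:
        return bytearray(disk_bytes)  # native databases take no extra work
    return bytearray(swap_header(disk_bytes, OTHER, NATIVE))


def write_page(cache_page: bytearray, other_endian: bool) -> bytes:
    """Convert a copy immediately before writing; the cached page is untouched."""
    if not other_endian:
        return bytes(cache_page)
    spare = bytes(cache_page)  # spare buffer, so the cache stays native
    return swap_header(spare, NATIVE, OTHER)
```

Converting a copy on write means the cached page can keep being used in native format after the write completes, and the `other_endian` test keeps the native path free of conversion work.
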
User data is converted in the record buffer, after it is expanded
by SQZ_decompress, using the format descriptor named in the record
header. This gets a bit tricky when applying deltas, since the
delta must be applied to the "other endian" record, not the local
format, but those cases are handled. Blob segment lengths and the
page pointers for upper level blobs are also converted. Blob data
is not.
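
The record conversion can be pictured like this. It is a sketch in Python, assuming a made-up format descriptor of (offset, type code) pairs; the real format descriptor and record layout are different, and `convert_record` is a hypothetical name.

```python
import struct
import sys

NATIVE = "<" if sys.byteorder == "little" else ">"
OTHER = ">" if NATIVE == "<" else "<"

# Hypothetical format descriptor: one (offset, struct type code) per field.
FORMAT = [(0, "i"), (4, "h"), (8, "d"), (16, "8s")]


def convert_record(record: bytes, fmt, src: str, dst: str) -> bytes:
    """Re-encode each numeric field of an expanded record from src to dst order."""
    out = bytearray(record)
    for offset, code in fmt:
        if code.endswith("s"):
            continue  # character data has no byte order, so it is left alone
        (value,) = struct.unpack_from(src + code, record, offset)
        struct.pack_into(dst + code, out, offset, value)
    return bytes(out)
```

The conversion is driven entirely by the descriptor, so it must run on the expanded record, where field offsets are meaningful, rather than on the compressed form.
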
The sponsored work did not include arrays, but array data will be
converted. Nor did it include external tables. There is no good
way to recognize the endian affinity of an external file, but we
could add information to the external table declaration.
Regards,
Ann