Subject | RFC: Improvements to External Tables |
---|---|
Author | Helen Borrie |
Post date | 2008-10-12T08:16:38Z |
The External Tables feature is a useful adjunct. However, it is rife with sins that inhibit its usefulness and make us look bad.
1. The one that bothers most of us is its restriction to fixed format records. Love it or hate it, .CSV and other delimited variants have become a de facto standard over the past two decades; and XML is fast overtaking that. FF, on the other hand, makes Firebird look antediluvian for what's a standard, even "must-have" feature of even the worst DBMS's on offer.
2. Its worst fault is its total lack of portability, even between servers with identical OS/FS platforms. That's because the table definition is totally restricted to the exact filesystem location that existed on the machine where the table was defined, e.g., h:\databases\datafiles on one Windows machine in the house. If I want to port this database, even to another Windows machine, the database is broken. It is *so* broken that
a) I can't drop or alter a SP that refers to it (and typically we write SPs to process external data!)
b) I can't edit or access or drop the table because the new server can't find the file in the declaration
c) Similarly, I can't hack RDB$RELATIONS because the file is inaccessible.
Catch-22!
Indeed, the only hack apparently is to open the database file in a text editor and do a search-and-replace. Yeah. It works. On a 3 MB database, that is...
3. Now, you might argue that all it takes is to put the data file tree beneath the Firebird root, e.g. in ../datafiles (similar to UDFAccess) and set up ExternalFileAccess RESTRICT to the relative tree address. Then you could just plonk all your external files there and define external tables to point to this platform-independent location. Doesn't work. For example, pathing the <filespec> argument as 'datafiles\specialdir\special.data' results (on Windows) in a violation of your RESTRICT list as Firebird attempts to open a file in c:\windows\system32\datafiles\specialdir.
If ExternalFileAccess were made to behave more like UDFAccess, where the engine recognises a relative path as "one it knows about" one part of this problem could be addressed. But it's still not a nice solution, since we want external applications to have access to the directory. We don't want them pushing data into our *software* installation *at all*, do we? In fact, from what I've heard of Vista and the newer server platforms, apps can't write data into c:\program files.
So - is it possible to do some or all of the following:
1. Add a new config parameter for ExternalFileRoot to take care of the platform dependence problem. There can be no default, obviously, so enforce ExternalFileAccess=NONE, regardless of current settings, if ExternalFileRoot is empty or invalid.
2. Allow (encourage? enforce?) a UDFAccess-style relative path for the <filespec> argument of CREATE TABLE...EXTERNAL FILE
3. Make Firebird interpret slash and backslash in the subdirectory parts of the strings appropriately for the platform it's running on.
4. And - of course - wake up and fix the code for handling delimited text record formats and get REAL LOVE by enabling an external table to suck in the contents of text or binary files (as XML processors can do) for treatment as clobs and blobs.
Helen
1. The one that bothers most of us is its restriction to fixed format records. Love it or hate it, .CSV and other delimited variants have become a de facto standard over the past two decades; and XML is fast overtaking that. FF, on the other hand, makes Firebird look antediluvian for what's a standard, even "must-have" feature of even the worst DBMS's on offer.
2. Its worst fault is its total lack of portability, even between servers with identical OS/FS platforms. That's because the table definition is totally restricted to the exact filesystem location that existed on the machine where the table was defined, e.g., h:\databases\datafiles on one Windows machine in the house. If I want to port this database, even to another Windows machine, the database is broken. It is *so* broken that
a) I can't drop or alter a SP that refers to it (and typically we write SPs to process external data!)
b) I can't edit or access or drop the table because the new server can't find the file in the declaration
c) Similarly, I can't hack RDB$RELATIONS because the file is inaccessible.
Catch-22!
Indeed, the only hack apparently is to open the database file in a text editor and do a search-and-replace. Yeah. It works. On a 3 MB database, that is...
3. Now, you might argue that all it takes is to put the data file tree beneath the Firebird root, e.g. in ../datafiles (similar to UDFAccess) and set up ExternalFileAccess RESTRICT to the relative tree address. Then you could just plonk all your external files there and define external tables to point to this platform-independent location. Doesn't work. For example, pathing the <filespec> argument as 'datafiles\specialdir\special.data' results (on Windows) in a violation of your RESTRICT list as Firebird attempts to open a file in c:\windows\system32\datafiles\specialdir.
If ExternalFileAccess were made to behave more like UDFAccess, where the engine recognises a relative path as "one it knows about" one part of this problem could be addressed. But it's still not a nice solution, since we want external applications to have access to the directory. We don't want them pushing data into our *software* installation *at all*, do we? In fact, from what I've heard of Vista and the newer server platforms, apps can't write data into c:\program files.
So - is it possible to do some or all of the following:
1. Add a new config parameter for ExternalFileRoot to take care of the platform dependence problem. There can be no default, obviously, so enforce ExternalFileAccess=NONE, regardless of current settings, if ExternalFileRoot is empty or invalid.
2. Allow (encourage? enforce?) a UDFAccess-style relative path for the <filespec> argument of CREATE TABLE...EXTERNAL FILE
3. Make Firebird interpret slash and backslash in the subdirectory parts of the strings appropriately for the platform it's running on.
4. And - of course - wake up and fix the code for handling delimited text record formats and get REAL LOVE by enabling an external table to suck in the contents of text or binary files (as XML processors can do) for treatment as clobs and blobs.
Helen