| Subject | Feature Request |
|---|---|
| Author | Christian Stengel |
| Post date | 2005-04-30T18:07:36Z |
Hi architects,
I am not sure whether any of this is planned for Vulcan, but I have
5 possible improvements for Firebird on my personal wishlist :-)
(and would like to ask what you think about them):
1. fast import utility
2. splitting tables internally into multiple tables (Oracle has
something like that)
3. metric calculations
4. compressed data type
5. index file for external tables
I must admit that I have not looked at Vulcan (as I'm not able to
compile it on my PPC), but I have been using Firebird since its beta state.
1. fast import utility
Currently I have a project where I get 35,000,000 records per day from
my biggest source. In Firebird 1.5 this takes 75 minutes to import with
the native C API (24 minutes for processing the data and 61 minutes for
the import, on a PowerBook; the server will be faster :-)). (It's not
yet clear which database system we will take; it depends on the
performance :-).)
This is actually very good (as it is done via insert statements,
Classic server, local connection), but other database systems (like
MS SQL Server or MySQL) have import utilities that are faster (though
Firebird has other pros the others don't have).
So it would be great to have an import utility that can import large
sets of data faster (maybe this could be done by blocking access to
the database and simply writing the data to disk, I don't know).
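For what it's worth, one technique that exists today is Firebird's external tables: the engine maps a fixed-length record file as a table, and a single INSERT ... SELECT pumps it into a real table without per-row client traffic. A minimal sketch; the file name, field widths, and target table are made up for illustration, and ExternalFileAccess must be allowed in firebird.conf:

```sql
-- Hypothetical layout: two fixed-width text fields per record,
-- plus the record terminator written by the exporting program.
CREATE TABLE raw_import EXTERNAL FILE '/data/feed_20050430.dat' (
    cust_id  CHAR(10),
    amount   CHAR(12),
    lf       CHAR(1)   -- absorbs the newline (CHAR(2) for CR/LF files)
);

-- Pump everything into the real (indexed) table in one statement.
INSERT INTO measurements (cust_id, amount)
    SELECT CAST(cust_id AS INTEGER),
           CAST(amount  AS NUMERIC(12,2))
    FROM raw_import;

DROP TABLE raw_import;   -- detaches the external file again
```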
2. Splitting tables internally into multiple tables
This has already been requested on this list but was rejected, as there
was no example where it would be useful.
For me, it would be great if you could define, in the CREATE TABLE
statement, something like: if the value starts with A, write the row to
TABLE_A; if it starts with B, to TABLE_B, and so on. This should also
be possible with dates, numeric values, etc., and you should be able to
access the table as a whole or as the individual chunks.
In my case I get a file with 35,000,000 records per day, where 90%
are from today, 8% from yesterday, and the rest from last week. As I
have heard (at the Fulda conference) that Firebird can only handle a few
hundred million records per table, I have to split this data into a
different table per day.
This has the advantage that I can disable the index before importing
the data, pump it in, and reactivate the index afterwards.
It also has the advantage that I can delete records on a daily basis
very quickly by simply dropping a table and recreating it (I would run
out of space after a few months :-).
The disadvantage of this approach is that my boss might come
to me once or twice a week and say: give me all records where criterion
xyz holds. So I have to look at all my tables to answer his request.
An automatic approach would be really great :-)
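Until something automatic exists, the per-day tables can at least be stitched back together with a UNION ALL view, so cross-day queries don't mean visiting each table by hand. A minimal sketch, with made-up table and index names (Firebird lets you deactivate a plain index during a bulk load and rebuild it once with ALTER INDEX):

```sql
-- One physical table per day, all with the same structure.
-- Bulk-load pattern: switch the index off, pump, switch it on.
ALTER INDEX idx_rec_20050430 INACTIVE;
/* ... run the daily import into records_20050430 here ... */
ALTER INDEX idx_rec_20050430 ACTIVE;   -- rebuilds the index once

-- A view presenting all chunks as a single logical table, so
-- ad-hoc "give me all records where ..." queries stay simple.
CREATE VIEW records_all AS
    SELECT * FROM records_20050428
    UNION ALL
    SELECT * FROM records_20050429
    UNION ALL
    SELECT * FROM records_20050430;

-- Dropping a whole day is then just:
-- DROP TABLE records_20050428;  (and recreating the view)
```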
3. metric calculations
Ann has sometimes asked whether there is a need for spatial extensions
(like in GIS applications). It would also be great if Firebird could
calculate with metric or physical data (e.g. a field is defined as
square meters and I can cast it to square miles, square kilometers, or
something else; likewise kilograms, pounds, etc., and calculate with it
properly).
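Nothing unit-aware exists in the engine, but a fixed conversion can already be hung off a table as a computed column. The table and figures below are only an illustration (1 square mile = 2,589,988.110336 m²; 1 km² = 1,000,000 m²):

```sql
CREATE TABLE parcels (
    parcel_id  INTEGER NOT NULL PRIMARY KEY,
    area_sqm   DOUBLE PRECISION,                        -- stored in m2
    area_sqkm  COMPUTED BY (area_sqm / 1000000.0),      -- derived km2
    area_sqmi  COMPUTED BY (area_sqm / 2589988.110336)  -- derived mi2
);

INSERT INTO parcels (parcel_id, area_sqm) VALUES (1, 5178000);
SELECT area_sqkm, area_sqmi FROM parcels;  -- 5.178 km2, ~2.0 mi2
```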
4. compressed data field
If I know that I am not going to search on a field, it would be
great if the value of this field could be stored compressed (e.g. zip
or bzip2 compressed) to save space :-).
5. Index file for external tables
There are external tables in Firebird; how about an additional file
with index information? The index would get updated when data is
inserted by the engine, or recreated with ALTER INDEX. So everyone
knows: when the data is changed by an external program, the index has
to be recreated :-)
That's a lot of stuff; it'll be too late for my current project, but
maybe sometime in the future ... :-)
By the way: the .fdb size is 3.8 GB with 35,000,000 records; MS SQL
Server needs 8.5 GB for the same data :-)
Thanks,
Chris