Subject: RE: [firebird-support] RE: At my wits end
Author: Epstein, Ed
Post date: 2004-03-13T05:57:01Z
As for assumptions 1, 2, and 3: the data may be coming from more than one
client at a time. The data is also permanent; it never gets removed.
The data is reviewed in summary by the clients in different reports/views.
It is also heavily manipulated, with near-endless modifications to its
status. Only at some undetermined point is it moved off into another table
where it is archived. Even in the archived state it is still used in
reports and views, just never manipulated or exported. On frequent
occasion (4-5 times a week) large sets of data will be exported based upon
selection criteria. This is not a problem at all. I have used what has
been termed a "firehose" connection. It's very fast. I have offloaded 2
million records to disk from a client at around 5000 records per second.
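[Editor's note: a "firehose" export of this kind is essentially one forward-only
SELECT streamed straight to disk. A minimal sketch of the idea, assuming the
Python fdb Firebird driver; the DSN, credentials, LEADS table and its columns
are all invented here for illustration, not the poster's actual schema:]

```python
import csv
import fdb  # Python Firebird driver; any DB-API 2.0 driver looks much the same

# Connection details are placeholders, not the poster's actual setup.
con = fdb.connect(dsn='server:/data/leads.fdb', user='SYSDBA', password='masterkey')
cur = con.cursor()

# One forward-only SELECT, fetched in blocks and streamed to disk --
# no per-row round trips, which is what makes this style of export fast.
cur.execute("SELECT LEAD_ID, NAME, STATUS FROM LEADS WHERE STATUS = ?", ('EXPORT',))

with open('export.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['LEAD_ID', 'NAME', 'STATUS'])  # header row
    while True:
        rows = cur.fetchmany(5000)                  # fetch in blocks of 5000
        if not rows:
            break
        writer.writerows(rows)

con.close()
```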
As for the model, I fully agree with you. However, I cannot
make this choice. Firebird/Interbase was chosen because the production
database for the other data services my company provides runs on either
Firebird or Interbase; I don't actually know which. What I do know is that the
CIO told me that I MUST maintain this compatibility for them so that their
frontends can access my database as well. This started out in Access while
I was creating the database and frontend initially. I have 7 years'
experience programming that. It's not too bad and works for small projects.
It even handled 15 million records quite nicely at first. Of course, within
90 days the data requirements broke it :)
As for using 3rd-party programs to get the data in there initially, that would
work and would have the necessary speed, I am sure. It's just that the
ultimate goal of my project is to eliminate the need for 3rd-party programs
and make this easy enough that a trained person (even an executive) could
run the program without an IT support staff constantly running
3rd-party programs in the background while they wait. I would then be
assigned another project afterwards.
So the requirements of my job dictate that:
1) It's Interbase/Firebird compatible.
2) The frontend does absolutely everything.
3) It's easy enough to use that somebody who does not even know what CSV
stands for could use it.
4) It's fast enough to get the jobs done in the required amount of time:
same-day turnaround on processed files, which is currently only possible
because I am doing it manually.
I am in a pickle, I think.
Edwin Epstein
Continuity Partners Inc.
2753 S. Highland Dr.
Suite 1010
Las Vegas, Nevada 89109
(702) 851-7850 Ext. 247
eepstein@...

CONFIDENTIALITY NOTICE: The information contained in this message may be
privileged and confidential and protected from disclosure.
If the reader of this message is not the intended recipient, or an employee
or agent responsible for delivering this message to the intended recipient,
you are hereby notified that any dissemination, distribution or copying of
this communication is strictly prohibited. If you have received this
communication in error, please notify us immediately by replying to the
message and deleting it from your computer.
-----Original Message-----
From: Alan McDonald [mailto:alan@...]
Sent: Friday, March 12, 2004 6:52 PM
To: firebird-support@yahoogroups.com
Subject: RE: [firebird-support] RE: At my wits end
> As for my use of tps, I meant transactions per second. I guess I should
> really be saying Inserts per second. I personally have tried commits from
> around 10-10000. The C++ test code was committing every 1000 records.

OK Ed, I still have to make some assumptions:
1. The data is inserted from one client only and not any of the clients
(e.g. an administrator's desktop)
2. The data has a relatively short life-cycle (it will possibly be deleted
after only a short while and new data will take its place).
3. The data being inserted undergoes relatively minor modification
during/after insertion, and the life of this data is spent being dispensed to
other clients to report/view against specified criteria rather than being
endlessly edited and transformed.
Correct me if I'm wrong.
If indeed some of these assumptions are correct, then I would have to
wonder why you have chosen a fully transaction-based SQL engine. I don't mean
to belittle your choice, but if there is little in the way of permanent data
manipulation and/or permanency, I would probably have chosen MySQL classic
ISAM tables to do this job: no transaction overhead, rapid insertion
possibilities and plenty of VB support.
If, on the other hand, you do need long term storage with continuing data
manipulation, then transactions may be necessary and you may well need
plenty of disk space after 12 months of insertions.
Dan has already given you some stats which hold true. Might I also suggest
the CleverComponents datapump, which uses OLE DB as one type of source,
and I would think that can include a CSV filetype provider. It is
configurable, the config is saved, and you can manipulate the number of
insertions per commit; it's always been fast for me, but I haven't timed it.
It is third party, but you are already using third-party stuff, I see; and
if it does not run at the desired speed for you, then you will have to
re-think your model.
Alan
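[Editor's note: the "insertions per commit" knob mentioned above is the same
idea whatever tool is used: batch the inserts and commit every N rows instead
of once per row or once for the whole file. A rough sketch of that pattern,
again in Python with the assumed fdb driver; the CSV layout, table name and
columns are invented for illustration:]

```python
import csv
import fdb  # assumed Firebird driver; the pattern itself is generic DB-API 2.0

COMMIT_EVERY = 1000  # rows per transaction -- the tunable that matters for bulk loads

con = fdb.connect(dsn='server:/data/leads.fdb', user='SYSDBA', password='masterkey')
cur = con.cursor()
insert_sql = "INSERT INTO LEADS (LEAD_ID, NAME, STATUS) VALUES (?, ?, ?)"

with open('input.csv', newline='') as f:
    batch = []
    for row in csv.reader(f):
        batch.append((int(row[0]), row[1], row[2]))
        if len(batch) >= COMMIT_EVERY:
            cur.executemany(insert_sql, batch)  # one prepared statement, many rows
            con.commit()                        # keep transactions short and bounded
            batch = []
    if batch:                                   # flush the final partial batch
        cur.executemany(insert_sql, batch)
        con.commit()

con.close()
```

[Committing every thousand or so rows is usually a reasonable middle ground:
per-row commits are dominated by commit overhead, while one enormous
transaction makes the load harder for the server to clean up after.]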