Subject: Re: [firebird-support] Firebird/Internet performance questions
Author: David Johnson
Post date: 2004-01-15T02:06:27Z
> DBXpress??
Yes.
> Your major big mistake is starting and committing a transaction for every
> row created. In a real application you'd never do that. Access isn't
> comparable for many reasons, one of which is that it doesn't support
> transactions.
So I can only expect 68 transactions per second, or roughly 250,000 transactions per hour (I agree, row count is beside the point)? I will run a recalibrated test to see how far this gets me.
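For the recalibrated test I will batch many inserts per transaction through a single parameterized statement, so the Prepare cost is paid once rather than per row. A minimal sketch of what I have in mind follows - the CONTACT table, column names, key scheme, and batch size are placeholders from my test schema, and the dbExpress calls are written from memory:

// requires SysUtils and SqlExpr in the uses clause
procedure BatchedInsertTest(Connection: TSQLConnection; RowCount: Integer);
const
  BATCH_SIZE = 1000; // commit every 1000 rows instead of every row
var
  TD: TTransactionDesc;
  Query: TSQLQuery;
  i: Integer;
begin
  Query := TSQLQuery.Create(nil);
  try
    Query.SQLConnection := Connection;
    // Parameterized insert: prepared once, executed many times
    Query.SQL.Text :=
      'insert into CONTACT (OID, CREATE_TIME) values (:OID, :CREATE_TIME)';
    TD.TransactionID := 1;
    TD.GlobalID := 0;
    TD.IsolationLevel := xilREADCOMMITTED;
    Connection.StartTransaction(TD);
    try
      for i := 1 to RowCount do
      begin
        Query.ParamByName('OID').AsString := 'ROW_' + IntToStr(i);
        Query.ParamByName('CREATE_TIME').AsDateTime := Now;
        Query.ExecSQL;
        if (i mod BATCH_SIZE) = 0 then
        begin
          Connection.Commit(TD);           // end this batch ...
          Connection.StartTransaction(TD); // ... and start the next
        end;
      end;
      Connection.Commit(TD); // commit any partial final batch
    except
      Connection.Rollback(TD);
      raise;
    end;
  finally
    Query.Free;
  end;
end;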
While Access is far from my preferred platform, I have worked with Access enough to know that it does support transactions and two-phase commit - they are just rarely used by Access programmers. It also has other serious issues that make it undesirable for this project.
>Nope. Don't do whole-table row counts as a performance test. The only
>performance it tests is how long it takes to scan the table. Firebird
>doesn't store row-counts, it has to physically count rows to return a
>count. It would be rather pointless to store row-counts which would never
>be correct (a moving target in a multi-transaction, multi-generation
>environment).
>
>If you need to count rows for some intelligent purpose, select blah,
>count(*) GROUP BY blah has its uses.
>
>Standard SQL provides language to do what your Access databases need
>rowcounts for - see esp. the EXISTS() predicate.
The whole-table count was intended as an existence check - it is how you typically check for the existence of a table in DB2.
I would accept your moving target argument except that my primary environment is DB2 on an IBM 2500 sysplex. DB2 shares the characteristic of "multi-transaction, multi-generation environment", and outperforms its nearest competitor in cheat ... er ... benchmarks by 30%. It depends on the row count as part of its query optimization algorithm. It is understandable that you would assume my primary experience was Access ... I never specified my experience. :o)
I accept that this is part of a design tradeoff, and that there are other ways to achieve my ends. Since part of my requirements is to make this product upgradable from desktop to enterprise with zero application or framework code changes, querying system tables directly to determine table existence is not an acceptable mechanism. However, I do have an acceptable alternative, since I know the structure of my data.
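For instance, a probe query that can never return rows is cheap and portable across the DBMSs I care about. A sketch - the helper name and dbExpress details here are illustrative, not framework code:

function TableExists(Connection: TSQLConnection; const TableName: string): Boolean;
var
  Query: TSQLQuery;
begin
  Query := TSQLQuery.Create(nil);
  try
    Query.SQLConnection := Connection;
    // "where 1 = 0" guarantees zero rows; only the prepare/open can fail
    Query.SQL.Text := 'select 1 from ' + TableName + ' where 1 = 0';
    try
      Query.Open;
      Query.Close;
      Result := True;
    except
      // treat any failure to open as "table absent" for this purpose
      Result := False;
    end;
  finally
    Query.Free;
  end;
end;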
I prefer to minimize the amount of application/framework code that is specific to any DBMS, hence the use of DBXpress.
>An ascending index would quickly find the lowest record and extremely
>slowly find the last one. Add a descending index to find the highest of
>anything.
A B+ tree index with 4,000,000 entries of 30 bytes each, on an 8k page size and using an index-only scan, should require no more than three page I/Os to reach the end of the index. This holds true up to 16,000,000 rows. This suggests either (a) that InterBase is using linear indexing rather than a B+ tree, or (b) that the query optimization surrounding MAX/MIN and indexes is incomplete. I will look at the Firebird source sometime and see what I can discover.
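The arithmetic behind my estimate, assuming fully packed pages:

  floor(8192 / 30) = 273 entries per 8k index page
  273^2 = 74,529 rows reachable with a two-level tree
  273^3 = 20,346,417 rows reachable with a three-level tree

So 4,000,000 rows (and even 16,000,000) should mean a root-to-leaf path of three pages. Meanwhile I will add the descending index you suggest; if I read the Firebird syntax correctly it would be something like:

  create descending index IX_CONTACT_CREATE_DESC on CONTACT (CREATE_TIME);

(Index, table, and column names are placeholders from my test schema.)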
>There's said to be a bit of a gain with precompiled queries. Actually, all
>I've ever heard of embedded's supposed speed advantages is hearsay. I've
>never done it myself, or seen an embedded production application. (The
>command-line tools are embedded apps.)
All of my queries are precompiled and parameterized. The Prepare phase is too expensive for processes at the scale I am considering.
>Nothing batched is going to perform fast if you abide by one transaction
>per operation though. But then, a batched test isn't going to tell you
>anything about performance of random interactive transactions, either,
>which presumably is what you're looking to do (from the Subject title).
You'd be surprised at how much you can test with well-designed batched tests. :o)
The thing to remember is that tests must always be interpreted correctly. Your input has been invaluable in interpreting my test results and has provided fuel for a second pass.
Since the typical insert transaction in my framework will be in the neighborhood of ten statements, I will work with that figure. I will also prepare a test case for random accesses through the database, roughly along these lines:
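(A sketch only - the key scheme assumes the batched insert sketch above, and all names are placeholders.)

procedure RandomReadTest(Connection: TSQLConnection; Probes: Integer);
var
  Query: TSQLQuery;
  i: Integer;
begin
  Randomize;
  Query := TSQLQuery.Create(nil);
  try
    Query.SQLConnection := Connection;
    // Point query on the key, prepared once and reused for every probe
    Query.SQL.Text := 'select CREATE_TIME from CONTACT where OID = :OID';
    for i := 1 to Probes do
    begin
      // probe a pseudo-random key from the range the insert test created
      Query.ParamByName('OID').AsString :=
        'ROW_' + IntToStr(Random(4000000) + 1);
      Query.Open;
      // read the row here and accumulate per-probe timings
      Query.Close;
    end;
  finally
    Query.Free;
  end;
end;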
Any other suggestions you have would be appreciated.
Thanks,
David Johnson
----- Original Message -----
From: Helen Borrie
To: firebird-support@yahoogroups.com
Sent: Tuesday, January 13, 2004 8:11 PM
Subject: Re: [firebird-support] Firebird/Internet performance questions
At 07:17 PM 13/01/2004 -0800, you wrote:
>Hello all. Thanks in advance for your help.
>
>As an experienced IT person in a large company, I am accustomed to having
>an entire department whose concern is the performance of the corporate
>DBMS. However, I am working on a project of my own that will use
>Interbase/Firebird as the back end persistence layer for an enterprise
>scaled distributed system.
>
>Current environment: Delphi 6 build 6.163, SQLExpress
DBXpress??
>, Firebird 1.5 RC8 with default configurations
>
>My first operational persistence class for the framework uses Borland's
>SQLExpress. The test case code follows
>
>procedure TTestContact_001_Persistor.Persist;
>var
> aTransDesc: TTransactionDesc;
>begin
> aTransDesc := TContact_001_Persistor.getInstance.getTransactionDesc;
> TTestBed.getInstance.SQLConnection.StartTransaction(aTransDesc);
> try
> TContact_001_Persistor.getInstance.Persist (FTestMemento);
> TTestBed.getInstance.SQLConnection.Commit(aTransDesc);
> except
> TTestBed.getInstance.SQLConnection.Rollback(aTransDesc);
> raise
> end;
>
> // Test is presumed to have succeeded if no exception is raised
>end;
>
>
>procedure TTestContact_001_Persistor.SimplePerformance;
>const
> TEST_ROW_COUNT = 4000000;
>var
> i: Integer;
> StartTime: TDateTime;
> FinishTime: TDateTime;
>begin
> FTestMemento.FVoid_Stamp.User_OID := EmptyGUID;
> FTestMemento.FVoid_Stamp.User_SID := 0;
> FTestMemento.FVoid_Stamp.Intermediary_OID := EmptyGUID;
> FTestMemento.FVoid_Stamp.Intermediary_SID := 0;
> FillChar (FTestMemento.FVoid_Stamp.Time, sizeof (TSQLTimeStamp), 0);
> FTestMemento.FVoid_Stamp.MethodName := '';
> FTestMemento.FVoid_Stamp.ProgramName := '';
>
> StartTime := now;
> for i := 1 to TEST_ROW_COUNT do
> begin
> CreateGUID (FTestMemento.FOID);
> FTestmemento.FCreate_Stamp.Time := DateTimeToSQLTimeStamp (now);
> Persist;
> end;
> FinishTime := now;
>
> check (false,
> Format ('Time to insert %d rows: %d days %s',
> [ TEST_ROW_COUNT,
> trunc (FinishTime-StartTime),
> FormatDateTime ('hh:mm:ss.zzz', FinishTime-StartTime)]));
>
>end;
>
>Observations:
>
>1. Completion of the insert performance test for 4,000,000 rows into a
>clean test database required 16:16:12 (roughly 68 rows per second) on an
>800 MHz Win98 box with 512 MB RAM. Every row insert was committed. This
>is slower than I expected. I initially thought that my disappointment was
>because I am used to my DBMS running on big iron (DB2 on large MVS
>systems), but even with M$ Access I have seen far more complex operations
>perform row inserts at twice that rate. I am certain that performance
>tuning would narrow the performance gap. How much improvement can I
>expect to see?
Your major big mistake is starting and committing a transaction for every
row created. In a real application you'd never do that. Access isn't
comparable for many reasons, one of which is that it doesn't support
transactions.
Also, using generic data access components doesn't help client/server much
at all. There are several better solutions - some would say you picked the
worst. If you need to stay at the dumb end of the connectivity spectrum,
there is a DBXpress driver available (commercial, cheap) for Firebird.
>2. Even though every row insert was committed, when the test case was
>completed the DBMS maintains that the table is still "in use" as long as
>the connection is held. Is this a "bug" or a side effect of the
>optimization? I am curious.
It is "as designed". This isn't a desktop dbms. As a rule of integrity,
end-users don't create and drop objects (not that they can't - but the
system isn't designed to make it easy!!)
Creating and dropping database objects happens, like everything else in
Firebird, within transactions. Transactions isolate one unit of work from
another and protect dependencies. The "object in use" message tells you
that you can't drop the object right now because it (or something dependent
on it) is being used by an uncommitted transaction. Were you running an
admin tool that was looking at the table?
>It is a very low impact issue since it is rare that you are first
>inserting rows into a table and then dropping the same table within
>milliseconds. The "table drop" is actually part of the test bed
>maintenance, to clean up the test database for the run of the test
>suite, so disconnecting then reconnecting for the table drop is an
>acceptable solution.
>
>3. I am accustomed to a "select count (*) from table" query taking
>virtually no time, regardless of the table size. Firebird appears to do a
>table space scan when you run this query. Is this a configurable option
>in Firebird?
Nope. Don't do whole-table row counts as a performance test. The only
performance it tests is how long it takes to scan the table. Firebird
doesn't store row-counts, it has to physically count rows to return a
count. It would be rather pointless to store row-counts which would never
be correct (a moving target in a multi-transaction, multi-generation
environment).
If you need to count rows for some intelligent purpose, select blah, count
* GROUP BY blah has its uses.
Standard SQL provides language to do what your Access databases need
rowcounts for - see esp. the EXISTS() predicate.
>4. I applied an index to the create stamp time field so I could readily
>identify the first and last records from the test run. After indexing, the
>first record was found rapidly, but the last record took several minutes
>to identify. With 4,000,000 records in a B+ tree on 8k Page boundaries, I
>believe that I should have seen a maximum of 40 milliseconds (3 pages I/O
>on the index, then 1 page I/O on the table space). Does anyone have any
>comments or suggestions?
An ascending index would quickly find the lowest record and extremely
slowly find the last one. Add a descending index to find the highest of
anything.
>5. What sort of performance improvement can I expect to see if I switch to
>embedded SQL? I am targeting this system to support 8,000 concurrent
>users in an active environment. Much of the framework acts to hide the
>DBMS performance limits, but there are limits to how much it can hide. :o)
There's said to be a bit of a gain with precompiled queries. Actually, all
I've ever heard of embedded's supposed speed advantages is hearsay. I've
never done it myself, or seen an embedded production application. (The
command-line tools are embedded apps.)
Nothing batched is going to perform fast if you abide by one transaction
per operation though. But then, a batched test isn't going to tell you
anything about performance of random interactive transactions, either,
which presumably is what you're looking to do (from the Subject title).
/helen