Subject Re: [firebird-support] hello, can someone translate the follow error messages? ref/eDN8022297953
Author Helen Borrie
At 12:26 AM 9/01/2008, dennis wrote:

>Could you please describe what gstat returns? What unit of measurement does each number below have (bytes, packets)?
>
>>
>>Database header page information:
>>
>> Flags 0 - not used (I think...)
>> Checksum 12345 - Not used
>> Generation 4914769 - No. of transactions since create date
>> Page size 1024 - DB page size in bytes - TOO SMALL! use 4096 or 8192
>> ODS version 11.0 - see note below
>> Oldest transaction 4897071 - transaction # of oldest interesting transaction (OIT)
>> Oldest active 4914740 - transaction # of oldest active transaction (OAT)
>> Oldest snapshot 4914740 - transaction # of OAT before last garbage collection
>> Next transaction 4914765 - transaction # that will be used next
>> Bumped transaction 1 - not used
>> Sequence number 0 - don't know
>> Next attachment ID 0 - don't know
>> Implementation ID 16 - don't know
>> Shadow count 0 - counts shadow databases being written to
>> Page buffers 0 - size of page cache if set
>> Next header page 0 - not sure, possibly pointer to header page of next file in a multi-file database
>> Database dialect 3 - Native SQL dialect for ODS 10 and higher databases
>> Creation date Nov 16, 2007 13:28:27 - date this database was created/last restored
>> Attributes force write - Force write means disk writes are synchronous, i.e. written to disk immediately on being posted
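If you want to watch these numbers over time rather than read them by eye, the header report is easy to pick apart. Here is a minimal sketch in Python, assuming the raw "gstat -h" output, where each name is separated from its value by tabs or a run of spaces. The function name and the sample text are made up for illustration; they are not part of any Firebird tool.

```python
import re

def parse_gstat_header(text):
    """Parse 'gstat -h' style lines into a dict of field name -> value string."""
    fields = {}
    for line in text.splitlines():
        # Drop mail quote markers and surrounding whitespace.
        line = line.lstrip("> \t").rstrip()
        # Assumption: name and value are separated by tabs or 2+ whitespace chars.
        parts = re.split(r"\t+|\s{2,}", line, maxsplit=1)
        if len(parts) == 2 and parts[0]:
            fields[parts[0]] = parts[1].strip()
    return fields

# Illustrative sample using values from the report above.
sample = """
Database header page information:
        Flags                   0
        Page size               1024
        Oldest transaction      4897071
        Oldest active           4914740
        Oldest snapshot         4914740
        Next transaction        4914765
"""

header = parse_gstat_header(sample)
print(header["Page size"], header["Next transaction"])
```

Lines without a name/value pair, such as the "Database header page information:" title, are simply skipped.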

ODS = On-disk Structure. Each major version of Firebird involves feature additions
and changes, sometimes storage format changes, etc. ODS 11 means
your DB was created by Firebird 2.0.x. ODS 10.1 - Fb 1.5, ODS 10 - Fb 1.0
or IB 6.0.x, ODS 9.1 - IB 5.6, etc...

In the case of your db, it was created under a much older ODS and later restored under Fb 2.0.x. We can tell this from the page size: in Fb 1.5 the default page size was 2048 bytes, in Fb 2.0.x it is 4096, and Fb 2.0 won't even allow you to create a new database with a page size of 1024. INCREASE THIS PAGE SIZE. You can do this during a restore, using gbak's -P[age_size] switch.

"interesting transaction" - a transaction that is not active (could be committed,
rolled back or "limbo") but there are other transactions still running
that are "interested" in the original state of records that were affected
by that transaction.

SQL dialect - make SURE that your clients connect using Dialect 3 and the Fb 2.0.x client library.

If there is a large "gap" between the OIT and the Oldest Snapshot then it is
usually an indication that your application is not managing transactions very
well. Garbage builds up and performance gradually slows down. Use of
Commit Retaining is the most common cause.
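To put numbers on that "gap", here is a small sketch in Python using the four transaction counters from the header report above. The function name and dictionary keys are made up for illustration, and no threshold is built in: how big a gap counts as "large" depends on the site's throughput.

```python
def transaction_gaps(oldest_transaction, oldest_active,
                     oldest_snapshot, next_transaction):
    """Return the differences between the header's transaction counters."""
    return {
        # Gap between the OIT and the Oldest Snapshot: the one to watch.
        "oit_to_snapshot": oldest_snapshot - oldest_transaction,
        # How far the Next Transaction is ahead of the oldest active one.
        "oat_to_next": next_transaction - oldest_active,
    }

# Values taken from the header report quoted earlier in this message.
gaps = transaction_gaps(
    oldest_transaction=4897071,
    oldest_active=4914740,
    oldest_snapshot=4914740,
    next_transaction=4914765,
)
print(gaps)  # {'oit_to_snapshot': 17669, 'oat_to_next': 25}
```

Logging these two figures after each gstat run shows whether the OIT is moving forward or staying stuck while the other counters climb.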

>Is the "safe restore" you mentioned something different from the classic database
>restore? Does it need any extra settings?

No. It means you restore the backup using the -C[reate_database] switch and
a DIFFERENT database name, wait for it to finish, then try to connect to this
new database. If all is OK, then take the original database offline, rename it,
then rename the newly-created database.
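The steps above can be sketched as a small Python helper. This is only an outline under stated assumptions: the function names, file names and credentials are placeholders, the page-size check just reflects the 4096/8192 advice in this thread, and the command is built but not run. The gbak switches used (-c, -p, -user, -password) are the standard ones.

```python
from pathlib import Path

def build_restore_command(backup_file, new_database, user, password,
                          page_size=4096):
    """Build the argv for a gbak restore into a NEW database file."""
    # Assumption: only the page sizes recommended in this thread are accepted.
    if page_size not in (4096, 8192):
        raise ValueError("use a page size of 4096 or 8192")
    return [
        "gbak", "-c",              # -c[reate_database]: never overwrite
        "-p", str(page_size),      # -p[age_size]: set the new page size
        "-user", user,
        "-password", password,
        str(backup_file), str(new_database),
    ]

def swap_in_restored(original, restored, retired):
    """Rename original -> retired, then restored -> original.

    Call this only after the restored database has been tested and the
    original has been taken offline.
    """
    original, restored, retired = Path(original), Path(restored), Path(retired)
    if retired.exists():
        raise FileExistsError(f"{retired} already exists")
    original.rename(retired)
    restored.rename(original)

# Placeholder names and credentials, for illustration only.
cmd = build_restore_command("pos.fbk", "pos_new.fdb",
                            user="SYSDBA", password="your_password",
                            page_size=4096)
print(" ".join(cmd))
```

The list returned by build_restore_command can be passed to subprocess.run once the names and credentials are real. Keeping the rename as a separate step means the old file is still there if the new database turns out to be unusable.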

>In the application I use only CommitRetaining. I call CommitRetaining
>after each post, instead of Commit, so that the dataset stays
>open after each post. I do not explicitly start and commit the
>transaction because I commit after every post. This application has been
>running on POS machines for 3 years now, and I haven't faced problems in
>the transaction layer with this mechanism. I am not saying it is the most
>correct way, but it is a mechanism that covers what we want from the
>application: direct update of the database. Is something wrong with doing
>it this way?

Yes. Sooner or later this will bite you - the more users, the more records, the more throughput, the sooner. CommitRetaining (CR) is OK on stand-alone desktop applications but it does not scale well for the multi-user situation. It's not too bad if used for some things - with the utmost care and control - but you must ensure that those transactions are hard-committed regularly.
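One way to guarantee a regular hard commit is to count the soft ones. The sketch below is in Python, not in your application's language, and rests on an assumption: the connection object is hypothetical, standing for anything that offers commit(retaining=False). The class name and the interval of 50 are also made up. Bear in mind that a hard commit closes open cursors, so the dataset has to be reopened afterwards.

```python
class CommitPolicy:
    """Use commit-retaining, but force a hard commit every Nth call."""

    def __init__(self, connection, hard_commit_every=50):
        if hard_commit_every < 1:
            raise ValueError("hard_commit_every must be at least 1")
        # Assumption: 'connection' offers commit(retaining=False).
        self.connection = connection
        self.hard_commit_every = hard_commit_every
        self.soft_commits = 0

    def commit(self):
        """Commit once; return which kind of commit was issued."""
        if self.soft_commits + 1 >= self.hard_commit_every:
            # Hard commit: ends the transaction so garbage can be collected.
            self.connection.commit(retaining=False)
            self.soft_commits = 0
            return "hard"
        # Soft commit: keeps the transaction context (and cursors) alive.
        self.connection.commit(retaining=True)
        self.soft_commits += 1
        return "retaining"
```

With hard_commit_every=3, the sequence of calls goes retaining, retaining, hard, and then repeats. A timer-based hard commit during idle periods would serve the same purpose.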

In the meantime, implement a backup/safe restore routine. It appears that 6 weeks is too long a gap between restores for this site. Fix up the page size issue and implement a routine weekly or monthly backup/restore, until you are able to deploy a healthier application.

>The strangest thing is that in other branches the same application has
>no problems at all.

It will, eventually. Possibly the problem branch has fewer server resources and/or higher throughput and/or more users and/or worse user discipline and/or longer lunch breaks and/or too many other things happening on the server and/or a faulty network.

> But I am still investigating

Do a backup and safe restore, increase the page size and monitor the header statistics. That will tell you more than just guessing or searching for needles in haystacks. But don't neglect the network issue - at least satisfy yourself that the network errors you're seeing are not caused by faulty network cards/routers/cables or badly-configured DHCP.

And find out what is DIFFERENT at this site, compared to the others.

./heLen