Subject Re: [IB-Architect] Classic vs SuperServer was IB/FB Lock manager failures
Author Alexandre Kozlov
Dmitry,

at the time this crash appeared there was nobody on the support team who
cared about it, so I may not remember the whole history in detail. It was
an uninterrupted nightmare for a month, that's for sure. I have some
experience in troubleshooting, but believe me, this was some kind of
evil/idiotic error, and one very sensitive to the behavior of the whole
engine.
In the end we could "reproduce" it with good frequency, but I do not have
that test right now. As I remember, it started like this: I write a record
but can't see it in the same transaction, or I even get locked trying to
read it, or some variation on this theme. After this the database was
corrupted. The structure of the database was not simple, but not extremely
complex either. It contained blob data.

I just remembered a few more details:

1) this error arose when we were trying to read one record from two
connections;
2) this error arose only on some pairs (of computers) of connections
(beautiful!!!);
3) and other delirium

At that time we found that others had the same problem. Here is our answer
to one of them:

Hi Micha,

I'm sorry to hear that, but I'm glad to see that we are not the only ones
with this problem.

Let me first share our experience in solving this. After testing all
versions of IB open source and Firebird on Windows/Linux, trying all
available versions of the IB client and changing our code, we found that
only IB 6.5 does not have the trouble with index/database corruption. We
just installed the IB 6.5 server and forgot all these miracles.

I feel that the bug is the same as the one described in:
http://sourceforge.net/tracker/?func=detail&atid=109028&aid=221960&group_id=9028

It is still among the open bugs:
http://firebird.sourceforge.net/rabbits/pcisar/FirebirdBugsOpen.html

What I find suspicious: why does THE BUG that was found 2000-11-08 05:32
still exist, and why does nobody care to fix the most terrible error I
have ever heard of?

I can add that the bug appears only with a complex database structure, in
both the Classic and SuperServer versions.

Grigori

mgastde wrote:

Hi,

we have very strange errors in one of our customers' production
environments.

We are using Firebird CS 1.0.0 Build 796 on a cluster system with RedHat
Linux 7.2.

About 7 weeks ago we started with a newly implemented production control
system. After 5 weeks of work without any trouble, we started to get
problems with the system. We got messages in our application like
"database file appears corrupt () wrong page type
page XXX is of wrong type (expected 7, found 0)"
"database file appears corrupt () wrong page type
page YYY is of wrong type (expected 5, found 0)"
These errors are NOT written to the interbase.log file. They arise only in
our applications. If we start the same transaction a second time, the
error usually does not occur again, but sometimes it does. At this point
we ran gfix -v -f to examine the database. It reported a lot of messages
like "Index 3 is corrupt on page 299129 in table FSATZ (159)".
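For anyone decoding the "wrong page type" messages quoted above, here is a
minimal sketch that pulls the numbers out of such a message and names the
page types. The numbering in PAGE_TYPES is an assumption based on the
page-type constants in the engine's ods.h header as commonly documented;
it is not taken from this thread and should be checked against the header
of the exact version in use. The function name is hypothetical.

```python
import re

# Page-type codes. ASSUMPTION: taken from the commonly documented constants
# in the engine's ods.h; verify against the header for your exact version.
PAGE_TYPES = {
    0: "undefined",
    1: "database header",
    2: "page inventory",
    3: "transaction inventory",
    4: "pointer",
    5: "data",
    6: "index root",
    7: "index b-tree",
    8: "blob",
    9: "generator",
    10: "write-ahead log",
}

# Matches the second line of the messages quoted in this thread.
WRONG_TYPE = re.compile(
    r"page (\S+) is of wrong type \(expected (\d+), found (\d+)\)"
)


def explain_wrong_page_type(message):
    """Return a readable summary of a 'wrong page type' message, or None."""
    match = WRONG_TYPE.search(message)
    if match is None:
        return None
    page = match.group(1)
    expected = int(match.group(2))
    found = int(match.group(3))
    expected_name = PAGE_TYPES.get(expected, "unknown")
    found_name = PAGE_TYPES.get(found, "unknown")
    return (
        f"page {page}: expected {expected_name} page ({expected}), "
        f"found {found_name} page ({found})"
    )


print(explain_wrong_page_type("page XXX is of wrong type (expected 7, found 0)"))
print(explain_wrong_page_type("page YYY is of wrong type (expected 5, found 0)"))
```

If that numbering holds, both messages say that a page expected to hold
index or data content was read back with type 0, which is consistent with
the index corruption that gfix reports.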

After running
gfix -m -f -i
gbak -b -g -t -v
gbak -c -r -p -v
gfix -v -f did not report any errors in our database.

But the same errors occur again after only a few days of work. The same
cycle of gfix -m -i, backup, restore, gfix -v -f repairs the database only
for a few days.

Some information about our system:
We have two databases. One consists of only one table with two columns,
the primary key and a blob field. It has about 1 million records and a
total size of about 9 GB spanned over 5 files (with a maximum of 30 files
of 250000 8 KB blocks each). This database works fine.
The second database is much more complicated. The database size is about
1 GB. We have configured the system to use 3 files of 500000 4 KB blocks
each, to stay under the maximum file size of 2 GB.
1. The biggest table has about 2.8 million records.
2. The total number of records over all tables is roughly 6 million.
3. We have 80 tables with 131 foreign keys, 112 stored procedures and
171 triggers.
4. Disabling the sweep process did not change anything.
5. The index errors usually occur in the most heavily used tables.
6. We have two different types of transactions:
- Import and export of production data is done in transactions with up to
20000 insert operations in the first database (with one table) and up to
40000 insert or update operations in the second database. One of these
transactions runs for up to 6 minutes.
- User transactions, which usually do not change a lot of data.
We have no long-running transactions.
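As a quick check of the file layout described above, the per-file sizes can
be worked out from the page counts. This is only a sketch under two
assumptions that the message does not state: that "KB" means 1024 bytes
and that the "2 GB" limit means 2**31 bytes.

```python
# Sanity check of the per-file sizes quoted in the message.
# ASSUMPTIONS: "KB" is 1024 bytes and the "2 GB" limit is 2**31 bytes.

KIB = 1024
TWO_GIB = 2**31


def file_size_bytes(pages_per_file, page_size_kib):
    """Size in bytes of one database file holding the given number of pages."""
    return pages_per_file * page_size_kib * KIB


first_db_file = file_size_bytes(250_000, 8)   # first database: 8 KB pages
second_db_file = file_size_bytes(500_000, 4)  # second database: 4 KB pages

for name, size in (("first", first_db_file), ("second", second_db_file)):
    print(f"{name} database: {size} bytes per file, under limit: {size < TWO_GIB}")
```

Under those assumptions both layouts come to 2,048,000,000 bytes per file,
which stays below 2**31 = 2,147,483,648 bytes.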

Does anyone have an idea? Any hints?
Are there, for example, any circumstances that would allow errors to be
carried over through a cycle of gfix, backup, restore and gfix?

Thanks a lot in advance.

Micha


----- Original Message -----
From: "Dmitry Yemanov" <dimitr@...>
To: <IB-Architect@yahoogroups.com>
Sent: Thursday, September 19, 2002 11:55 AM
Subject: RE: [IB-Architect] Classic vs SuperServer was IB/FB Lock manager failures


> Alexander,
>
> > six months ago SS6 (both the free one from Borland and Firebird 1.0)
> > took a lot of our time, crashing a very large database
> > unrecoverably several times. This went on
> > until we installed a trial version of SS6.5 from Borland.
> > And it works, guys.
> > So it might not be so bad if some of you folks tried to eliminate
> > these serious bugs from the current version of SS.
>
> Since you seem to be the only person who's aware of this bug, how do you
> see us trying to fix it? "Hey guys, there's a serious bug somewhere within
> the engine, could you please fix it, thanks"? Sorry, things are not done
> this way. Do you have a reproducible test case? Can you provide any useful
> pointers? Detailed information about those crashes? A zipped backup of
> your database? Then probably some folks could start investigating the
> source of this problem.
>
>
> Dmitry