firebird-support - Re: IB reliablility

Subject	Re: IB reliablility
Author	alex_vnru
Post date	2002-01-19T13:46:12Z

--- In ib-support@y..., "Claudio Valderrama C." <cvalde@u...> wrote:

> ""Paul Beach"" <pabeach@w...> wrote in message
> news:03df01c19eac$25d40520$5bb6ba8f@home...
> > > > --- In ib-support@y..., "Sergey Balter" <balter@h...> wrote:
> > > > > THE CONCLUSION
> > > > > IB may corrupt database WHEN SWEEP OR GARBAGE
> > > > > COLLECT AND DATA UPDATE (performs insert, delete,
> > > > > update) at THE SAME TIME.
> >
> > Only in V5.5 SuperServer as far as I am aware.
>
> Not sure if Alex is referring to the v5.5 bug. Is this your case, Alex?
> I remember that my first beta IB6 rel notes said that the sweeper could get
> in the way of other operations and corrupt the db, but the issue vanished
> mysteriously from the acknowledged bugs and I ignore if Borland fixed it
> before Dec-1999.

You are right, Claudio, I never used IB5.5. From 1996 to the end of
2000 I ran the production database on IB4 on SCO Unix, then struggled
for about 2 weeks with IB6 SS on Linux RH, replaced it with FB 0-9.1 CS
and shortly afterwards with 0-9.4p1 CS, which served this database
until RC2 was announced.
Over these years the hardware was upgraded many times, from an Acer
Altos 2xP100 with on-board Adaptec SCSI up to a 2xXeon933 with an AMI
MegaRAID Enterprise 1600, and a powerful Smart UPS was always in use.
Approximately once every 8 months I had light corruption in the
database, of the following kinds:

- broken index (row is visible by natural select but not using index)
- decompression overran buffer
- cannot find record back version

The scenario in which the corruption showed itself was always the same:

The office finished its usual work, and at 0:00 cron started a job which

1. runs a sweep
2. makes a backup
3. runs a script via isql which fills statistical tables
4. saves the gbk to tape
5. transfers the gbk to another computer
6. performs a control restore

Earlier, step 5 was absent and the restore was made into another
directory on the same computer.
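
For readers who want to picture the job, the six nightly steps listed
above could be sketched roughly as follows. This is only an
illustration, not the actual cron script: all paths, the host name, the
tape device /dev/st0, the use of tar, scp and ssh, and the helper names
nightly_commands and run_nightly are assumptions. Credentials are
assumed to come from the ISC_USER and ISC_PASSWORD environment
variables.

```python
import subprocess


def nightly_commands(db, gbk, stats_sql, host, remote_gbk, remote_db):
    """Return the six nightly steps as argument lists, in order.

    Every path, the host and the tape device are hypothetical.
    """
    return [
        ["gfix", "-sweep", db],                      # 1. sweep
        ["gbak", "-b", "-v", db, gbk],               # 2. backup
        ["isql", db, "-i", stats_sql],               # 3. statistics script
        ["tar", "-cf", "/dev/st0", gbk],             # 4. gbk to tape (assumed device)
        ["scp", gbk, host + ":" + remote_gbk],       # 5. transfer (scp is an assumption)
        ["ssh", host, "gbak", "-c", "-v",            # 6. control restore on the
         remote_gbk, remote_db],                     #    other computer
    ]


def run_nightly(steps, runner=subprocess.run):
    """Run the steps in order and stop at the first non-zero exit code.

    Returns (number_of_completed_steps, failed_step_or_None).
    """
    for index, argv in enumerate(steps):
        result = runner(argv)
        if result.returncode != 0:
            return index, argv
    return len(steps), None
```

Stopping at the first failing step means that a failed sweep or backup
is noticed before the statistics script touches the database, and the
returned step shows where the night went wrong.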

In the morning: the backup or control restore has failed, the script
has not run (except in the broken-index case, of course), and users
cry: your damned program doesn't work once more.
Analysis of the gbak log shows that some table has a violated PK. After
gfix is applied we can see 1-3 rows in this table which are filled with
nulls or their representations according to the data type. We delete
them using non-indexed attribute columns in the WHERE clause, perform a
backup/restore and go on. At first we suspected that some users did not
disconnect when they went home, and after adding netstat >> log to the
cron job we saw that this was indeed the case every time corruption
occurred. But corruption does not occur every time users fail to
disconnect: the former happens rather often, the latter VERY seldom. We
decided to terminate connections forcibly and once got a fatally
corrupted database. I reported this occurrence in full at

http://groups.yahoo.com/group/ib-support/message/3279
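
As an illustration of the netstat check mentioned above, here is a
minimal sketch that picks still-connected clients out of netstat
output. It assumes Linux-style `netstat -tn` columns and that the
server listens on the default port 3050; the function name
lingering_clients is hypothetical, and the real cron job simply
appended the raw netstat output to a log.

```python
def lingering_clients(netstat_output, port="3050"):
    """Return remote addresses with an ESTABLISHED TCP connection
    to the given local port.

    Assumes Linux `netstat -tn` layout:
    Proto Recv-Q Send-Q Local-Address Foreign-Address State
    """
    clients = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers and anything that is not a TCP connection line.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local, foreign, state = fields[3], fields[4], fields[5]
        if state == "ESTABLISHED" and local.rsplit(":", 1)[-1] == port:
            clients.append(foreign)
    return clients
```

Running such a check before the sweep, and logging its result, gives
the same evidence as the netstat log: which clients were still
attached on the nights when corruption later appeared.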

I understand this falls into the category of "there are some noises in
my cellar", a poltergeist. I don't insist it's an IB/FB problem; it may
be an OS one. But looking at the discussion around bug #221960, I think
my observations may be related to it.

Best regards, Alexander V.Nevsky.