Subject Re: [ib-support] Re: Sweep and corrupted DB
Author Helen Borrie
At 10:01 AM 25-06-02 +0000, you wrote:

> > all. It's also hard to see how you were able to ascertain that sweeping
> > was occurring when a user connected or disconnected.
>
>I made a small program that did nothing but do tons of inserts and
>updates.
>I KNOW for a fact that BEFORE the program started I was about 400
>transactions short of my sweep interval.
>
>If I let the machine run for like 2 hours, then stop and disconnect,
>it all works fine.
>I have made around 30000 transactions, which says that sweep MUST
>HAVE run AT LEAST twice !!

Erm, no. The sweep interval is counted as the difference between the
oldest interesting transaction and the next transaction. Once a
transaction is committed or rolled back, it is no longer "interesting" and
the OIT moves forward. So, in a system which takes care of committing or
rolling back work, theoretically a sweep which is preset by a non-zero
sweep interval will never occur. As far as I can tell from your postings,
it's fairly unlikely that any auto-sweeping is happening in your databases,
because the gap never gets even slightly close to the threshold.

All that given, the sweep interval sets a *threshold* for the next
sweep. That is, the next sweep will not occur *before* the gap between OIT
and Next Trans reaches the sweep interval figure. It does not predict what
the gap will actually be when the sweep occurs.
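
To make the arithmetic above concrete, here is a minimal Python sketch that reads the transaction counters from header statistics in the style of `gstat -h` output and computes the gap as described in this post (Next transaction minus Oldest transaction). The sample text and its numbers are made up for illustration, and the helper names `header_value` and `sweep_gap` are hypothetical. The threshold test only mirrors the description given here; it is not a statement of how the engine actually decides to start a sweep.

```python
import re

# Illustrative header text in the style of `gstat -h`; the numbers are invented.
SAMPLE_HEADER = """\
    Oldest transaction      29600
    Oldest active           29601
    Oldest snapshot         29601
    Next transaction        30000
    Sweep interval:         20000
"""


def header_value(text, label):
    """Return the integer that follows `label` on its own line of the header text."""
    pattern = rf"^\s*{re.escape(label)}:?\s+(\d+)\s*$"
    match = re.search(pattern, text, re.MULTILINE)
    if match is None:
        raise ValueError(f"label not found: {label}")
    return int(match.group(1))


def sweep_gap(text):
    """Return (gap, interval, threshold_reached) for the given header text.

    gap is Next transaction minus Oldest transaction (the OIT), following
    the description in this post. A sweep interval of 0 means auto-sweep
    is switched off, so the threshold is never reported as reached.
    """
    oit = header_value(text, "Oldest transaction")
    next_tx = header_value(text, "Next transaction")
    interval = header_value(text, "Sweep interval")
    gap = next_tx - oit
    threshold_reached = interval != 0 and gap >= interval
    return gap, interval, threshold_reached


if __name__ == "__main__":
    gap, interval, reached = sweep_gap(SAMPLE_HEADER)
    print(f"gap={gap} interval={interval} threshold reached: {reached}")
```

With counters like the sample, where committed work keeps the OIT moving forward, the gap stays far below the interval no matter how many transactions have been run in total.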

What I can't tell you (because I simply don't know) is how the engine
decides when to commence sweeping. Claudio or Ann would know that, being
aficionados of the source code.


>If I do the same as above, but starts up another program (which can
>be the same as above, another one I wrote, etc) then my DB gets
>corrupted. Almost instantly.

How do you know it is corrupted?

> > commonest of these is a disk with corrupted blocks. Have you performed any
> > thorough disk surface scans during your investigations? Note that if you
>
>Yes, but only on the two sites located here.
>There is not and has not been ANY hardware with defects.
>None whatsoever.
>Here I must point out that we serve somewhere around 70 users, who
>run all kinds of programs. None of the others get any errors.

What kind(s) of errors do you get with your Firebird application? This is
actually *important*. For example, does your app poke Scandinavian
alphabet characters in charset NONE columns? Does it attempt to
concatenate very long strings? Does it call any UDFs (user-defined
functions)?


>What is DML ?

Data manipulation language - the SQL for creating and altering data, as
contrasted with DDL (data definition language), which comprises the
statements (create, alter, drop, recreate) for operating on database objects.
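
As a rough illustration of the distinction, the following Python sketch sorts SQL statements into DDL and DML by their leading keyword. The function name `classify_statement` is hypothetical, the DDL keyword set is taken from the list above, and the DML keyword set (including the common convention of counting SELECT as DML) is an assumption made for this example.

```python
# DDL verbs as listed in this post.
DDL_KEYWORDS = {"create", "alter", "drop", "recreate"}
# DML verbs: an assumption for this sketch; SELECT is commonly counted as DML.
DML_KEYWORDS = {"insert", "update", "delete", "select"}


def classify_statement(sql):
    """Return 'DDL', 'DML' or 'other' based on the statement's first word."""
    words = sql.strip().split(None, 1)
    first = words[0].lower() if words else ""
    if first in DDL_KEYWORDS:
        return "DDL"
    if first in DML_KEYWORDS:
        return "DML"
    return "other"


if __name__ == "__main__":
    for stmt in (
        "CREATE TABLE customer (id INTEGER NOT NULL)",
        "INSERT INTO customer (id) VALUES (1)",
        "UPDATE customer SET id = 2 WHERE id = 1",
        "COMMIT",
    ):
        print(f"{classify_statement(stmt):5}  {stmt}")
```

A program that does "nothing but tons of inserts and updates" therefore issues only DML.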

> > What sweep bug are you referring to?
>
>The bug that was in IB 5.5 !
>As far as I'm concerned I have the same BUG.

I never used 5.5. I used 5.1.1 for two years and then went directly to 5.6
for three years. I'm still using 5.6 for existing projects; otherwise
Firebird. I never encountered any bug with sweeps, either auto or
manual. Generally, though, I set the sweep interval to zero and keep an
eye on the statistics. I don't consider auto-sweeping a wanted feature in
a system where there is good turnaround of transactions...a personal view.

But I'm still pretty certain that the association you have made between
sweeping and connecting/disconnecting is tenuous. I don't think we have
even seen any evidence that auto-sweeps have been done on your databases at
all.


>I can make this error occur in FireBird vers. 1.0 and Interbase
>version 6.0.2.0
>
>The program mentioned above connects to the DB through Borland
>Database Engine (version 5.1.1.1).

Now we may be getting to something that could be anticipated to cause
problems. Those are the BDE drivers from Delphi 5, right? Did you realise
that that version of the BDE (known as 5.11) is for InterBase 5? It is not
certified for IB 6 and there is no reason it would be, since IB 6 didn't
exist when it was released.

Borland has released a BDE 5.2 for IB 6, but it is known to have problems
with some Dialect 3 data types.

Aside from driver incompatibility, the BDE all by itself can get smashed up
by users crashing out of their connections. Then, Paradox artifacts get
left behind that prevent anyone from logging in until they are cleaned
up. On networks you can have horrible complications if there are other BDE
applications in use that were installed without concern for the existing
BDE setup (unfortunately, all too typical of many shareware and commercial
apps that use the BDE). Perhaps it is not your databases that are getting
broken at all, but the connectivity setup on your networks.

>As far as errors here are some on the server site just around when
>the error occurs:

[snipped some very ordinary-looking logs]


Unfortunately, all these tell you is that your Firebird server has
intermittent network problems, that you possibly have some users who are
crashing out of an application and that the server crashes
occasionally. The InterBase log doesn't tell you much more except that
possibly you had someone there who terminated an application without
committing work.

Don't you have any errors coming back to the client apps from these
databases that you think are corrupt?

heLen

All for Open and Open for All
Firebird Open SQL Database · http://firebirdsql.org ·
http://users.tpg.com.au/helebor/