Subject Re: Sweep and corrupted DB
Author mivi71dk
> This is too vague. A DB that is physically damaged can't be backed up so,
> whatever is going on, something in your overall environment is breaking
> something (data structures? indexes? orphaning transactions? I haven't
> seen any hard information...)

I NEVER said anything about the file being physically damaged.
The file itself seems all right, but I can't perform ANYTHING on it
until a backup and restore routine has been run.
Besides, I can FORCE this to happen in 4 DIFFERENT LOCATIONS!

Both on a server and on a local machine!


>
> It's hard to see how sweeping is implicated in corruption that is
> coincidental with users connecting and disconnecting. Sweeping removes
> back versions of rows that have been made obsolete by either committed or
> rolled back transactions. It operates on the artifacts of operations that
> are already complete. It has nothing to do with users or connections at
> all. It's also hard to see how you were able to ascertain that sweeping
> was occurring when a user connected or disconnected.

I made a small program that does nothing but tons of inserts and
updates.
I KNOW for a fact that BEFORE the program started I was about 400
transactions short of my sweep interval.

If I let the machine run for about 2 hours, then stop and disconnect,
it all works fine.
By then I have made around 30000 transactions, which means that sweep
MUST HAVE run AT LEAST twice!!
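
To spell out the arithmetic behind that claim, here is a minimal sketch.
It assumes the default sweep interval of 20000 (my actual setting is not
stated in this post) and a simplified model in which every transaction
moves the counter one step closer to the threshold. As far as I understand,
the real trigger depends on the gap between the oldest interesting
transaction and the oldest snapshot, so this is only an illustration. The
function name is made up for this example.

```python
def sweeps_triggered(transactions_run, remaining_until_sweep, sweep_interval):
    """Count how often the sweep threshold is crossed in a simplified model.

    Assumption: each transaction brings the next sweep one step closer,
    and the counter starts over after every sweep.
    """
    if transactions_run < remaining_until_sweep:
        return 0
    # One sweep when the remaining gap is used up, then one per full interval.
    return 1 + (transactions_run - remaining_until_sweep) // sweep_interval


# 400 transactions short of the threshold, about 30000 transactions run,
# sweep interval assumed to be the default of 20000.
print(sweeps_triggered(30000, 400, 20000))  # prints 2
```

Under these assumptions the threshold is crossed after 400 transactions
and again after 20400, which is where "at least twice" comes from.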


If I do the same as above, but start up another program (which can
be the same as above, another one I wrote, etc.) then my DB gets
corrupted. Almost instantly.



>
> >Hereafter no one can access the DB until a Backup - Restore has been
> >issued.
>
> If backup and restore fixes your problems, then it's likely there is data
> structure damage occurring from causes external to the database: the
> commonest of these is a disk with corrupted blocks. Have you performed any
> thorough disk surface scans during your investigations? Note that if you

Yes, but only at the two sites located here.
There is not, and has not been, ANY defective hardware.
None whatsoever.
Here I must point out that we serve somewhere around 70 users, who
run all kinds of programs. None of the others get any errors.

> filecopy a damaged database to another disk, you will merely copy any data
> flaws engendered by damage on the original disk...
>
> Data structure damage can also occur when a filesystem copying program is
> run whilst transactions are active in the database. Compression utilities,
> system backup utilities, even a naive system admin performing copy/paste...
>

No one does anything to the file.
No person, no filesystem tool, nothing.
At least 4 of the times I made this error occur I was ALONE on the LAN,
and 2 times it was on a standalone machine.

> You can - and probably will - break data structures if you have application
> programs that perform DML on system tables. This includes some of the more
> naive third-party utility tools, not just your own application code.

What is DML?


>
> >Now some questions:
> >
> >1.
> >Why do people tell me, that the sweep bug has been fixed, where as
> >far as I'm concerned it has NOT !
>
> What sweep bug are you referring to?

The bug that was in IB 5.5!
As far as I'm concerned I have the same BUG.


>
> >2.
> >In which NG can I discuss BUGS in firebird ?
>
> Firebird-devel@l... - List-Subscribe:
> <https://lists.sourceforge.net/lists/listinfo/firebird-devel>
>
> I have to observe that your postings on and around this topic have
> been confused, confusing, contradictory and based on illogical assumptions
> (like connecting during a sweep corrupts databases...)
>
> If you think you are observing a bug, make things simple by reporting the
> versions of all servers and clients concerned, describing the conditions
> exactly, including the server and client platforms, the network
> protocol, tasks that the server was performing when corruption was
> supposed to have occurred (both the database server and other services on
> the server machine); and pasting any relevant snippets from interbase.log
> files.
>
> I know how frustrating it can be when things are going wrong - but things
> tend to go more and more wrong when you focus on the unlikely while
> overlooking the obvious. Problem definition should not be a process of
> squeezing drops of blood from a rock.


See, at 2 sites I have DISABLED the sweep, and there have been NO
PROBLEMS since.
At 2 other sites I kept the sweep enabled for some time, and they both
got corrupted DBs every second day.
I disabled sweep there as well 3 days ago, and now they run
smoothly.


So based on the above I can only assume that the sweep task does
something to the DB.
Maybe it's because I do something in my programs that I should not
do. But I can't find anything that would logically explain it.

The only thing I do at startup and shutdown of a program is get the
connected users in order to do something. That's all.

I can make this error occur in Firebird version 1.0 and InterBase
version 6.0.2.0.

The program mentioned above connects to the DB through the Borland
Database Engine (version 5.1.1.1).

As for errors, here are some from the server side, logged just around
the time the error occurs.
Those listed below come from several different machines, but they
have appeared on ALL servers.


UDVIKLING (Server) Mon Jun 24 12:48:26 2002
INET/inet_error: send errno = 10054

UDVIKLING (Server) Mon Jun 24 13:07:09 2002
INET/inet_error: read errno = 10054

UDVIKLING (Client) Mon Jun 24 15:16:45 2002
C:\Program Files\Firebird\bin\ibserver.exe: terminated abnormally (-1)

UDVIKLING (Client) Mon Jun 24 15:16:46 2002
Guardian starting: C:\Program Files\Firebird\bin\ibserver.exe

UDVIKLING (Client) Mon Jun 24 15:17:08 2002
C:\Program Files\Firebird\bin\ibserver.exe: terminated abnormally (-1)

UDVIKLING (Client) Mon Jun 24 15:17:09 2002
Guardian starting: C:\Program Files\Firebird\bin\ibserver.exe

UDVIKLING (Client) Mon Jun 24 15:17:52 2002
C:\Program Files\Firebird\bin\ibserver.exe: terminated abnormally (-1)

UDVIKLING (Client) Mon Jun 24 15:17:53 2002
Guardian starting: C:\Program Files\Firebird\bin\ibserver.exe

UDVIKLING (Server) Mon Jun 24 15:27:40 2002
INET/inet_error: read errno = 10054

UDVIKLING (Server) Mon Jun 24 15:27:40 2002
INET/inet_error: read errno = 10054



Along with this:

MIVIAMD1900 (Client) Tue Jun 25 09:30:32 2002
REMOTE INTERFACE/gds__detach: Unsuccesful detach from database.
Uncommitted work may have been lost

MIVIAMD1900 (Client) Tue Jun 25 09:30:32 2002
INET/inet_error: send errno = 10054

MIVIAMD1900 (Client) Tue Jun 25 09:30:58 2002
C:\Programmer\borland\interbase\bin\ibserver.exe: terminated abnormally (-1)

MIVIAMD1900 (Client) Tue Jun 25 09:30:58 2002
INET/inet_error: read errno = 10054

MIVIAMD1900 (Client) Tue Jun 25 09:30:59 2002
Guardian starting: C:\Programmer\borland\interbase\bin\ibserver.exe
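
To make excerpts like these easier to compare between machines, a small
script along the following lines could tally the abnormal terminations per
host. It is only a sketch: the entry layout (a "HOST (Role) timestamp"
header followed by one or more message lines) is inferred from the
excerpts above, and the function names are made up for this example.
Errno 10054 is the Winsock code for a connection reset by the peer.

```python
import re
from collections import Counter

# Header layout inferred from the excerpts: "HOST (Server|Client) Mon Jun 24 12:48:26 2002"
HEADER = re.compile(
    r"^(\S+) \((Server|Client)\)\s+"
    r"(\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4})\s*$"
)


def parse_log(text):
    """Split log text into entries; wrapped message lines are joined."""
    entries = []
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        match = HEADER.match(line)
        if match:
            current = {
                "host": match.group(1),
                "role": match.group(2),
                "time": match.group(3),
                "message": "",
            }
            entries.append(current)
        elif current is not None and line:
            current["message"] = (current["message"] + " " + line).strip()
    return entries


def count_crashes(entries):
    """Count 'terminated abnormally' entries per host."""
    return Counter(
        e["host"] for e in entries if "terminated abnormally" in e["message"]
    )


SAMPLE = """\
UDVIKLING (Server) Mon Jun 24 13:07:09 2002
INET/inet_error: read errno = 10054

UDVIKLING (Client) Mon Jun 24 15:16:45 2002
C:\\Program Files\\Firebird\\bin\\ibserver.exe: terminated
abnormally (-1)

UDVIKLING (Client) Mon Jun 24 15:16:46 2002
Guardian starting: C:\\Program Files\\Firebird\\bin\\ibserver.exe
"""

entries = parse_log(SAMPLE)
print(len(entries))
print(count_crashes(entries))
```

Run against the full interbase.log from each machine, the same counts
would show how often the server process died and was restarted by the
Guardian.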



Regards
Michael