Subject | Re: Sweep and corrupted DB |
---|---|
Author | mivi71dk |
Post date | 2002-06-25T10:01:50Z |
> This is too vague. A DB that is physically damaged can't be backed up so,
> whatever is going on, something in your overall environment is breaking
> something (data structures? indexes? orphaning transactions? I haven't
> seen any hard information...)

I NEVER said anything about the file being physically damaged.
The file itself seems all right, but I can't perform ANYTHING on it
before a backup-restore routine.
Besides, I can FORCE this to happen in 4 DIFFERENT LOCATIONS!
Both on a server and on a local machine!
> It's hard to see how sweeping is implicated in corruption that is
> coincidental with users connecting and disconnecting. Sweeping removes
> back versions of rows that have been made obsolete by either committed or
> rolled back transactions. It operates on the artifacts of operations that
> are already complete. It has nothing to do with users or connections at
> all. It's also hard to see how you were able to ascertain that sweeping
> was occurring when a user connected or disconnected.

I made a small program that does nothing but tons of inserts and
updates.
I KNOW for a fact that BEFORE the program started I was about 400
transactions short of my sweep interval.
If I let the machine run for about 2 hours, then stop and disconnect,
it all works fine.
I made around 30000 transactions, which means that the sweep MUST
HAVE run AT LEAST twice!!
If I do the same as above, but start up another program (which can
be the same as above, another one I wrote, etc.), then my DB gets
corrupted. Almost instantly.
> >Hereafter no one can access the DB until a Backup - Restore has been
> >issued.
>
> If backup and restore fixes your problems, then it's likely there is data
> structure damage occurring from causes external to the database: the
> commonest of these is a disk with corrupted blocks. Have you performed any
> thorough disk surface scans during your investigations? Note that if you

Yes, but only on the two sites located here.
There is not, and has not been, ANY defective hardware.
None whatsoever.
Here I must point out that we serve somewhere around 70 users, who
run all kinds of programs. None of the others get any errors.
> filecopy a damaged database to another disk, you will merely copy any data
> flaws engendered by damage on the original disk...
>
> Data structure damage can also occur when a filesystem copying program is
> run whilst transactions are active in the database. Compression utilities,
> system backup utilities, even a naive system admin performing copy/paste...
>

No one does anything to the file.
No person, no filesystem program, nothing.
At least 4 times when I made this error occur I was ALONE on the LAN,
and 2 times it was a standalone machine.
> You can - and probably will - break data structures if you have application
> programs that perform DML on system tables. This includes some of the more
> naive third-party utility tools, not just your own application code.

What is DML?

> >Now some questions:
> >
> >1.
> >Why do people tell me, that the sweep bug has been fixed, where as
> >far as I'm concerned it has NOT !
>
> What sweep bug are you referring to?

The bug that was in IB 5.5!
As far as I'm concerned I have the same BUG.
> >2.
> >In which NG can I discuss BUGS in firebird ?
>
> Firebird-devel@l... - List-Subscribe:
> <https://lists.sourceforge.net/lists/listinfo/firebird-devel>
>
> I have to observe that your postings on and around this topic have
> been confused, confusing, contradictory and based on illogical assumptions
> (like connecting during a sweep corrupts databases...)
>
> If you think you are observing a bug, make things simple by reporting the
> versions of all servers and clients concerned, describing the conditions
> exactly, including the server and client platforms, the network
> protocol, tasks that the server was performing when corruption was
> supposed to have occurred (both the database server and other services on
> the server machine); and pasting any relevant snippets from interbase.log
> files.
>
> I know how frustrating it can be when things are going wrong - but things
> tend to go more and more wrong when you focus on the unlikely while
> overlooking the obvious. Problem definition should not be a process of
> squeezing drops of blood from a rock.

See, on 2 sites I have DISABLED the sweep, and there have been NO
PROBLEMS since.
On 2 other sites I kept the sweep enabled for some time, and they both
got corrupted DBs every second day.
I disabled the sweep there as well 3 days ago, and now they run
smoothly.
So based on the above I can only assume that the sweep task does
something to the DB.
Maybe it's because I do something in my programs that I should
not do. But I can't find anything that would logically explain it.
The only thing I do during startup and shutdown of a program is get the
connected users to do something. That's all.
I can make this error occur in Firebird version 1.0 and InterBase
version 6.0.2.0.
The program mentioned above connects to the DB through the Borland
Database Engine (version 5.1.1.1).
As for errors, here are some from the server side from around the time
the error occurs.
Note that the entries listed below come from several
different machines, but they have appeared on ALL servers.
UDVIKLING (Server) Mon Jun 24 12:48:26 2002
INET/inet_error: send errno = 10054
UDVIKLING (Server) Mon Jun 24 13:07:09 2002
INET/inet_error: read errno = 10054
UDVIKLING (Client) Mon Jun 24 15:16:45 2002
C:\Program Files\Firebird\bin\ibserver.exe: terminated
abnormally (-1)
UDVIKLING (Client) Mon Jun 24 15:16:46 2002
Guardian starting: C:\Program Files\Firebird\bin\ibserver.exe
UDVIKLING (Client) Mon Jun 24 15:17:08 2002
C:\Program Files\Firebird\bin\ibserver.exe: terminated
abnormally (-1)
UDVIKLING (Client) Mon Jun 24 15:17:09 2002
Guardian starting: C:\Program Files\Firebird\bin\ibserver.exe
UDVIKLING (Client) Mon Jun 24 15:17:52 2002
C:\Program Files\Firebird\bin\ibserver.exe: terminated
abnormally (-1)
UDVIKLING (Client) Mon Jun 24 15:17:53 2002
Guardian starting: C:\Program Files\Firebird\bin\ibserver.exe
UDVIKLING (Server) Mon Jun 24 15:27:40 2002
INET/inet_error: read errno = 10054
UDVIKLING (Server) Mon Jun 24 15:27:40 2002
INET/inet_error: read errno = 10054
Along with this:
MIVIAMD1900 (Client) Tue Jun 25 09:30:32 2002
REMOTE INTERFACE/gds__detach: Unsuccesful detach from
database.
Uncommitted work may have been lost
MIVIAMD1900 (Client) Tue Jun 25 09:30:32 2002
INET/inet_error: send errno = 10054
MIVIAMD1900 (Client) Tue Jun 25 09:30:58 2002
C:\Programmer\borland\interbase\bin\ibserver.exe: terminated
abnormally (-1)
MIVIAMD1900 (Client) Tue Jun 25 09:30:58 2002
INET/inet_error: read errno = 10054
MIVIAMD1900 (Client) Tue Jun 25 09:30:59 2002
Guardian starting:
C:\Programmer\borland\interbase\bin\ibserver.exe
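
To make log excerpts like the ones above easier to compare between machines, here is a minimal Python sketch that groups the entries and tallies them by kind. It assumes the layout visible in the excerpts (a header line of host, role in parentheses and timestamp, followed by one or more message lines); the function names `parse_log` and `summarize` and the category labels are made up for this illustration and are not part of Firebird or InterBase. Errno 10054 is the Winsock code WSAECONNRESET (connection reset by peer).

```python
import re
from collections import Counter
from datetime import datetime

# Header layout assumed from the excerpts: "HOST (Server|Client) Mon Jun 24 12:48:26 2002"
HEADER = re.compile(
    r"^(?P<host>\S+) \((?P<role>Server|Client)\) "
    r"(?P<stamp>\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4})$"
)


def parse_log(text):
    """Split log text into entries; wrapped message lines are joined with spaces."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = HEADER.match(line)
        if match:
            # Collapse repeated spaces (e.g. single-digit days) before parsing.
            stamp = " ".join(match.group("stamp").split())
            entries.append({
                "host": match.group("host"),
                "role": match.group("role"),
                "time": datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y"),
                "message": "",
            })
        elif entries:
            # Any non-header line belongs to the most recent entry.
            entries[-1]["message"] = (entries[-1]["message"] + " " + line).strip()
    return entries


def summarize(entries):
    """Count entries by kind; the labels are hypothetical, chosen for this sketch."""
    counts = Counter()
    for entry in entries:
        message = entry["message"]
        if "terminated abnormally" in message:
            counts["server_crash"] += 1
        elif "Guardian starting" in message:
            counts["guardian_restart"] += 1
        elif "errno = 10054" in message:
            counts["connection_reset"] += 1  # WSAECONNRESET
        else:
            counts["other"] += 1
    return counts
```

Calling `summarize(parse_log(open("interbase.log").read()))` on each server's log would give one tally per machine, which may make it easier to see whether the abnormal terminations line up with the moments a second program connects.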
Regards
Michael