Subject Re: Fixing Database
Author salisburyproject
I hope this provides the missing details:

- FB is v1.5.1. Superserver
- ODBC is v1.2.0.68
- The PC has 1Gb RAM, 40 GB Disk space, of which 28 free

Here is the connection string:

DRIVER=Firebird/InterBase(r) driver;
DBNAME=D:\DataCenter\DB\Data\DCDB.FDB; UID=GWC; PWD=******; ROLE=SU

For isql:

connect "D:\DataCenter\DB\Data\DCDB.FDB" user 'gwc' password '******';

The path is stored in registry in the following
format "..\..\DCDB.FDB", if on the same drive. If on another the full
path is stored. The application transforms always to full path, incl.
drive, as you can see above.

All I have in firebird.log is:

JOYBRINGER (Client) Thu Mar 10 22:31:53 2005
Guardian starting: D:\Program
Files\DataCenter\DB\bin\fbserver.exe

I guess that the sort files are from the selects from the "temp"
table. It is not a real temp table, but a table where the raw data
from the DB2 is initally inserted. Then a human checks and modifies
some fields, which are again saved in this "temp" table.
Next, the application generates some additional data (like hash
values, automatic classification, etc.) - sorry, I cannot reveal all
details here :(.
This is done in memory. The modified data is inserted - single record
each time, updating all four tables (4 or more INSERT statements) and
commiting the transaction. Yes, commit after each row in the main
table. Two other tables may have more than one INSERT. ID's as PK's
are generated BEFORE INSERT from two generators for the main and
hashes tables.

Indices: 1 on varchar(20) and 1 on varchar(250) + 3 on int. Charset
is NONE.

I am afraid that I cannot paste here the exact INSERT statements, but
they are really simple. The data is passed verbatim, not from another
select. Example:
INSERT INTO TABLE1 VALUES ('','blah','blah','5');
Only VARCHAR, CHAR and INT fields are used.

The logs I mention are mine, but they will contain any ODBC error, as
returned by the driver. As I wrote, there was no such error.

The temp (sort) files are from the same day, but I cannot relate them
to a given operation, as someone was working with the app all the
day. I guess they are result from SELECT from the "temp" table. It
has 2.3 million records and the SELECT has ORDER BY clause.

Thanks a lot,

Kiril.


--- In firebird-support@yahoogroups.com, Helen Borrie <helebor@t...>
wrote:
> At 11:24 AM 13/03/2005 +0000, salisburyproject wrote:
>
> >sorry if I wrongly presented the exact case.
> >I am still worried about the situation when if such corruption
occurs
> >I cannot give solution to the customer, without contacting third
> >party. I will need to clarify this to them - the choice of FB and
> >Windows is their's.
> >
> >So, I will present the case from begining.
> >
> >Windows 2000, SP4.
>
> What version AND model (Classic/Superserver/Embedded) of Firebird
are you
> using?
>
> >Plenty of disk space, memory, etc.. Not a resource problem.
>
> If your ORDER BY and GROUP BY queries are storing huge temp files
that are
> not getting deleted, then your resources might not be as plentiful
as you
> think.
> -- Firebird 1.5 tries to maintain sort files in memory and will
only resort
> to writing them to disk if memory gets too low.
> -- Firebird 1.0 always writes sort files to disk
>
> Now --- if the corrupting operation is writing sort files, one
could guess
> that your insert operation is of the form
>
> insert into blah (<list of columns>)
> select list of columns from sourcetable
> order by ...
>
> which would cause sort files to be written to disk in Fb 1, and
would cause
> them to be paged to disk in Fb 1.5 if memory were inadequate. But
then -
> if the operation appears to finish, you close the statement, then
close the
> connection, and are left with sort files on disk, then what REALLY
happened
> is that the server crashed for some external cause that it couldn't
detect
> for itself. Your description of path management (further down)
does
> nothing to eliminate the possibility that your path configuration
has
> enabled this to happen. See later comments.
>
> >The Database was created with FB's isql.
> >
> > >From gstat -h:
> >Attributes force write
> >
> >It has 4 tables and has no Stored Procedures.
> >Two generators are used for insertion of unique row ID in two
tables.
> >5 indices - 2 on varchar fields, 3 on int fields.
>
> Describe the sizes and segments of the indices.
>
>
> >The application uses FB's ODBC driver to connect (via ADO).
>
> Version?
>
> >The DB path is kept in registry and is relative to the app location
> >(e.g. no connection with different paths can be made).
>
> Firebird isn't a file-served database: you do not connect to
databases by
> way of relative paths.
> -- What does your application do with the path it extracts from the
> Registry, to make it a legal path for the server?
> -- What connection protocol is specified in your DSN?
> -- What *exactly* does your connection string look like when the
connection
> is made?
>
> >DB is local (not remote, shared, mapped, etc. drive)
>
> But it's being accessed via a relative path, right? What is the
network
> protocol (if you didn't answer this earlier)?
>
>
> >The application migrates external data (from DB2 and text files) to
> >FB. Each record needs processing and updating by a human. A temp
> >table is used to store modified data,
>
> Describe what you mean by "temp table".
> -- How is the temp table created?
> -- When is it created?
>
> What is the human doing? i.e. is this an interactive application?
or is
> the human driving a client app that consumes a lot of resource,
e.g. loops
> through a dataset of 50,000 rows? repaints a row counter at each
iteration?
>
> >which after being validated by
> >the application logic is inserted into FB. Simple inserts
statements.
>
> Give an exact sample of the SQL that performs the inserts. Also
describe
> how you are handling transactions.
>
> >When the corruption occured, the DB was updated with ~50K rows,
> >growing up to 234 Mb on disk.
>
> What batch sizes are the commits of these ~50K rows broken up
into? (By
> common practice, it should be around 8K rows per transaction; less
if the
> rows are large; possibly more if the rows are small).
>
> >The application was closed properly (I can tell this from internal
> >logs), including proper connection closing (sure about this!).
>
> By "internal logs", do you mean firebird.log? (interbase.log if you
are
> using Fb 1.0). Or an application log that merely records what the
> application attempted to do?
>
>
> >I then tried to connect again from a small viewer application and
got
> >the error. Tried with isql - same.
>
> Using exactly what connection string? Report the exact string that
you
> used in the viewer and with isql. I mean - EVERYTHING - including
the host
> name.
>
> >gfix runs with -v -f -i, but
> >doesn't fix the Database.
> >
> >Running IBFirstAid, I found that a page is beyond the end of file,
> >i.e. some data was not written to the disk.
> >
> >Looked in Windows log for some problems - nothing.
> >
> >Now, something I mentioned in the first post:
> >There were 4 huge files (~500 Mb) in the Winnt/Temp dirctory,
called
> >fb_sort_xxx
>
> Check the times on these files. As mentioned before, if the
connection is
> supposedly closed, then these files should be gone. If they
persist, then
> EITHER the server crashed (look at the firebird log) OR the
connection is
> still live and the commit is incomplete.
>
>
> >All my classes have error handling, and if an error is returned by
> >the DB, it is written to a log. But there was no error at all while
> >inserting data.
> >
> >What I did until now was to use the data in temp table and repeat
the
> >process on a new DB and all went well. It is almost the same size
> >(few bytes difference) from the corrupted one.
> >
> >I don't know what other details I can provide here...
> >I am sure that in the process of inserting data there was no error
> >and no other connection to the DB.
>
> However, if the connection was live and a transaction was
uncommitted when
> you used your viewer, isql or gfix to connect to the database using
a
> compromised path, then the compromised connection would see a
corrupt
> database.
>
> There are still some essential facts missing here. Please review
this
> posting carefully and provide the requested details.
>
>
> >The Windows host has no backup software at all, there were no other
> >(except the usual) processes running. I checked for viruses with
> >Norton and AVG - nothing.
>
> Set virises aside for now. Let's address the connection issues as
> precisely as possible.
>
> ./hb