Subject Re: Should updates be this slow?
Author inoffensive_2006
--- In firebird-support@yahoogroups.com, "Alan McDonald" <alan@...> wrote:
>
> > > > Hi Folks:
> > > >
> > > > I'm running Superserver 2.0.1 in Windows XP Media on a
> > > > low end AMD X2, a 3800+, and a 100GB IDE drive. My app is
> > > > running on the system.
> > > >
> > > > This isn't the system I'd choose for the best performance,
> > > > but it's probably representative of what my customers will
> > > > have.
> > > >
> > > > The contact table, describing persons and organizations,
> > > > is a fat thing, but a SELECT of the database key and a
> > > > varchar large enough to hold a person's first and last name
> > > > will plow through at a little over 7000 records a second, with
> > > > overhead for my C++ code to allocate a class instance and put
> > > > its address in a STL vector.
> > > >
> > > > The vector is sorted and an index is calculated for each
> > > > record. The UPDATE sets a new value for a BIGINT index
> > > > in the contact table.
> > > >
> > > > This operation is grinding along at about 330 records per
> > > > second.
> > > >
> > > > I've altered the index for these records, disabling it
> > > > during the updates and re-enabling it afterward.
> > > >
> > > > Does this sound like the performance I should expect, or
> > > > am I doing something wrong?
> > > >
> > > > I'm ignorant of the ways of SQL and Firebird, so I'm
> > > > interested in suggestions.
> > > >
> > > > Thanks
> > > > Larry
> > >
> > > What, precisely, are you trying to do here?
> > > You're retrieving all records in a table, calculating a field on the
> > > client side and updating that field value as you move through the
> > > records (or after retrieving all records)?
> > > You are somehow using this calculated value as an index of some kind
> > > relevant to your application and/or business model? (STL vector?
> > > What's that?)
> > > Sorry, I'm lost as to what this is all trying to achieve, but it
> > > sounds clumsy whatever it is.
> > > Alan
> > >
> > Thanks Alan:
> >
> > It may be clumsy.
> >
> > Small picture - should an UPDATE of a BIGINT in a table
> > take 20 times longer than a SELECT on a BIGINT key and a
> > VARCHAR in the same table?
> >
> > This is the big picture.
> >
> > I'm developing a new piece of code to keep an array of
> > small records that consist of keys and the sort order of a
> > contact record in a Windows memory mapped file.
> >
> > I've decided to start working on the code that will read
> > these fields, and I want to generate the BIGINT sort order
> > index for each record to prepare the table for reading.
> >
> > The production code will be generating these indexes as
> > the records are inserted into the database, but a different
> > piece of code, that doesn't generate these indexes, is
> > building the table now.
> >
> > It takes a little more than a second to read all 8000 of
> > the records, and 24 seconds to update them.
> >
> > I'm just curious about the speed of the updates. And I'm
> > wondering if I'm doing something that makes them that slow.
> >
> > I am interested in any suggestions.
> >
> > I appreciate your swift response to my question. It's
> > almost 4:00 in the morning here, so I'll take a look at
> > this thread in a few hours.
> >
> > Thanks
> > Larry
>
> it's hard to know if you have the table set up properly. Show the
> metadata. PK declared? etc?
>
> I still can't work out where YOUR index departs from the DATABASE index.
> You can't index the record on the server?

Thanks Again Alan:

I've never tried stored procedures, but I've been reading about
PSQL in Helen's book. I'll try to code this on the server side
tonight.

> what calc are you doing on this index field?

I'm glad you asked. Every new row receives a 64-bit key
from a generator named for its table. I want to assign each
row a number indicating its alphabetical sort order. The sort
order is referred to by an ascending and a descending index.

Records are going to be added regularly, so I want an index
scheme that lets me assign a new row a sort order that often
needs to fit between the sort orders of two existing rows.
So I've defined a constant in my C++ code called MAX_INT_64 and
assigned it the maximum value of a signed 64-bit int.

The first item inserted into the database will have a sort
order of 1/2 of MAX_INT_64's value, and the next item will have a
sort order of either 1/4 or 3/4 of MAX_INT_64. And so on...

This should allow many inserts before a collision on the
sort order, though one is possible if as few as 64 pre-sorted
rows are imported in order. A collision will trigger
re-calculation of the sort order values.

Re-indexing the alpha sort order will involve assigning
new sort order values to each row. These values will be evenly
distributed from 0 to MAX_INT_64.
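The re-index step, sketched the same way (again, the function name is
mine and hypothetical):

```cpp
#include <cstdint>
#include <vector>

// Sketch of the re-index described above: give n rows evenly spaced
// sort-order values across 0..MAX_INT_64, preserving their relative order.
std::vector<std::int64_t> redistribute_sort_orders(std::size_t n)
{
    const std::int64_t kMaxInt64 = INT64_MAX;
    std::int64_t step = kMaxInt64 / static_cast<std::int64_t>(n + 1);
    std::vector<std::int64_t> orders;
    orders.reserve(n);
    for (std::size_t i = 1; i <= n; ++i)
        orders.push_back(step * static_cast<std::int64_t>(i));
    return orders;
}
```

Dividing by n + 1 leaves a gap at both ends of the range, so new rows
can still be inserted before the first and after the last.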

The sort order is mine, because I don't believe an existing
collation will give me what I want.

If there is a simpler method that will do what I want, I'm
interested.

> Updates are slower than selects, especially if the cache already has the
> records you are retrieving. It's still hard to understand the exact
> process you are using.

Having the records in cache slows updates?

> What language?

The language is C++. "STL vector" is a Standard Template
Library vector. Everything is dynamic SQL passed to Firebird
through IBPP's Execute().

> What connectivity components?

Connectivity? Superserver 2.0.1 is running on the same machine
as the app. The DatabaseFactory() call used to open the database is
passed an empty string for the server, and the database path and
name are given as "C:\....".

Does that answer your question? I'm not sure.

> What metadata do you have?

I'm not sure what the term "metadata" means.

Here is some of the CREATE TABLE SQL:

CREATE TABLE CON
(
CON_KEY BIGINT NOT NULL, /* 64 Bit */
...
CON_ALPHA_SORT_ORDER BIGINT NOT NULL, /* This is the index */
/* I'm building. */
...
CON_SORT_NAME VARCHAR(1024) NOT NULL, /* Encrypted last, */
/* first, M.I. */
...
PRIMARY KEY
(
CON_KEY
)
);

The contact table also has the following indexes defined.

CREATE ASCENDING INDEX
ASCENDING_ALPHA_INDEX
ON
CON (CON_ALPHA_SORT_ORDER);

CREATE DESCENDING INDEX
DESCENDING_ALPHA_INDEX
ON
CON (CON_ALPHA_SORT_ORDER);

I know you are trying to help me, and I hope this
answers your questions.

All of the times I've presented in this thread were
calculated from runs after the index had been altered to
inactive.

I've just added "NOT NULL" to CON_ALPHA_SORT_ORDER and
CON_SORT_NAME, and re-built the database with the old
code that now puts "0" in the CON_ALPHA_SORT_ORDER column.

That improved the updates to 1540 per second, almost
5 times faster. Calculating and updating the index for
8000 records now takes 5.3 seconds.

That's a big improvement. Updating CON_ALPHA_SORT_ORDER
now takes about 5 times as long as the SELECT on the key and
sort name.

Is this approaching the performance you would expect from
a client-side dynamic SQL update of a single value? Or is
there room for improvement?

I'll try my first stored procedure to see what effect that
has.

I appreciate everybody's suggestions.

Thanks
Larry