Subject backup, restore & db size (2)
Author Claudio Valderrama C.
Following on the issue with the strange conversion error that Guido reported
(that has nothing of strange, BTW), I clicked accidentally in the list of
sent posts and found that my letter was chopped, although the signature is
present. I should have dumped less lines than needed from the text file
where I was writing the case!

«More interesting is that I went to another copy of the 8 KB gdb and issued:
alter index i_saname_a_u inactive;
alter index i_saname_a_u active;

Engine answer: crash with memory error, go to home, good bye.

IB 6.0.0.627 could complete the task.
»

These were the last lines. Hmm, where's my explanation. In no place, too
pathetic. So, I will take advantage and put it in much more detail.

First, the command
select S.SAID from SALESAREAS S,DOS_CONSTANTS D where D.NAME=S.SANAME
and D.SHORTCUT=80
is tricky because the SHORTCUT column is a character string. Viewing Guido's
db, that column has every kind of garbage: entries that resemble dates, free
text and a few numbers. Try to convert them to numbers and see what happens.
:-)

Here's an excerpt of the column:
E-@
MOR
WOS
01
0/0/0
1-1000
Bertelsmann

The "conversion error" msg as absolutely nothing to do with page sizes. What
happens is that the engine makes no promises on the order of evaluation in
AND/OR conditions so any side can be taken first and whatever is taken, it's
an implementation detail. Different from the evaluation rules held in
typical programming languages. SQL was meant to be dumb. So, once you backup
and restore, the physical order of fields in the db may change. In the
statement above, the correct command is
and D.SHORTCUT = '80' instead of 80 as number. The precendence rules try to
convert anything to number first when comparing operands.
Basically, after backup, the records happened in an order that allowed the
D.NAME=S.SANAME
condition to exclude by index anything that could cause an error in the
second condition. Before people ask why an index would pick some entries and
not the same entries the second time, the answer is in the backup: even if
you restore again to a 4KB db, you will see the bug. When you restore, index
selectivity is recalculated. And observe the difference:

select S.SAID from SALESAREAS S,DOS_CONSTANTS D where D.NAME=S.SANAME
and D.SHORTCUT=80

4K db:
PLAN JOIN (S NATURAL,D INDEX (I_DOS_CONSTANTS2))
=> this is on DOS_CONSTANTS.NAME

8 KB db:
PLAN JOIN (D NATURAL,S INDEX (I_SANAME_A_U))
=> this is on SALESAREA.SANAME

In the second case, the index I_SANAME_A_U is on the field that has
non-numeric entries, so unlike the old db, those entries aren't skipped.
Then they are evaluated for the condition that cannot be satisfied when an
index, SHORTCUT cannot be converted to number, hence "conversion error" when
comparing with 80. For some reason, there's a difference in the dbs Guido
sent me that determine the plan change:

Database: H:\proy\guido\PAGESIZETEST4k.GDB
SQL> select rdb$statistics from rdb$indices where
rdb$index_name='I_DOS_CONSTANTS2';
0.001811594236642122

SQL> select rdb$statistics from rdb$indices where
rdb$index_name='I_SANAME_A_U';
0.1250000000000000

Database: H:\proy\guido\PAGESIZETEST8k.GDB
SQL> select rdb$statistics from rdb$indices where
rdb$index_name='I_DOS_CONSTANTS2';
0.0000000000000000

SQL> select rdb$statistics from rdb$indices where
rdb$index_name='I_SANAME_A_U';
0.0000000000000000

After this
SQL> alter index I_dos_constants2 inactive;
SQL> alter index I_dos_constants2 active;
SQL> alter index I_saname_a_u inactive;
SQL> alter index I_saname_a_u active;
and a commit in the 8KB database, we get:

SQL> select rdb$statistics from rdb$indices where
rdb$index_name='I_DOS_CONSTANTS2';
0.001811594236642122

SQL> select rdb$statistics from rdb$indices where
rdb$index_name='I_SANAME_A_U';
0.1250000000000000
that's exactly what we got in the 4 KB databases. But the engine still uses
the same plan than before for the 8 KB db:

SQL> select S.SAID from SALESAREAS S,DOS_CONSTANTS D where D.NAME=S.SANAME
CON> and D.SHORTCUT=80;

PLAN JOIN (D NATURAL,S INDEX (I_SANAME_A_U))

SAID
=======
Statement failed, SQLCODE = -413
conversion error from string ""

Let's force the plan, then:

SQL> select S.SAID from SALESAREAS S,DOS_CONSTANTS D where D.NAME=S.SANAME
CON> and D.SHORTCUT=80
CON> plan join(s natural, d index(i_dos_constants2));

PLAN JOIN (S NATURAL,D INDEX (I_DOS_CONSTANTS2))

SAID
=======
6

That's exactly the same plan and result obtained with the 4 KB database. I
can't overstate that there's nothing wrong in the engine.

The correct command is:
select S.SAID from SALESAREAS S,DOS_CONSTANTS D where D.NAME=S.SANAME
and D.SHORTCUT='80' <<<<<<<< string, not int.

The problem is not in the engine, but due to sloppy SQL programming (sorry,
Guido, but you should adjust your tests to the data type of the column for
consistent results). The page size has nothing to do per itself, other than
affecting the index distribution and other factors that the optimizer uses
to decide among alternative execution plans. But the page size increse is
not causing a conversion error. Data has not been mangled in any way in the
new 8K db.

For the people curious to know why the engine crashed and threw messages
during backup in my original message, I wanted to say the reason at the
tail, but I failed to put it: I wanted to demonstrate the dangers of using
an incompatible collations library. It happens because earlier IB6 &
Firebird used an incompatible hook with gdsintl that was fixed in newer
versions.

C.
--
Claudio Valderrama C. - http://www.cvalde.com - http://www.firebirdSql.org
Independent developer
Owner of the Interbase® WebRing