firebird-architect - Re: Additional index kinds like R*Tree for Firebird

Subject	Re: Additional index kinds like R*Tree for Firebird
Author	stephane
Post date	2011-01-14T09:18:09Z

For anyone that could be interested, i create a demo projet with the data (it's take me one week to fullfill the database with millions of rows)

the demo project and the database with plenty of records (10000000) are here : https://sourceforge.net/projects/alcinoe/files/alsqlbenchmark/1.01/

the main point i want you to see is the difference between Firebird multi index vs SqlIte3 Rtree

in firebird

CREATE TABLE HASH(
ID INTEGER NOT NULL,
x1_y1 SMALLINT NOT NULL,
x1_y2 SMALLINT NOT NULL,
x1_y3 SMALLINT NOT NULL,
x1_y4 SMALLINT NOT NULL,
x1_y5 SMALLINT NOT NULL,
PRIMARY KEY (ID)
);
CREATE INDEX HASH_X1_Y1_IDX ON HASH (X1_Y1);
CREATE INDEX HASH_X1_Y2_IDX ON HASH (X1_Y2);
CREATE INDEX HASH_X1_Y3_IDX ON HASH (X1_Y3);
CREATE INDEX HASH_X1_Y4_IDX ON HASH (X1_Y4);
CREATE INDEX HASH_X1_Y5_IDX ON HASH (X1_Y5);

Select
ID
from
HASH
where
x1_y1 >= xxx and
x1_y1 <= xxx + 20 and
x1_y2 >= yyy and
x1_y2 <= yyy + 20 and
x1_y3 >= zzz and
x1_y3 <= zzz + 20 and
x1_y4 >= www and
x1_y4 <= www + 20 and
x1_y5 >= uuu and
x1_y5 <= uuu + 20;

take around 180 ms with 10 000 000 rows table (you can try it by yourself with the demo data)

the same query under the SQLite3 with same amount of data (10 000 000) but with the Rtree index, it's take around 5 ms ! (you can also try it by yourself with the demo data)

NOTE :

In Firebird, The "trick" to do

CREATE INDEX HASH_X_Y_IDX ON HASH (X1_Y1, X2_Y2, X3_Y3, X4_Y4, X5_Y5);

and

Select
ID
from
HASH
where
x1_y1 in (xxx, xxx+1, xxx+2, xxx+3, ..., xxx+20) and
x1_y2 >= yyy and
x1_y2 <= yyy + 20 and
x1_y3 >= zzz and
x1_y3 <= zzz + 20 and
x1_y4 >= www and
x1_y4 <= www + 20 and
x1_y5 >= uuu and
x1_y5 <= uuu + 20;

make the time of the select go down from 150 ms to 20 ms !!!!

i m curious to know if their is any other "tricks" like this (or any way to improuve it) to optimise my query ...

also to finish, if you study it, i want also you know that sqlite use also a "trick" for is Rtree Index, because in fact the rtree index of sqlite3 if simple 3 more table for each table :

original Table is Hash;

sqllite simply create these 3 more table :

table Hash_node (Nodeno, data);
table Hash_rowid (rowid, nodeno);
table Hash_parent (Nodeno, parentnode);

the code source can be found here :
http://www.sqlite.org/cvstrac/fileview?f=sqlite/ext/rtree/rtree.c&v=1.14

Starting to simply implemente this easy way of rtree in Firebird can be a very good value for firebird !

regards
Stéphane

--- In Firebird-Architect@yahoogroups.com, "chris.waldmann" <Christian.Waldmann@...> wrote:
>
> From time to time there are request for support of Geographic Information System (GIS) in firebird. Supporting GIS is based on additional index kinds, additional data types and additional functions for spatial calculations.
> In this topic, I want only discuss the first step, additional index kinds.
>
> Goal and usage:
> R-Tree-Indexes are known to be efficient as a spatial access method (SAM) for searching multi dimensional objects like countries and roads on a map or volumes in a 3D CAD system.
> The optimized R-Tree known as R*Tree was also tested for indexing points. Also as point access method (PAM) the R*Tree is very efficient.
> Range queries on search platforms are multi dimensional point searches and can profit from R*Trees. GPS data is very common to day.
>
> From this point of view, every modern and future oriented database systems should include at least a R*Tree index system!
>
> Implementation Scenarios:
>
> Scenario SQL-Script: In SQLite, the R*Tree is implemented with additional tables, based on the existing B+tree-Index-system. In Firebird, additional tables, stored procedures and triggers can be used. A script generator would be useful to customize the implementation.
>
> Scenario hardcoded: only the R*Tree is implemented in parallel with B+Tree.
>
> Scenario GiST: when using the Generalized Search Tree (GiST), the storage of the index is hard coded, and the kind of index is implemented with 4 callback functions. Predefined functions for R*Tree, hash, full-text... can be supplied, but every user can add his index kind.
>
> Integration in SQL
> The syntax to indicate the index kind has to be defined, e.g. like "create MYINDEX on field1 ... using RTREE".
> How can the optimizer choose the best index?
>
> Feel free to comment on single subjects, and to think about sponsoring :-)
>
> Links:
> Basics, demo and implementations: http://en.wikipedia.org/wiki/R*_tree
> SQLite: http://www.sqlite.org/rtree.html
> Theory: http://epub.ub.uni-muenchen.de/4256/1/31.pdf
> GiST: http://gist.cs.berkeley.edu/
>