Subject Re: choosing best index
Author Alexander V.Nevsky
--- In firebird-support@yahoogroups.com, "p_lerner" <plerner@s...>
wrote:
> I know about the optimizer. But I was wondering if composite indexes
> is usefull for a multiexpression "where" clause:
>
> select * from A,B
> where A.b_id = B.b_id
> and A.b_id = value1
> and A.f2 = value2
> and b.f2 = value3
>
> how wold you index table A to optimize this query ?

It depends on many factors, main is amount of duplicates in this
columns. Let's consider extreme situation: if one of them is for
example boolean-like, and two values are represented approximately
equally in rows of table, individual index on this column will be
killer of performance, meanwhile as a segment of composite index it
can be useful. If column of this kind contain small amount of one of
the values and your queries will be always interested in rows with
this value, individual index will help, but any query which will
search another value will be terrible slow if optimizer will choice to
use index (what it often does). About composites:

> a) A.b_id,A.f2 (composite)?
> b) A.f2,A.b_id (composite)?

for this particular query - no difference or small one. For integers I
never could notice difference, for chars Ann advices to place more
duplicated first, but AFAIU this have relation to index compression
only - more compressed index require less pages to store and read. If
condition (join or where - no difference) will use only column b_id
index a) will be still useful if b_id is low duplicated, b) - not and
visa versa for column f2.
But I should warn you against optimizing to maximum every particular
query creating most suitable indices for it's execution. Firstly, many
indices will slow down data modifications. Secondly, if you have many
indices starting with the same segments, optimizer often became
confused and, at least in FB1 and earlier servers uses all of them or
even less optimal one, chances of it's confusion grows with complexity
of the query and after 7-8 tables in join we often have terrible plan
in this case. FB1.5 optimizer is MUCH more clever, but still is'nt
excellent. I made this conceptual mistake (optimized each particular
query to death creating indices) in a rather complex database too much
time ago and as a result is forced now to total usage of explicitly
planned queries. This is very annoying. Now I think much better was to
achieve average performance for each query having small amount of
indices and individual ones usually are the best candidates if there
are no many duplicates in this columns. But it's difficult to give
universal advice, but generally you should estimate your data and have
in mind every query you construct is'nt last in your program :) Create
indices to optimize more often performed ones and take less care about
seldom needed ones.

Best regards,
Alexander.