Subject [firebird-support] Re: case insensitive "where" clause
Author Svein Erling Tysvaer
STARTING can use an index, CONTAINING cannot, and LIKE cannot unless it
is a constant value not beginning with a wildcard.

"=" and BETWEEN can use an index, ">" can use an ASCENDING index and "<"
can use a DESCENDING index (I think, I'm not 100% certain about
descending indexes - they weren't very good when Firebird was at version
1.0).

The reason why CONTAINING and LIKE often cannot use an index, is because
values of fields in the index is stored from the first character
onwards (I've only used CONTAINING with CHAR, VARCHAR and BLOB SUBTYPE
TEXT, to me it doesn't sense to use it with numbers, dates or similar),
and if you do not know the first character(s), you cannot know where to
look in the index.

I think you're right in your assumptions regarding

where upper(Manufacturer) = 'CONSOLIDATED'

not using an index. Though, theoretically, if the field is defined with
a case insensitive collation, then the optimizer could know that
upper(Manufacturer) would be identical to Manufacturer and use the
defined index. Likewise, if you defined an expression index for
upper(Manufacturer), then the above WHERE clause could use an index.

Likewise, if

where Manufacturer containing 'consolidated'

are used and the field is defined as (VAR)CHAR(12), then the optimizer
could theoretically know that the above clause are identical to using
'='. If the field is defined as anything else, no index can be used
because the optimizer cannot know what the first letter(s) of the field are.

I seriously doubt the two theoretical possibilities are or will be
implemented - mainly because using UPPER or CONTAINING when they are not
the best choice is an active choice of the developer. That choice may
either be due to simply not being careful enough when writing SQL or an
active choice to prevent certain indexes from being used (similar to
using ||'' or +0). In my opinion, it is not Firebirds responsibility to
hide developer carelessness.

By the way, no-one has mentioned any 'case insensitive character set',
we have talked about case insensitive collations. A character set can
have many possible collations, some case insensitive, some case
sensitive. Not using any character set at all can have undesired effects
- I've always used ISO8859_1 and have no knowledge of what undesired
effects I could have experienced if I didn't specify the character set.
Maybe others will jump in and explain.

If you use CHARACTER SET ISO8859_1, then the field may either be defined
with a case insensitive collation or you can specify it in your select, e.g.

where Manufacturer collate PT_BR = 'CONSOLIDATED'

However, I don't think this will be able to use any index unless you
define an expression index (though I haven't checked), and the only case
I've experienced where I've tried to do something similar, was when I
had a field that contained Norwegian characters but that accidentally
had been defined without COLLATE NO_NO. Only when using WHERE
UPPER(MANUFACTURER COLLATE NO_NO) were the 3 special Norwegian
characters uppercased the same way as the 26 English (translating æøå to
ÆØÅ).

Set

bwc3068 wrote:
> Hi Everyone--
>
> thanks for the responses. i really appreciate it!
>
> i was unaware of "containing" and it sounds like that might work for me.
>
> it makes sense "like" "starting" and "containing" do not use indexes.
>
> and just to make sure I understand....
>
> not all my queries are built with "like"=20
>
> i do have many that use indexes and are done with "=3D" or ">" and what not=
> . these use indexes if there is one on the field automatically (correct?).
>
> and if I understand this thread properly...if I:
>
> where
> Manufacturer =3D 'Consolidated'
>
> it will use the index build on the Manufacturer field automatically.
>
> BUT if I use this to make it case insensitive then it does NOT use the inde=
> x
>
> where
> upper(Manufacturer) =3D 'CONSOLIDATED'
>
> and if i use this, it too is case insensitive and does not use the index (b=
> ut the where clause is slightly different)
>
> where
> Manufacturer containing 'consolidated'
>
> it seems like using the case insensitive character set is not a solution to=
> me because the tables are all already built.
>
> thanks again!
> kelly
>
>
> --- In firebird-support@yahoogroups.com, Svein Erling Tysv=E6r <svein.erlin=
> g.tysvaer@...> wrote:
>> Hi, I'm mostly repeating what others have answered before, just trying to=
> emphasize and put things together in one email:
>> =20
>> a) If you use an uppercase collation or an expression index, then an uppe=
> rcased search can be indexed.
>> =20
>> b) LIKE does not use an index.
>> The only case where LIKE can use an index, is where you use LIKE <constan=
> t> and <constant> starts with a fixed value. If you use a parameter rather =
> than a constant, or the constant starts with a wildcard (like your examples=
> do), then no index can be used.
>> =20
>> So your two examples will both be slow when you have lots (millions) of r=
> ecords in your table. An alternative to using "where upper(field) like '%UP=
> PERCASE LOOK FOR%'" is "where field containing 'LOOK FOR'". Unlike LIKE, CO=
> NTAINING is case insensitive. However, it still cannot use any index.
>> =20
>> Set