Subject RE: [firebird-support] Searching on BLOB data - case sensitive
Author Helen Borrie
Original question:

> >If I submit a SQL statement like this
> >
> >select
> > description
> >from
> > items
> >where
> > description like '%computer%'
> >
> >Where the field "description" is a BLOB field,
> >I find that the search is case sensitive.
> >How can I construct a statement to be
> >case insensitive. Do I set something in the
> >FB engine or construct a different SQL
> >statement?

At 09:08 AM 3/06/2005 +1000, Kevin Day wrote:


>A sure fire way is
> select
> description
> from
> items
> where
> upper(description) containing 'COMPUTER'
>
>or
>
> select
> description
> from
> items
> where
> upper(description) like '%COMPUTER%'

But not when searching a blob! Since you can't uppercase a blob,
UPPER(description) is a non-starter.

As to scanning (comments by others) neither CONTAINING nor LIKE searches,
even on indexable columns, can use an index, so a table scan is
inevitable. However an expression using LIKE 'something%' will resolve
internally to STARTING WITH 'something', which does use an index, if one is
available. Natch, *blob columns* are non-indexable.

Any CONTAINING search on a blob column has to scan the blob until it finds
the search string. This will be fast or slow to the degree to which the
searched strings are buried in the blob's text. Two (of several) ways you
can speed up a search on a text blob are

1) Have the client program extract keywords and prepend them to the top of
the blob text before you write the blob to the database - remembering that
"updating" a blob actually replaces the old one entirely, so keywords
stored this way won't survive an update
or
2) Maintaining a column (or a table) of keywords. Separate, intersecting
table needed if you need to store multiple keywords per blob; and should
be set up as a many:many relationship if the same keyword can appear in
multiple blobs.

There are other strategies, too, all involving interaction between database
and client application. SQL on its own is pretty limited.

./hb