Subject | Re: [firebird-support] Re: about searching in a blob |
---|---|
Author | Ivan Prenosil |
Post date | 2003-07-29T12:05:36Z |
> > "containing" is case insensitive, but possibly not quite perfect withUPPER() and CONTAINING need to know collation (other than default/binary one)
> > not-lower-ANSI characters.
>
> I'll try it out tomorrow, lets see what happens, I have many "��������"
if they should work correctly with accented characters.
Because it is not possible to specify collation for blob column,
(even not by direct updating system tables)
CONTAINING will be case insensitive for basic a-z,A-Z letters,
but case sensitive for national/accented ones.
Other potential problem is that by default blobs are stored as segmented,
and CONTAINING will not find string if it is split by segment boundaries.
For small volume of data I would recommend to write UDF that will
return 1/0 if the pattern is found/not found, and that handle all case-insensitive
(or even accent-insensitive) and segment boundaries problems inside.
It could also be designed to distinguish word boundaries,
and so will not find "chalice" if what you want is "alice".
e.g. SELECT * FROM address WHERE 1=SearchInBlob (address, 'glodt');
For higher volume of data full-text searching is ideal, of course,
i.e. cut blobs into words, store them in separate table/index, etc...
Ivan Prenosil
Ivan.Prenosil@...
http://www.volny.cz/iprenosil/interbase
----- Original Message -----
From: "Yves Glodt" <yg@...>
> On Monday 28 July 2003 22:13, Aage Johansen wrote:
> > On Mon, 28 Jul 2003 19:05:45 +0000 (UTC), Yves Glodt <yg@...> wrote:
> > > I have this table in my php knowledgebase system:
> > > CREATE TABLE "ADDRESS" (
> > > "CATEGORY" INTEGER NOT NULL,
> > > "NAME" VARCHAR(100) CHARACTER SET ISO8859_1 NOT NULL,
> > > "KEYINDEX" INTEGER NOT NULL,
> > > "ADDRESS" BLOB SUB_TYPE TEXT SEGMENT SIZE 100 CHARACTER SET
> > > ISO8859_1,
> > > "DATE_MODIFIED" VARCHAR(25) CHARACTER SET ISO8859_1,
> > > "HOST_MODIFIED" VARCHAR(25) CHARACTER SET ISO8859_1,
> > > "USER_MODIFIED" VARCHAR(100) CHARACTER SET ISO8859_1
> > > );
> > >
> > > and I have a search functionality which uses this query:
> > >
> > > $sth = ibase_query("SELECT KEYINDEX,NAME,CATEGORY,DATE_MODIFIED
> > > FROM ADDRESS WHERE UPPER(NAME) LIKE
> > > '%".strtoupper($_POST['crit'])."%' OR ADDRESS LIKE
> > > '%".$_POST['crit']."%' ORDER BY 2 DESC");
> > >
> > > I know the UPPER/strtoupper stuff for the varchar column is ugly
> > > but it works. Well for the blob it doesn't, I cannot get (or
> > > fake...) case- insensitivity with UPPER (I get string conversion
> > > errors)... How could this be done? I use FirebirdSS-1.0.2.908-0 on
> > > RH9
> >
> > You could try
> > ...
> > ADDRESS containing 'your string here'
> > order by 2 desc
> >
> > "containing" is case insensitive, but possibly not quite perfect with
> > not-lower-ANSI characters.
>
> I'll try it out tomorrow, lets see what happens, I have many "��������"
>
> thank you,
> Yves
>
> > --
> > Aage J.