Subject Re: [firebird-support] Re: about searching in a blob
Author Yves Glodt
On Tuesday 29 July 2003 14:05, Ivan Prenosil wrote:
> > > "containing" is case insensitive, but possibly not quite perfect
> > > with not-lower-ANSI characters.
> >
> > I'll try it out tomorrow, lets see what happens, I have many
> > "öäüéàèëê"
>
> UPPER() and CONTAINING need to know collation (other than
> default/binary one) if they should work correctly with accented
> characters.
> Because it is not possible to specify collation for blob column,
> (even not by direct updating system tables)
> CONTAINING will be case insensitive for basic a-z,A-Z letters,
> but case sensitive for national/accented ones.
> Other potential problem is that by default blobs are stored as
> segmented, and CONTAINING will not find string if it is split by
> segment boundaries.
>
> For small volume of data I would recommend to write UDF that will
> return 1/0 if the pattern is found/not found, and that handle all
> case-insensitive (or even accent-insensitive) and segment boundaries
> problems inside. It could also be designed to distinguish word
> boundaries,
> and so will not find "chalice" if what you want is "alice".
>
> e.g. SELECT * FROM address WHERE 1=SearchInBlob (address, 'glodt');
>
> For higher volume of data full-text searching is ideal, of course,
> i.e. cut blobs into words, store them in separate table/index, etc...

for now I'll stay with CONTAINING which works very well and faster than
the UPPER playing. all my éàèöäü are lowercase for now.

thank you all for your replies,
Yves


> Ivan Prenosil
> Ivan.Prenosil@...
> http://www.volny.cz/iprenosil/interbase
>
> ----- Original Message -----
> From: "Yves Glodt" <yg@...>
>
> > On Monday 28 July 2003 22:13, Aage Johansen wrote:
> > > On Mon, 28 Jul 2003 19:05:45 +0000 (UTC), Yves Glodt <yg@...>
wrote:
> > > > I have this table in my php knowledgebase system:
> > > > CREATE TABLE "ADDRESS" (
> > > > "CATEGORY" INTEGER NOT NULL,
> > > > "NAME" VARCHAR(100) CHARACTER SET ISO8859_1 NOT NULL,
> > > > "KEYINDEX" INTEGER NOT NULL,
> > > > "ADDRESS" BLOB SUB_TYPE TEXT SEGMENT SIZE 100 CHARACTER
> > > > SET ISO8859_1,
> > > > "DATE_MODIFIED" VARCHAR(25) CHARACTER SET ISO8859_1,
> > > > "HOST_MODIFIED" VARCHAR(25) CHARACTER SET ISO8859_1,
> > > > "USER_MODIFIED" VARCHAR(100) CHARACTER SET ISO8859_1
> > > > );
> > > >
> > > > and I have a search functionality which uses this query:
> > > >
> > > > $sth = ibase_query("SELECT
> > > > KEYINDEX,NAME,CATEGORY,DATE_MODIFIED FROM ADDRESS WHERE
> > > > UPPER(NAME) LIKE
> > > > '%".strtoupper($_POST['crit'])."%' OR ADDRESS LIKE
> > > > '%".$_POST['crit']."%' ORDER BY 2 DESC");
> > > >
> > > > I know the UPPER/strtoupper stuff for the varchar column is
> > > > ugly but it works. Well for the blob it doesn't, I cannot get
> > > > (or fake...) case- insensitivity with UPPER (I get string
> > > > conversion errors)... How could this be done? I use
> > > > FirebirdSS-1.0.2.908-0 on RH9
> > >
> > > You could try
> > > ...
> > > ADDRESS containing 'your string here'
> > > order by 2 desc
> > >
> > > "containing" is case insensitive, but possibly not quite perfect
> > > with not-lower-ANSI characters.
> >
> > I'll try it out tomorrow, lets see what happens, I have many
> > "öäüéàèëê"
> >
> > thank you,
> > Yves
> >
> > > --
> > > Aage J.
>
> ------------------------ Yahoo! Groups Sponsor
> ---------------------~--> Free shipping on all inkjet cartridge &
> refill kit orders to US & Canada. Low prices up to 80% off. We have
> your brand: HP, Epson, Lexmark & more.
> http://www.c1tracking.com/l.asp?cid=5510
> http://us.click.yahoo.com/GHXcIA/n.WGAA/ySSFAA/67folB/TM
> ---------------------------------------------------------------------
>~->
>
> To unsubscribe from this group, send an email to:
> firebird-support-unsubscribe@yahoogroups.com
>
>
>
> Your use of Yahoo! Groups is subject to
> http://docs.yahoo.com/info/terms/

--
Linux 2.4.21-0.13mdk #1 Fri Mar 14 15:08:06 EST 2003 i686
19:34:29 up 14 min, 1 user, load average: 0.69, 0.27, 0.14