Subject | Re: Converting A Blob To A String Within A SP |
---|---|
Author | martinthrelly |
Post date | 2005-11-09T05:06Z |
--- In firebird-support@yahoogroups.com, Helen Borrie <helebor@t...>
wrote:
come to mind. i have now come around full circle and i am thinking
about going one step further and porting the index and full text
search documents out to an external file directory using DotLucene.
i can then write a UDF which communicates with DotLucene. i can call
that with an sql like "SELECT id from MYTABLE where id in (select *
from dotlucene where hits = "joe" or hits = "bloggs").
something like that. this way i port my full text search module away
from the database. which is good i think because it will be better
performance and also it blatantly doesnt belong there in terms of
its structure.
the problem with having an external directory is concurrency but i
think i can perhaps manage this via a queue which is sent items via
my insert/update triggers.
thanks again for the reply
wrote:
>the
> At 02:51 AM 9/11/2005 +0000, you wrote:
>
> >hello helen. let me describe what i want to do with a simplified
> >example. sorry if this is a bit long but hopefully you will get
> >gist of what i am trying to do.to
> >
> >lets say i have a table called Clients with 3 columns as such:
> >
> >ID integer
> >Name varchar(32)
> >Notes Blob
> >
> >so i insert a row into the table like this
> >
> >ID: 1
> >Name: 'John Smith'
> >Notes: 'Keen chess player...etc'
> >
> >so now i have inserted this row i want to take each word and
> >populate a SEARCH_WORDS table in order to facilitate a full text
> >search.
> >
> >so from my insert trigger i call an SP which makes a long blob
> >consisting of all the words in my searchable fields. in this
> >case 'Name' and 'Notes. like this:
> >
> >declare variable KEYWORDS BLOB SUB_TYPE 1 SEGMENT SIZE 80;
> >keywords = :Name || ' ' || :Notes;
> >
> >so now i have a new blob containing all my keywords. now in order
> >populate my SEARCH_WORDS table i must seperate this keywords blobaccept a
> >into individual WORDS of type varchar(32). so i send this new blob
> >off to a second SP called SP_BLOB_SPLITTER.
> >
> >so i do somthing like this:
> >
> >for select WORDS from SP_BLOB_SPLITTER (:keywords) into word
> >do begin
> >insert into SEARCH_WORDS(:word); ...etc...
> >
> >in order to achieve this my SP_BLOB_SPLITTER must therefore
> >blob and somehow return a list of words within this blobdeliminated
> >by a ' '. i have achieved this with an input parameter of varcharwith
> >(32000) by iterating over each character. but how can i do this
> >a blob? how can i iterate over each character within a blobwithout
> >first making a varchar. and how can i make a varchar when the blobproblem is
> >might be too big? this is my problem.
>
> Honestly, I don't think it's your problem at all. I think your
> that you are providing a horrendously complicated insfrastructureof
> strings that you're hoping to parse effectively using just SQL.My advice
> is: forget it. Programming languages do string parsing a lot moreRender
> effectively. Relational databases do relationships very nicely.
> unto Caesar and all that jazz.a parsing
>
> Use SUBSTRING in your SP to extract strings from the blob. Write
> engine in C/C++ or Pascal that does what you want: accepts thesestring
> "chunks", parses out single words and makes sense of themas "keywords" and
> passes back candidate keywords to the SP. Export the engine asone or more
> UDFs. Then let the database engine work out what to do with thekeyword by
> testing keywords it already has and associations that it hasalready stored.
>keyword
> And this idea of storing something like "Joe Bloggs: fishing" as a
> entry isn't very scientific.you
>
> Store the documents in one table with whatever identifying data
> need. Store keywords in a primitive table with no affiliations toby an
> documents at all. Then link the documents to the keywords table
> intersection table that has foreign keys to both the document andthe
> keywords tables.do. For
>
> This is an over-simplification of what's a really complex thing to
> example, simply splitting out single words out of context doesn'tdo much
> for meaning. Complex tasks need smart languages.thanks for the reply helen. the words "square peg" and "round hole"
>
> ./heLen
>
come to mind. i have now come around full circle and i am thinking
about going one step further and porting the index and full text
search documents out to an external file directory using DotLucene.
i can then write a UDF which communicates with DotLucene. i can call
that with an sql like "SELECT id from MYTABLE where id in (select *
from dotlucene where hits = "joe" or hits = "bloggs").
something like that. this way i port my full text search module away
from the database. which is good i think because it will be better
performance and also it blatantly doesnt belong there in terms of
its structure.
the problem with having an external directory is concurrency but i
think i can perhaps manage this via a queue which is sent items via
my insert/update triggers.
thanks again for the reply