Subject RE: [firebird-support] rtf-to-plaintext udf?
Author Alan McDonald
> Hello,
>
> I have several blob fields in my database that contain RTF text.
> To parse the contents and split it into words I am looking for a method
> to extract the plain text from the blobs.
> For this I think I have to write a UDF (with Delphi).
> I asked about this in a Delphi forum and got the proposal to use the
> TRichEdit component (that is a Wrapper around Windows' RichEdit control).
> I am not happy about this because I think it is too much overhead to
> create an instance of such a component in a UDF, because it can happen
> very often (for each record of course).
> But since it was the only idea, I tried it, with the result that the
> Firebird server crashes (not just occasionally but every time the UDF is
> executed).
>
> The code used to get the blob data into the richedit is:
>
> if (not Assigned(aBlob))
> or ( aBlob^.TotalSize = 0)
> then exit;
> len := aBlob^.TotalSize + 1;
> rtf := TRichEdit.Create(nil); //rtf is TRichEdit
> buffer := StrAlloc(len);
> str := TStringStream.Create('');
> try
> aBlob^.GetSegment(aBlob^.BlobHandle, buffer, len, bytesRead);
> rtf.PlainText := false;
> rtf.text := buffer;
>
> Then I want to switch to plaintext and write it to a stream and then to
> result:
>
> rtf.PlainText := true;
> rtf.Lines.SaveToStream(str);
> result := ib_util_malloc(length(rtf.text) + 1);
> ZeroMemory(result, length(rtf.text) + 1);
> result := resultString(PChar(str.dataString), str.Size + 1);
>
> finally
> StrDispose(buffer);
> rtf.Free;
> str.Free;
> end;
>
> It seems that the line "rtf.Plaintext := true" causes the server to crash.
>
> Can anybody tell me
> a) whether there is a problem with this approach,
> b) whether the problem arises just from wrong handling (coding mistakes),
> c) whether there is a totally different approach to get the plain text out
>    of an RTF blob?
>
> Thank you very much
> Urs

I do something similar, but I do it on the client side. I wouldn't do this at
the back end at all: it sounds like a time-waster (I mean a task that takes
too long for a UDF to perform acceptably), and catching and dealing with
exceptions is also better handled on the client side.
I have a rich-text binary blob field and a plain-text blob field. At the
client, in the onPost event, I convert the text and assign it to the
plain-text field without the user knowing. This helps me with report writers
that don't handle the native RVF format of TRichView.
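
A minimal sketch of that client-side step, under stated assumptions: it uses the standard TRichEdit rather than TRichView, the field names NOTES_RTF and NOTES_PLAIN and the helper name CopyRtfToPlainText are hypothetical, and the editor is assumed to be a control that already sits on a form (so it has a parent window). It only illustrates the idea of filling the plain-text field just before the record is posted.

```delphi
uses
  Classes, DB, ComCtrls; // System.Classes, Data.DB, Vcl.ComCtrls in newer Delphi versions

// Hypothetical helper: loads the RTF blob into an existing rich edit control
// and stores the unformatted text in a second field of the same record.
procedure CopyRtfToPlainText(DataSet: TDataSet; Editor: TRichEdit);
var
  RtfStream: TStream;
begin
  // NOTES_RTF and NOTES_PLAIN are assumed field names.
  RtfStream := DataSet.CreateBlobStream(DataSet.FieldByName('NOTES_RTF'), bmRead);
  try
    Editor.PlainText := False;              // interpret the stream as RTF
    Editor.Lines.LoadFromStream(RtfStream);
  finally
    RtfStream.Free;
  end;
  // Text returns the content without RTF markup.
  DataSet.FieldByName('NOTES_PLAIN').AsString := Editor.Text;
end;
```

One way to use it would be to call CopyRtfToPlainText(DataSet, RichEdit1) from the dataset's BeforePost handler, while the record is still in edit or insert state, so both fields are saved together. If the user edits the text in that same rich edit control, the load step can be skipped and Editor.Text assigned directly.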
Alan