Subject RE: [firebird-support] rtf-to-plaintext udf?
Author Alan McDonald
> Hello,
> I have several blob fields in my database that contain rtf text.
> To parse the contents and split it into words I am looking for a method
> to extract the plain text from the blobs.
> For this I think I have to write a UDF (with Delphi).
> I asked about this in a Delphi forum and got the proposal to use the
> TRichEdit component (that is a Wrapper around Windows' RichEdit control).
> I am not happy about this because I think it is too much overhead to
> create an instance of such a component in a UDF, because it can happen
> very often (for each record, of course).
> But since it was the only idea I had, I tried it, with the result that the
> Firebird server crashes (not just occasionally, but every time the UDF is
> executed).
> The code used to get the blob data into the richedit is:
> if (not Assigned(aBlob))
> or ( aBlob^.TotalSize = 0)
> then exit;
> len := aBlob^.TotalSize + 1;
> rtf := TRichEdit.Create(nil); //rtf is TRichEdit
> buffer := StrAlloc(len);
> str := TStringStream.Create('');
> try
> aBlob^.GetSegment(aBlob^.BlobHandle, buffer, len, bytesRead);
> rtf.PlainText := false;
> rtf.text := buffer;
> Then I want to switch to plaintext and write it to a stream and then to
> result:
> rtf.PlainText := true;
> rtf.Lines.SaveToStream(str);
> result := ib_util_malloc(length(rtf.text) + 1);
> ZeroMemory(result, length(rtf.text) + 1);
> result := resultString(PChar(str.dataString), str.Size + 1);
> finally
> StrDispose(buffer);
> rtf.Free;
> str.Free;
> end;
> It seems that the line "rtf.Plaintext := true" causes the server to crash.
> Can anybody tell me
> a) whether there is a problem with this approach,
> b) whether the problem arises just from wrong handling (coding mistakes),
> c) whether there is a totally different approach to get the plain text out
> of an RTF blob.
> Thank you very much
> Urs

I do something similar, but I do it on the client side. I wouldn't do this at
the back end at all. It sounds like a time-waster (I mean a task that takes
too long for a UDF to perform acceptably), and catching and dealing with
exceptions is also a problem there; that is better handled on the client side.
I have a rich text binary blob field and a plain text blob field. At the
client, in the onPost event, I convert the text and assign it to the plain
text field without the user knowing. This helps me with report writers, which
don't handle the native RVF format of TRichView.
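
A minimal sketch of what such a client-side conversion could look like, not my actual code. It assumes the rich text is stored as RTF (as in the question) rather than RVF, and it uses a temporary TRichEdit instead of TRichView. The names RtfToPlainText, TForm1, NotesBeforePost, NOTE_RTF and NOTE_PLAIN are made up for illustration, and the handler is assumed to be wired to the dataset's BeforePost event and declared in the form class.

```pascal
uses
  Classes, Controls, ComCtrls, DB;

// Converts RTF markup to plain text with a temporary, invisible TRichEdit.
// AParent must be a control that has a window handle (for example the form),
// because the RichEdit control needs a parent window before it can load text.
function RtfToPlainText(const ARtf: string; AParent: TWinControl): string;
var
  Edit: TRichEdit;
  Source: TStringStream;
begin
  Edit := TRichEdit.Create(nil);
  Source := TStringStream.Create(ARtf);
  try
    Edit.Visible := False;
    Edit.Parent := AParent;
    Edit.PlainText := False;            // interpret the stream as RTF
    Edit.Lines.LoadFromStream(Source);
    Result := Edit.Lines.Text;          // text only, formatting codes dropped
  finally
    Source.Free;
    Edit.Free;
  end;
end;

// Hypothetical BeforePost handler: fills the plain text field from the
// rich text field just before the record is written. Field names are
// placeholders, not taken from any real schema.
procedure TForm1.NotesBeforePost(DataSet: TDataSet);
begin
  DataSet.FieldByName('NOTE_PLAIN').AsString :=
    RtfToPlainText(DataSet.FieldByName('NOTE_RTF').AsString, Self);
end;
```

Because this runs inside the client application, the RichEdit control has a normal form as its parent, and any exception raised during conversion can be caught and shown to the user before the post goes through.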