Subject Re: Picture Servlet Problems possibly OT
Author Roman Rokytskyy
William,

> > Blobs are fetched from the database by blocks, not segments.
>
> Then I guess I am confused as to the segment size of blobs in the
> db?

In the good old days the idea was to introduce some kind of "natural"
block size for a given kind of content. If I remember correctly, Ann or
Paul B. said that the idea was to add a new line automatically in text
fields (that is why you see a segment size of 80 in the documentation:
the length of one line of ASCII text on the screen). As I understand it,
segment size influences the way a blob is stored on the database page.
It is also used in blob filters. But I may be confusing something. In
any case, the remote protocol requires us to specify the number of bytes
we want to fetch from the current position, not the number of the
segment to fetch.
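
To illustrate fetching "by size from the current position", here is a
minimal Java sketch that reads a stream in fixed-size blocks. The class
and method names (BlobBlockReader, readInBlocks) are hypothetical and not
part of Jaybird; with JDBC, the stream would typically come from
ResultSet.getBinaryStream().

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not Jaybird code: reads a blob stream block by block.
final class BlobBlockReader {

    // Reads the stream to its end, requesting at most blockSize bytes per call.
    // Each read continues from the current position; no segment number is involved.
    static byte[] readInBlocks(InputStream in, int blockSize) throws IOException {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        byte[] block = new byte[blockSize];
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int n;
        // read() may return fewer bytes than requested; -1 marks end of stream.
        while ((n = in.read(block, 0, blockSize)) != -1) {
            out.write(block, 0, n);
        }
        return out.toByteArray();
    }
}
```

The block size here is independent of the segment size the blob was
stored with.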

Also, the engine supports stream blobs (in fact, arrays are stored as
stream blobs), and I plan to switch from segmented blobs to stream
blobs (at least to provide length() and seek() methods on blobs).

> Ok, so that value can be greater or smaller than the "storage"
> segment size on disk. What the value specifies.

The size of a block that is transmitted over the network. It should be
balanced with the socket buffer size.

> I hope to one day be able to contribute on the Java side. I would
> like to in the next month or two look into making a J2ME version of
> the driver. Although I am not sure I am ready for that. I need to
> look at the code behind Jaybird and get an understanding of how it
> talks to the db. Which I assume is not directly via the API?

It talks via the wire protocol, which is not documented :). The best
reference for the protocol is the source code of the engine itself, but
in fact you do not need much knowledge there, since most of the
protocol was implemented by Alejandro Alberola. There are a few
unimplemented methods, and those would require digging into the C code.

> Good to know. OT Any rules of thumb to use when working at the API
> level not using JayBird. I mean when using Java I can let the driver
> deal with that, but what about with my C++ wrapper.

No clue. gds32.dll (or the .so) hides the remote protocol from you, so
you can hardly influence anything there (at least at the level we are
discussing here).

> Should I use a default buffer size? Or base it on the value of the
> on disk segment size? Or is there a formula to determine the
> preferred buffer size based on the on disk segment size?

Try using the default buffer size. In my experience it should be a
power of 2 (for best performance). Segment size should be selected so
that database pages are filled as well as possible (the fewer db pages
you have to fetch from disk, the better the performance). I do not know
of any formula for this. In my experience, segment size does not
influence performance as much as blob buffer size does. In any case,
blob buffer size is closely related to socket buffer size and, I
suspect, to the size of a TCP packet (usually 1500 bytes, but on DSL
links usually 1480).
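
A small Java sketch of the "power of 2" rule of thumb follows. The class
BlobBufferSettings and its methods are hypothetical. The property names
"blobBufferSize" and "socketBufferSize" are an assumption based on the
Jaybird connection properties as I understand them, so check them
against the documentation of your driver version before relying on them.

```java
import java.util.Properties;

// Hypothetical helper, not part of Jaybird.
final class BlobBufferSettings {

    // Rounds a requested size up to the nearest power of two (1500 becomes 2048).
    static int roundUpToPowerOfTwo(int requested) {
        if (requested <= 0 || requested > (1 << 30)) {
            throw new IllegalArgumentException("size out of range: " + requested);
        }
        int highest = Integer.highestOneBit(requested);
        return highest == requested ? requested : highest << 1;
    }

    // Builds connection properties. The property names are assumptions
    // (see the text above); verify them for your driver version.
    static Properties connectionProperties(int blobBufferSize, int socketBufferSize) {
        Properties props = new Properties();
        props.setProperty("blobBufferSize",
                String.valueOf(roundUpToPowerOfTwo(blobBufferSize)));
        props.setProperty("socketBufferSize", String.valueOf(socketBufferSize));
        return props;
    }
}
```

The resulting Properties object could be passed, together with user name
and password, to DriverManager.getConnection(url, props). Which values
work best remains a matter of trial and error.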

> Or is it a trial and error thing based on the size of what is being
> extracted?

Most likely.

> That is good to know. I already see good performance with Linux to
> Linux, and even Windows to Linux communication. I will play around
> with that as well and see what I get. I will most likely stick with
> your suggestion of 8k or 16k on Linux, and 8k on Windows.

If you get around 2.5 MByte/s on 50 MB blobs, I doubt that you will get
anything better. If your performance is significantly lower, try
playing with the socket buffer size. I was able to get an approx. 10x
improvement by correcting an "incorrect" socket buffer size.
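
To compare settings, it helps to measure throughput the same way each
time. The following is a minimal Java sketch under the assumption that
1 MByte means 1024 * 1024 bytes; the class BlobThroughput and its methods
are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical measurement helper, not part of Jaybird.
final class BlobThroughput {

    // Converts a byte count and an elapsed time into MByte/s (1 MByte = 1024 * 1024 bytes).
    static double megabytesPerSecond(long bytes, long elapsedNanos) {
        if (elapsedNanos <= 0) {
            throw new IllegalArgumentException("elapsedNanos must be positive");
        }
        double seconds = elapsedNanos / 1_000_000_000.0;
        return (bytes / (1024.0 * 1024.0)) / seconds;
    }

    // Reads the whole stream with the given buffer size and returns MByte/s.
    static double measure(InputStream in, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        long start = System.nanoTime();
        int n;
        while ((n = in.read(buffer)) != -1) {
            total += n;
        }
        // Guard against a zero interval on very small inputs.
        long elapsed = Math.max(1L, System.nanoTime() - start);
        return megabytesPerSecond(total, elapsed);
    }
}
```

For example, a 50 MB blob read in 20 seconds gives 2.5 MByte/s. Running
measure() on the same blob with different socket and blob buffer sizes
shows which combination helps.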

Best regards,
Roman Rokytskyy