Subject Re: [firebird-support] FB SS 2.5 SMP - clarification
Author unordained
---------- Original Message -----------
From: Michael Ludwig <milu71@...>
> How do you measure metadata volume? I mean, when looking at taskmgr or
> ps/top/vmstat, how do you know that the memory consumption is due to
> metadata and not buffers allocated for this or that?
------- End of Original Message -------

AFAIK, you guess. I was measuring it just by knowing what the current users
(me) were doing, watching disk I/O, and timing query prepares: the initial
prepare, a repeat, a different query, a different query that's related to
something already prepared, etc., to see what would cause RAM usage to
increase (and stay increased) and what would make prepare times go down.
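
If it helps, here is roughly what the timing part looks like as a minimal
sketch, using the Python fdb driver (not my actual harness, and untested as
written; the DSN, credentials, and query are placeholders to swap for your
own):

    import time
    import fdb

    # Placeholder connection details -- use your own server/path/credentials.
    con = fdb.connect(dsn='server:/path/to/db.fdb',
                      user='SYSDBA', password='masterkey')
    cur = con.cursor()

    # Swap in a query that touches the metadata you care about (e.g. one of
    # the workflow tables); this trivial one is only here so the sketch runs.
    sql = "select first 1 * from rdb$database"

    t0 = time.time()
    ps = cur.prep(sql)      # prepare only -- nothing is executed or fetched
    print("first prepare: %.3f s" % (time.time() - t0))

    t0 = time.time()
    ps2 = cur.prep(sql)     # second prepare on the same connection
    print("second prepare: %.3f s" % (time.time() - t0))

Watch the server process in taskmgr/top before and after the first prepare,
then swap in different-but-related queries on the same connection and compare
the prepare times.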

In a CS environment, with connection pools and user activities that don't all
touch the same metadata (some just do logging, some do minor clerk-level
stuff, and some do full workflow stuff), some instances of FB would sit for
hours (active, though) at a few dozen megs of RAM; others, once "initialized"
for workflow the first time, would eat 300 megs. The initial prepare of a
query touching workflow would often take 3-4 seconds, with a flurry of I/O
right about then. After that, most workflow-related queries (even ones
significantly different from the initial query) prepared quickly on that
particular connection/instance. "Workflow" here can be read as "tables that
are all related to each other through a web of cascading triggers."

Napkin math, sorry. I could be entirely wrong about the underlying cause, but
it does fit well with the shared-cache difference between SS and CS. I realize
that a prepare may cache some data pages as well, not just pure metadata, as
FB tries to estimate the cardinality of joins, but at the very least it isn't
reading the data itself, just examining the pointer pages, index roots, etc.

The reason for asking about the size of metadata in RAM, and the concept of
"inline", is that a full metadata-only backup is only 33 megs. The amount of
RAM seemingly eaten up the first time I prepared queries touching the workflow
tables was in the range of 180-250 megs. That's significantly more than the
pure definition of all the metadata, and a prepare shouldn't even need all of
it. The triggers/SPs/tables are interdependent, but not 100%. I also hide some
of the dependencies via execute-statement (but only in procedures that are
called once, at commit; almost everything else is exposed to the whole
dependency web). So a prepare on query X may require examining metadata for 30
tables, 150 triggers, etc. -- but not necessarily all of them, either. So ...
that's expensive. Maybe there's an entirely unrelated cause for the symptom.
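
For reference, the 33-meg figure is just the size of a metadata-only gbak
backup, something along these lines (with your own path and credentials, of
course):

    gbak -b -m -user SYSDBA -password masterkey server:/path/to/db.fdb meta_only.fbk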

Note: most people would not even remotely be affected by this. It seems to be
the result of having 2600 procedures, 9800 triggers, and 2800 tables, with
interdependencies and per-object permissions. I don't want to scare anybody
off just because I very occasionally get sub-optimal behavior in my one
abnormal scenario.
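
Those counts come from the system tables, by the way; if you want to compare
your own schema, something along these lines should do it (standard RDB$
tables, but untested as written, and the connection details are again
placeholders):

    import fdb

    con = fdb.connect(dsn='server:/path/to/db.fdb',
                      user='SYSDBA', password='masterkey')
    cur = con.cursor()

    counts = {
        'tables':     "select count(*) from rdb$relations "
                      "where coalesce(rdb$system_flag, 0) = 0 "
                      "and rdb$view_blr is null",
        'triggers':   "select count(*) from rdb$triggers "
                      "where coalesce(rdb$system_flag, 0) = 0",
        'procedures': "select count(*) from rdb$procedures",
    }
    for name, sql in counts.items():
        cur.execute(sql)          # each query returns a single-row count
        print(name, cur.fetchone()[0])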

Follow-up: so far I'm happy with the SS performance. I don't miss the
multiple cores, at least not for now. The single instance may sit at 800 megs
of RAM, but at least our page file isn't growing and we're not thrashing on
swap space. Response times are both lower and more consistent. (Before, they
depended on which CS instance you got connected to and whether it already had
everything you needed cached. And yes, I did try tweaking connection-pool
sizes, but there was no ideal size for this problem.)

-Philip