Subject | Re: [Firebird-Architect] RFC: Proposal for the implementation |
---|---|
Author | Dmitry Yemanov |
Post date | 2004-11-29T13:24:16Z |
"Vlad Horsun" <hvlad@...> wrote:
> > 1) Data storage
> >
> > Temporary data by its definition doesn't require any recovery policies. If
> > it disappears because of a hardware/software failure, it means no actual
> > data loss. Temporary data is also expected to have a shorter lifetime than
> > persistent data and to provide faster access. All this means that it should
> > be preserved in memory as much/long as possible and flushed to disk only if
> > there aren't enough buffers to keep all temp data.
>
> I think this is not correct. Temporary data shouldn't be preserved
> in RAM - it must be cached like all other data.

This is somewhat similar to what I was saying ;-) Both CCH and
SortMem/TempFile provide caching, just with different strategies.

> We can cache it with
> our cache manager (CCH) or delegate it to the file system, but don't retain
> temp data in RAM.

I still think that CCH is too weighty to provide good performance for temp
tables, but perhaps it's not so important and/or can be improved in the
future. But the cache controlling code must be changed accordingly, e.g.
there's no need to flush temp pages to disk on commit/rollback.

> As an example, MSSQL before version 7.0 had the option 'tempdb in RAM' but
> deprecated it since v7, IIRC.

CCH also tends to preserve as many pages in RAM (okay, not in RAM, but in
virtual memory) as possible :-) It just flushes them too often from the temp
tables POV. And I'm not sure that temp page I/O should follow the careful
writes strategy. I have no other problems with CCH. But I tend to think that
in this case CCH should have two independent page pools (for generic pages
and temp ones), because I'd expect these two buffers to be configured
independently.
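
To make the two-pool idea concrete, here is a minimal sketch, not Firebird code. All names (PagePool, PageCache, flush_on_commit, disk_writes) are hypothetical, and the careful writes ordering is not modelled at all. It only shows two independently sized pools where commit flushes generic pages, while temp pages reach disk only when they are evicted for lack of buffers.

```python
from collections import OrderedDict


class PagePool:
    """LRU pool of pages (hypothetical; real CCH is far more involved)."""

    def __init__(self, capacity, flush_on_commit):
        self.capacity = capacity                # configured per pool
        self.flush_on_commit = flush_on_commit  # False for the temp pool
        self.pages = OrderedDict()              # page_no -> (data, dirty)
        self.disk_writes = []                   # stands in for real page I/O

    def write(self, page_no, data):
        # Mark the page dirty and make it the most recently used one.
        self.pages[page_no] = (data, True)
        self.pages.move_to_end(page_no)
        # Evict least recently used pages; dirty victims must hit the disk.
        while len(self.pages) > self.capacity:
            victim, (_, dirty) = self.pages.popitem(last=False)
            if dirty:
                self.disk_writes.append(victim)

    def flush(self):
        # Write every dirty page and mark it clean.
        for page_no, (data, dirty) in list(self.pages.items()):
            if dirty:
                self.disk_writes.append(page_no)
                self.pages[page_no] = (data, False)


class PageCache:
    """Two independent pools: one for generic pages, one for temp pages."""

    def __init__(self, generic_capacity, temp_capacity):
        self.generic = PagePool(generic_capacity, flush_on_commit=True)
        self.temp = PagePool(temp_capacity, flush_on_commit=False)

    def commit(self):
        # Only pools that need durability are flushed on commit.
        for pool in (self.generic, self.temp):
            if pool.flush_on_commit:
                pool.flush()


cache = PageCache(generic_capacity=2, temp_capacity=2)
cache.generic.write(1, b"row")
cache.temp.write(100, b"tmp")
cache.commit()                  # writes generic page 1, leaves temp page 100 alone
cache.temp.write(101, b"tmp")
cache.temp.write(102, b"tmp")   # temp pool is full, so page 100 is evicted and written
```

The point of the sketch is only that the flush policy and the buffer count become per-pool properties, so the temp pool can be tuned without touching the generic one.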

> > But, regardless of the CCH usage, I consider the whole idea of storing
> > temp data inside the database via the existing PIO wrong, as it just
> > provides the required semantics without any performance and/or cleanup
> > benefits. If the proper solution requires a separate page number space, so
> > be it.
>
> A separate page space seems the most attractive from the performance
> POV to me.

Agreed.

> > 2) Data visibility
> >
> > I see two ways to allow per-session data visibility:
> >
> > 1) One TempSpace instance per attachment. It means that different
> > attachments work with different temporary files.
>
> In fact we can imagine 4 tempspace scopes:
> a) one per temporary table instance
> b) one per attachment
> c) one per database
> d) one per engine
>
> Option d) seems not to be practically useful, at least with
> the current engine implementation
>
> Option c) seems not to be very friendly to classic server, but
> allows us to avoid frequent file creation/deletion
>
> Option a) is the most granular but adds the most overhead from the file
> system
>
> So, I prefer option b) for CS and c) for SS, or one per engine
> process.

I'd stick to (b) for both architectures, just to unify the code. Note that
(b) doesn't require anything else to support attachment-level visibility per
se. No hidden columns, no transaction SBMs...
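
A rough illustration of why (b) gives attachment-level visibility for free. This is a sketch under assumptions, not a proposed interface: TempSpace is the name used in this thread, but its methods here (insert, scan), the Attachment class and the in-memory dict standing in for a temporary file are all hypothetical.

```python
class TempSpace:
    """Private temp storage of one attachment (stands in for its temp file)."""

    def __init__(self):
        self.tables = {}  # temp table name -> list of rows

    def insert(self, table, row):
        self.tables.setdefault(table, []).append(row)

    def scan(self, table):
        # No ATTACHMENT_ID filter is needed: only the owner's rows live here.
        return list(self.tables.get(table, []))


class Attachment:
    """Hypothetical attachment that owns exactly one TempSpace."""

    def __init__(self, attachment_id):
        self.attachment_id = attachment_id
        self.temp_space = TempSpace()  # created on attach

    def detach(self):
        # Dropping the whole space is the cleanup: no sweep, no garbage collection.
        self.temp_space = None


att1 = Attachment(1)
att2 = Attachment(2)
att1.temp_space.insert("TMP_ORDERS", ("a",))
att2.temp_space.insert("TMP_ORDERS", ("b",))
rows_seen_by_1 = att1.temp_space.scan("TMP_ORDERS")  # only att1's own row
rows_seen_by_2 = att2.temp_space.scan("TMP_ORDERS")  # only att2's own row
att1.detach()                                        # att2's data is untouched
```

Since the rows of different attachments never share storage, the same table name resolves to different data per attachment, and detaching amounts to throwing the space away.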

> > 2) Nickolay's ATTACHMENT_ID idea to add a hidden column to both data and
> > indices and teach the optimizer to filter the rows.
>
> I think this is not so good. It seems to be easy to implement,
> but I showed an even easier way. But this has one big disadvantage -
> performance. Clean up at startup is fast but we don't want to reload the engine
> just to empty temp tables ;) Regular clean up (sweep, garbage collection)
> is slow for temp tables and must be avoided, imho.

Your suggestion requires collecting txn ids on a per-attachment basis. How
big will such an SBM be in the case of long 24x7 attachments and the PRESERVE
ROWS option?

> And I can't see how Nickolay's idea satisfies tables with the ON COMMIT
> DELETE option.
>
> Finally, a different page space allows a read-only database to work
> with temp data and still remain read-only - this can be important
> for some applications

Yep, this ability is important.

Dmitry