Subject: Re: [Firebird-Architect] RFC: Proposal for the implementation
Author: Vlad Horsun
Post date: 2004-11-29T20:49:05Z
> > Separate page space seems most attractive from a performance POV
> > to me.
>
> For me too, but making it useful is not that simple.

I never claimed it was simple.
> For example, on Windows creating temporary files is horrifyingly slow and
> also tends to fragment the filesystem; you can see this with the on-disk
> sorting module.

That is why I considered different temp-space scopes.
> One big temporary space storage has better performance properties, but
> it needs management and is not much better than a normal database in that
> regard.

I'm afraid we have nothing better than that. As for the management, it
seems to be much simpler than DPM+PAG.
> Also, collapsing per-attachment temporary space is going to introduce
> difficult issues which are going to translate into bad performance.
> Imagine you create a temporary blob, which is the normal way to
> assign values to blob fields.
>
> When you create it you don't know which table it is going to be attached
> to. You may pass the blob to a view or a stored procedure and not know the
> destination table at all. What tablespace should the engine use to store
> this blob?

I can't say right now. I must first teach myself how blobs are handled.
> > > 2) Nickolay's ATTACHMENT_ID idea to add a hidden column to both data
> > > and indices and teach the optimizer to filter the rows.
> >
> > I think this is not so good. It seems to be easy to
> > implement, but I showed an even easier way.
>
> Your approach is not easier; at least the indices code would have to be
> revamped, and it would also require special treatment of temporary tables
> in the GC and sweep handlers.

In my approach only VIO_chase_record_version needs to be modified. We
simply must understand what "garbage" is. The canonical definition is:
garbage is those record versions which are not seen by any transaction and
are either deleted or not a primary version (Jim will correct me if I'm
wrong ;). We determine the first part of that definition by comparing the
transaction number of a record version with the oldest snapshot.

In the case of temporary tables we must add to this definition:
a) ON COMMIT DELETE tables - all records created by transactions
which are no longer live (these records can't have secondary versions)
b) ON COMMIT PRESERVE tables - primary record versions created
by transactions whose attachments are no longer live

With this definition we don't need to physically DELETE unneeded
records at transaction/attachment end. GC will remove such records
whenever it sees them, and at engine startup we can drop all temporary
table data.
> > But this has one big disadvantage - performance. Clean up at startup is
> > fast, but we don't want to reload the engine just to empty temp
> > tables ;) Regular clean up (sweep, garbage collection) is
> > slow for temp tables and must be avoided, imho.
>
> We do GC for normal tables anyway; temp tables are not going to change
> the picture significantly.

And this is bad. Since we know that the data is temporary and know
exactly when it becomes unneeded, we should use this to improve
performance.
> They are not going to be slower than normal tables, and we'll have
> motivation to optimize normal access paths, not one special case of
> temporary tables.

That task is needed anyway and is not related to temp tables only, right? ;)
> BTW, note that because multiple transactions are allowed for one connection,
> update conflicts are possible for temporary tables, and most of the
> versioning, conflict resolution and GC logic of the CCH, DPM and VIO modules
> still applies.

This is why I didn't even examine other ways to manage records, for
example like in external tables and so on.
> > And I can't see how Nickolay's idea satisfies tables with the ON
> > COMMIT DELETE option.
>
> Do what's asked. Delete rows from the temp table in the DFW handler at
> commit time.

Why? Why do we need to do unneeded work with a lot of I/O?
> We are going to implement deferred constraints

This is the first time I have heard about that.

> so this is not going to change the big performance picture either.

Sorry, I don't see the relation of deferred constraints to temporary
tables and their performance.
> > At last, a different page space allows a read-only database
> > to work with temp data and still remain read-only - this can
> > be important for some applications.
>
> For those special applications NBACKUP allows having a read-only main
> database file and redirecting writes to separate read-write storage. :-)

Hmm... you are joking, right? There are many applications - catalogs of
spare parts, libraries, phone books, etc. - which are distributed on CD/DVD
with a big database on them. These applications often use the embedded
engine. The ability to perform complex queries with temporary data is very
desirable for their developers.
> The benefit of my approach is that it would allow us to share 99% of the
> implementation with normal tables, motivating us to optimize the caching
> logic, data placement logic and access paths for the entire engine, not
> only for temporary tables.

If your approach allows sharing 99% of the code, then mine allows 99.5% ;)
But I still prefer the more complex yet faster page spaces.
Regards,
Vlad