Subject | Re: [firebird-support] Gbak and indexes |
---|---|
Author | Hugo Eyng |
Post date | 2016-02-25T18:50:34Z |
Hello Ann.
Very good. Thank you.
Hugo
Very good. Thank you.
Hugo
Em 25/02/2016 13:47, Ann Harrison
aharrison@... [firebird-support] escreveu:
Someone on a different forum asked whether gbak restores wouldn't be faster if gbak backed up indexes so it didn't have to rebuild them. The writer wanted the
clean-up and compaction that a gbak backup and restore provides, but not thetime spent retrieving records, sorting, and rebuilding indexes.
There's a problem with that. An old index is of absolutely no use in a restored database exactly because records are in different places. The index contains the physical location of records. Change the location, invalidate the index. That's the short answer.
The longer answer:
An index contains a key value and a "record id" also called a db-key. When an index search finds an acceptable approximation of the value desired, it uses the db-key to find the corresponding record.
(It's actually more complicated than that, and depends on whether the index is being used to optimize a sort, but in the end, an index search finds a db-key and uses it to find a record.)
What's in a db-key? Three things: a pointer page sequence number, and offset into the pointer page, and an index offset on the data page. A pointer page is a part of the structure of a table that's not visible outside the database. It's simply a page that contains a vector of page numbers. Those page numbers belong to the data pages in the table. The RDB$PAGES table contains the sequence number and page number of pointer pages for all tables, among other things. After a table has been referenced, the pointer page information is kept in memory. The pointer page may be in cache or Firebird may need to read it from disk.
With the pointer page in its virtual hand, Firebird reads the page number at the offset indicated by the db-key to get the page number of the appropriate data page and finds the data page in cache or on disk. The offset on the data page in the db-key is an indirect pointer to the record. Each data page has an index to the records on it containing the actual offset and length of the record, so a page can be reorganized without changing the db-keys of the records on it.
A gbak restore creates a new database and populates it with the data it backed up from the old database. Records will be stored on different pages and at different offsets. Pages that were part of one table in the old database may be part of a different table or some other structure in the new database.
Gbak restore could be made faster if Firebird recognized at a commit that the transaction created several indexes on the same table and built all the indexes in parallel rather than reading the table for each index, but using an old index in a new database is not a good idea.
Good luck,
Ann
-- Atenciosamente, Hugo Eyng