Subject | Re: [ib-support] GDB Size |
---|---|
Author | Ann Harrison |
Post date | 2001-02-10T18:23:33Z |
At 08:42 PM 2/9/2001 -0500, Robert F. Tulloch wrote:
The data in dBase took 1.8 megs.
An empty database grew to 12 megs when the data was stored with datapump.
A 6 mb database grew to 10+ mb when the data was loaded with IB-WISQL.
A backup file was 1.8 mb (starting from the 10mb database).
The database restored from backup was the restore was 3.7 mb.
OK, let's do the math.
Average record size:
For the sake of argument, I'm going to suppose that city names
average 10 characters. The run length compression works like this:
1 byte of count,
1 null byte,
2 bytes of length,
10 bytes of data,
1 count byte,
1 byte holding the filler character,
1 count byte,
6 bytes of data
23 bytes total
Assuming you used a 4096 page size, you've got
per table overhead:
1 pointer page 4096
1 index root page 4096
per record overhead:
46797 record headers 655158
46797 null flag bytes 46797
46797 page indexes 187188
record data:
46797 * 22 (w/o null flags) 1029534
reserved space:
46796 * 16 748752
total record space:
2667429
The data page header is 28 bytes, so the useful space on page is
4068 bytes, producing a total of 653 data pages. So the table
should have added about 2.7 Mb without indexes.
If each record were stored in a separate transaction, you'd
also add 11700 bytes of transaction state, or three TIP pages.
Interesting. Suggests that the data pump is not working very efficiently.
One possibility is that it stores the keys first then goes back to store
the data by updating each row. That wastes quite a lot of space & time.
One way to find out is to count the number of pointer pages used after the
initial store.
select count (*) from rdb$pages p, rdb$relations r
where p.rdb$page_type = 4
and p.rdb$relation_id = r.rdb$relation_id
and r.rdb$relation_name = "MY_TABLE"
It should be approximately 700, unless the data pump is doing something
strange.
Another likely situation is that you have indexes that you didn't mention.
Use the database statistics feature of IBConsole to check the index fill
level. My guess is that it's about 50% for the data that IB-WISQL loaded,
about 25% for the data loaded by datapump, and close to 100% for the data
loaded by gbak. Gbak is most efficient because it creates indexes after
storing data.
Regards,
Ann
www.ibphoenix.com
We have answers.
> I had gdb ~ 6mb. I added table:To summarize the issues:
>
> Table ALLZIP with fields:
>
> CITY VARCHAR(50)
> STATE CHAR(2)
> ZIP CHAR(6)
>
> I set up dialect 1 database with one table ALLSTATE with three fields:
>
> UCITY VARCHAR(50)
> USTATE CHAR(2)
> UZIP CHAR(6)
The data in dBase took 1.8 megs.
An empty database grew to 12 megs when the data was stored with datapump.
A 6 mb database grew to 10+ mb when the data was loaded with IB-WISQL.
A backup file was 1.8 mb (starting from the 10mb database).
The database restored from backup was the restore was 3.7 mb.
OK, let's do the math.
Average record size:
For the sake of argument, I'm going to suppose that city names
average 10 characters. The run length compression works like this:
1 byte of count,
1 null byte,
2 bytes of length,
10 bytes of data,
1 count byte,
1 byte holding the filler character,
1 count byte,
6 bytes of data
23 bytes total
Assuming you used a 4096 page size, you've got
per table overhead:
1 pointer page 4096
1 index root page 4096
per record overhead:
46797 record headers 655158
46797 null flag bytes 46797
46797 page indexes 187188
record data:
46797 * 22 (w/o null flags) 1029534
reserved space:
46796 * 16 748752
total record space:
2667429
The data page header is 28 bytes, so the useful space on page is
4068 bytes, producing a total of 653 data pages. So the table
should have added about 2.7 Mb without indexes.
If each record were stored in a separate transaction, you'd
also add 11700 bytes of transaction state, or three TIP pages.
Interesting. Suggests that the data pump is not working very efficiently.
One possibility is that it stores the keys first then goes back to store
the data by updating each row. That wastes quite a lot of space & time.
One way to find out is to count the number of pointer pages used after the
initial store.
select count (*) from rdb$pages p, rdb$relations r
where p.rdb$page_type = 4
and p.rdb$relation_id = r.rdb$relation_id
and r.rdb$relation_name = "MY_TABLE"
It should be approximately 700, unless the data pump is doing something
strange.
Another likely situation is that you have indexes that you didn't mention.
Use the database statistics feature of IBConsole to check the index fill
level. My guess is that it's about 50% for the data that IB-WISQL loaded,
about 25% for the data loaded by datapump, and close to 100% for the data
loaded by gbak. Gbak is most efficient because it creates indexes after
storing data.
Regards,
Ann
www.ibphoenix.com
We have answers.