Subject | why Blob is so slow ? |
---|---|
Author | Alexandre Benson Smith |
Post date | 2012-04-18T22:12:02Z |
For some time I wonder why blob's are so slow during back-up/restore.
when I access one blob alone I don't think it's slow, but during the
process of back-up/restore I can see that the table that holds the blob
took so many time to be processed.
Today I created a simple test case that resamble my real scenario:
Two Tables
CREATE TABLE DOCUMENT
(
DOCUMENTID integer NOT NULL,
PRODUCTID integer,
COMPANYID integer,
KIND char(1) NOT NULL COLLATE PT_BR,
DESCRIPTION varchar(40) NOT NULL,
CONSTRAINT PK_DOCUMENT PRIMARY KEY (DOCUMENTID)
);
CREATE TABLE DOCUMENTOBLOB
(
DOCUMENTBLOBID integer NOT NULL,
DOCUMENTID integer,
ITEM integer NOT NULL,
BINARYDATA blob sub_type 0 NOT NULL,
FILENAME varchar(255) NOT NULL COLLATE PT_BR,
CONSTRAINT PK_DOCUMENTBLOB PRIMARY KEY (DOCUMENTBLOBID),
CONSTRAINT UNQ_DOCUMENTBLOB UNIQUE (DOCUMENTID, ITEM)
);
Table Document has 469 records
Table DocumentBlob has 463 records
The design were made to support more then a BLOB per document (like many
JPG's pages, or a mix of JPG, XLS and PDF)
The database has 245MB.
I create a simple application to measure the size of the Blobs, the size
of the binary data is 236MB.
The blobs are not big, one of 37MB, one of 4MB, two of 2MB, twenty eight
between 1MB and 2MB and the rest less than a MB. The average size is
around 500KB per blob.
Here are the timing for a back-up restore:
# time /opt/firebird/bin/gbak blob_test.fdb blob_test.fbk -user sysdba
-password masterkey -t
real 0m6.927s
user 0m0.671s
sys 0m1.191s
# time /opt/firebird/bin/gbak blob_test.fbk blob_teste2.fdb -rep -user
sysdba -password masterkey -t
real 10m8.894s
user 0m0.042s
sys 0m0.037s
I think it's too slow to process less than 1k records and 250MB of data.
Some more info:
during the back-up or restore the CPU and i/o is low:
Tasks: 93 total, 1 running, 92 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 0.0%sy, 0.0%ni, 97.3%id, 2.7%wa, 0.0%hi, 0.0%si,
0.0%st
Tasks: 93 total, 1 running, 92 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.3%us, 0.0%sy, 0.0%ni, 99.7%id, 0.0%wa, 0.0%hi, 0.0%si,
0.0%st
Firebird Version:
# /opt/firebird/bin/fbserver -z
Firebird TCP/IP server version LI-V2.1.4.18393 Firebird 2.1
gstat -a -r output:
Database "blob_teste2.fdb"
Database header page information:
Flags 0
Checksum 12345
Generation 17
Page size 16384
ODS version 11.1
Oldest transaction 1
Oldest active 2
Oldest snapshot 2
Next transaction 9
Bumped transaction 1
Sequence number 0
Next attachment ID 5
Implementation ID 19
Shadow count 0
Page buffers 0
Next header page 0
Database dialect 1
Creation date Apr 18, 2012 19:02:10
Attributes force write
Variable header data:
Sweep interval: 20000
*END*
Database file sequence:
File blob_teste2.fdb is the only file
Analyzing database pages ...
DOCUMENTO (128)
Primary pointer page: 137, Index root page: 138
Average record length: 35.67, total records: 469
Average version length: 0.00, total versions: 0, max versions: 0
Data pages: 3, data page slots: 3, average fill: 50%
Fill distribution:
0 - 19% = 1
20 - 39% = 0
40 - 59% = 0
60 - 79% = 2
80 - 99% = 0
Index PK_DOCUMENTO (0)
Depth: 1, leaf buckets: 1, nodes: 469
Average data length: 1.01, total dup: 0, max dup: 0
Fill distribution:
0 - 19% = 1
20 - 39% = 0
40 - 59% = 0
60 - 79% = 0
80 - 99% = 0
DOCUMENTOBLOB (129)
Primary pointer page: 139, Index root page: 140
Average record length: 40.37, total records: 463
Average version length: 0.00, total versions: 0, max versions: 0
Data pages: 7, data page slots: 7, average fill: 90%
Fill distribution:
0 - 19% = 0
20 - 39% = 0
40 - 59% = 0
60 - 79% = 0
80 - 99% = 7
Index PK_DOCUMENTOBLOB (0)
Depth: 1, leaf buckets: 1, nodes: 463
Average data length: 1.01, total dup: 0, max dup: 0
Fill distribution:
0 - 19% = 1
20 - 39% = 0
40 - 59% = 0
60 - 79% = 0
80 - 99% = 0
Index UNQ_DOCUMENTOBLOB (1)
Depth: 1, leaf buckets: 1, nodes: 463
Average data length: 5.15, total dup: 0, max dup: 0
Fill distribution:
0 - 19% = 0
20 - 39% = 1
40 - 59% = 0
60 - 79% = 0
80 - 99% = 0
Does anyone has any comments about it ?
see you !
Alexandre
when I access one blob alone I don't think it's slow, but during the
process of back-up/restore I can see that the table that holds the blob
took so many time to be processed.
Today I created a simple test case that resamble my real scenario:
Two Tables
CREATE TABLE DOCUMENT
(
DOCUMENTID integer NOT NULL,
PRODUCTID integer,
COMPANYID integer,
KIND char(1) NOT NULL COLLATE PT_BR,
DESCRIPTION varchar(40) NOT NULL,
CONSTRAINT PK_DOCUMENT PRIMARY KEY (DOCUMENTID)
);
CREATE TABLE DOCUMENTOBLOB
(
DOCUMENTBLOBID integer NOT NULL,
DOCUMENTID integer,
ITEM integer NOT NULL,
BINARYDATA blob sub_type 0 NOT NULL,
FILENAME varchar(255) NOT NULL COLLATE PT_BR,
CONSTRAINT PK_DOCUMENTBLOB PRIMARY KEY (DOCUMENTBLOBID),
CONSTRAINT UNQ_DOCUMENTBLOB UNIQUE (DOCUMENTID, ITEM)
);
Table Document has 469 records
Table DocumentBlob has 463 records
The design were made to support more then a BLOB per document (like many
JPG's pages, or a mix of JPG, XLS and PDF)
The database has 245MB.
I create a simple application to measure the size of the Blobs, the size
of the binary data is 236MB.
The blobs are not big, one of 37MB, one of 4MB, two of 2MB, twenty eight
between 1MB and 2MB and the rest less than a MB. The average size is
around 500KB per blob.
Here are the timing for a back-up restore:
# time /opt/firebird/bin/gbak blob_test.fdb blob_test.fbk -user sysdba
-password masterkey -t
real 0m6.927s
user 0m0.671s
sys 0m1.191s
# time /opt/firebird/bin/gbak blob_test.fbk blob_teste2.fdb -rep -user
sysdba -password masterkey -t
real 10m8.894s
user 0m0.042s
sys 0m0.037s
I think it's too slow to process less than 1k records and 250MB of data.
Some more info:
during the back-up or restore the CPU and i/o is low:
Tasks: 93 total, 1 running, 92 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 0.0%sy, 0.0%ni, 97.3%id, 2.7%wa, 0.0%hi, 0.0%si,
0.0%st
Tasks: 93 total, 1 running, 92 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.3%us, 0.0%sy, 0.0%ni, 99.7%id, 0.0%wa, 0.0%hi, 0.0%si,
0.0%st
Firebird Version:
# /opt/firebird/bin/fbserver -z
Firebird TCP/IP server version LI-V2.1.4.18393 Firebird 2.1
gstat -a -r output:
Database "blob_teste2.fdb"
Database header page information:
Flags 0
Checksum 12345
Generation 17
Page size 16384
ODS version 11.1
Oldest transaction 1
Oldest active 2
Oldest snapshot 2
Next transaction 9
Bumped transaction 1
Sequence number 0
Next attachment ID 5
Implementation ID 19
Shadow count 0
Page buffers 0
Next header page 0
Database dialect 1
Creation date Apr 18, 2012 19:02:10
Attributes force write
Variable header data:
Sweep interval: 20000
*END*
Database file sequence:
File blob_teste2.fdb is the only file
Analyzing database pages ...
DOCUMENTO (128)
Primary pointer page: 137, Index root page: 138
Average record length: 35.67, total records: 469
Average version length: 0.00, total versions: 0, max versions: 0
Data pages: 3, data page slots: 3, average fill: 50%
Fill distribution:
0 - 19% = 1
20 - 39% = 0
40 - 59% = 0
60 - 79% = 2
80 - 99% = 0
Index PK_DOCUMENTO (0)
Depth: 1, leaf buckets: 1, nodes: 469
Average data length: 1.01, total dup: 0, max dup: 0
Fill distribution:
0 - 19% = 1
20 - 39% = 0
40 - 59% = 0
60 - 79% = 0
80 - 99% = 0
DOCUMENTOBLOB (129)
Primary pointer page: 139, Index root page: 140
Average record length: 40.37, total records: 463
Average version length: 0.00, total versions: 0, max versions: 0
Data pages: 7, data page slots: 7, average fill: 90%
Fill distribution:
0 - 19% = 0
20 - 39% = 0
40 - 59% = 0
60 - 79% = 0
80 - 99% = 7
Index PK_DOCUMENTOBLOB (0)
Depth: 1, leaf buckets: 1, nodes: 463
Average data length: 1.01, total dup: 0, max dup: 0
Fill distribution:
0 - 19% = 1
20 - 39% = 0
40 - 59% = 0
60 - 79% = 0
80 - 99% = 0
Index UNQ_DOCUMENTOBLOB (1)
Depth: 1, leaf buckets: 1, nodes: 463
Average data length: 5.15, total dup: 0, max dup: 0
Fill distribution:
0 - 19% = 0
20 - 39% = 1
40 - 59% = 0
60 - 79% = 0
80 - 99% = 0
Does anyone has any comments about it ?
see you !
Alexandre