Subject | Indexing on large char field |
---|---|
Author | Scott, Niall |
Post date | 2004-01-23T08:57:29Z |
Hi All
I have a small problem with a utility I am creating. It contains a
Varchar(1024) field which I need to do lots of locates on. Obviously it is
too big to index so I am looking for an algorithm use to create an index I
have tried what I thought would do it and that was a using a CRC of the
string to create an indexeable field. This works BUT I now find that there
are duplicate CRCs which brings me to my second problem.
I would like to use an SQL statement to find the duplicates in one field
(Int64) the same as Access can do but I can't seem to get it write.
The field table structure is
CREATE TABLE "LOCALFILES"
(
"FILENAME" VARCHAR(250) CHARACTER SET WIN1252,
"FILESIZE" INTEGER,
"DATEWRITTEN" TIMESTAMP,
"CREATED" TIMESTAMP,
"LASTACCESSED" TIMESTAMP,
"EXT" CHAR(20) CHARACTER SET WIN1252,
"SID" INTEGER,
"ATTRIB" CHAR(10) CHARACTER SET WIN1252,
"XFILENAME" VARCHAR(1024) CHARACTER SET WIN1252,
"ARCHFILE" VARCHAR(250) CHARACTER SET WIN1252,
"CRC" BIGINT,
"STATID" INTEGER
);
Speed for the duplicate check is not important.
I am using FB1.8 RC8 embed for this program at present and the local DBs
will hold approx 200,000 records with the main DB hold that x 10;
Thanks
Niall R Scott
Systems Engineer
I.S. Support
Cameron Controls
Aberdeen
+44 1224 282440
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This e-mail is confidential, may contain proprietary information
of the Cooper Cameron Corporation and its operating Divisions
and may be confidential or privileged.
This e-mail should be read, copied, disseminated and/or used only
by the addressee. If you have received this message in error please
delete it, together with any attachments, from your system.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
I have a small problem with a utility I am creating. It contains a
Varchar(1024) field which I need to do lots of locates on. Obviously it is
too big to index so I am looking for an algorithm use to create an index I
have tried what I thought would do it and that was a using a CRC of the
string to create an indexeable field. This works BUT I now find that there
are duplicate CRCs which brings me to my second problem.
I would like to use an SQL statement to find the duplicates in one field
(Int64) the same as Access can do but I can't seem to get it write.
The field table structure is
CREATE TABLE "LOCALFILES"
(
"FILENAME" VARCHAR(250) CHARACTER SET WIN1252,
"FILESIZE" INTEGER,
"DATEWRITTEN" TIMESTAMP,
"CREATED" TIMESTAMP,
"LASTACCESSED" TIMESTAMP,
"EXT" CHAR(20) CHARACTER SET WIN1252,
"SID" INTEGER,
"ATTRIB" CHAR(10) CHARACTER SET WIN1252,
"XFILENAME" VARCHAR(1024) CHARACTER SET WIN1252,
"ARCHFILE" VARCHAR(250) CHARACTER SET WIN1252,
"CRC" BIGINT,
"STATID" INTEGER
);
Speed for the duplicate check is not important.
I am using FB1.8 RC8 embed for this program at present and the local DBs
will hold approx 200,000 records with the main DB hold that x 10;
Thanks
Niall R Scott
Systems Engineer
I.S. Support
Cameron Controls
Aberdeen
+44 1224 282440
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This e-mail is confidential, may contain proprietary information
of the Cooper Cameron Corporation and its operating Divisions
and may be confidential or privileged.
This e-mail should be read, copied, disseminated and/or used only
by the addressee. If you have received this message in error please
delete it, together with any attachments, from your system.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~