Subject | Re: [firebird-support] Re: ORDER BY on large VARCHAR columns |
---|---|
Author | unordained |
Post date | 2004-10-27T02:43:49Z |
The idea is to leave the current field as-is. Your regex, display, etc. would still run against
the "whole" field. For *sorting* only, you use other fields which are indexed. One field might be
enough, containing just a copy of the first 250 or so characters (252, I think, using plain ascii).
You sort by -that- field, but you search by the full field. If you make several fields (for each
chunk), you index all of them, and you can use the multi-field index feature (create index across
all four fields, in the order you would sort them in) to make it even better.
Let's say you currently have:
create table stuff
(
large_field varchar(1024)
);
If you have instead
create table stuff2
(
large_field varchar(1024),
partial_field varchar(252)
)
... and put an index on partial_field, and use a trigger to make sure it stays synchronized, then
the following should work:
select large_field from stuff2 where some_expression(large_field) order by partial_field,
large_field;
The indexed sort will get "close" to the end sorting you want, and then a painfully slow sort will
happen to finish it. But you've already got the strings sorted enough that you're not doing all of
the sorting against the full field. Right?
As you have an upper limit on the field length, you can also do
create table stuff3
(
large_field varchar(1024),
partial_field1 varchar(250),
partial_field2 varchar(250),
partial_field3 varchar(250),
partial_field4 varchar(250),
partial_field5 varchar(250)
)
(not sure if you care enough to go all the way, but you *can*)
You then create a single index on (partial_field1, ... partial_field5) and use the same query as
above, but sort by all of the columns, in that order. If you've got all the chunks of the string in
the partial fields, you don't need to worry about sorting by large_field anymore.
As to triggers, I'm terribly at the stored-procedure language, so I'll let someone else talk about
keeping these extra fields synchronized. Yes, it -would- be nice for Firebird to support indexing
on fields of arbitrary length, but until then ...
Yes, this is equivalent to creating an index on a function, but Firebird doesn't support that
(yet?). I hear Yaffil had some or all of that working. Instead, you have to fake it. There are no
function-based indices in Firebird.
-Philip
---------- Original Message -----------
From: "robert_difalco" <rdifalco@...>
the "whole" field. For *sorting* only, you use other fields which are indexed. One field might be
enough, containing just a copy of the first 250 or so characters (252, I think, using plain ascii).
You sort by -that- field, but you search by the full field. If you make several fields (for each
chunk), you index all of them, and you can use the multi-field index feature (create index across
all four fields, in the order you would sort them in) to make it even better.
Let's say you currently have:
create table stuff
(
large_field varchar(1024)
);
If you have instead
create table stuff2
(
large_field varchar(1024),
partial_field varchar(252)
)
... and put an index on partial_field, and use a trigger to make sure it stays synchronized, then
the following should work:
select large_field from stuff2 where some_expression(large_field) order by partial_field,
large_field;
The indexed sort will get "close" to the end sorting you want, and then a painfully slow sort will
happen to finish it. But you've already got the strings sorted enough that you're not doing all of
the sorting against the full field. Right?
As you have an upper limit on the field length, you can also do
create table stuff3
(
large_field varchar(1024),
partial_field1 varchar(250),
partial_field2 varchar(250),
partial_field3 varchar(250),
partial_field4 varchar(250),
partial_field5 varchar(250)
)
(not sure if you care enough to go all the way, but you *can*)
You then create a single index on (partial_field1, ... partial_field5) and use the same query as
above, but sort by all of the columns, in that order. If you've got all the chunks of the string in
the partial fields, you don't need to worry about sorting by large_field anymore.
As to triggers, I'm terribly at the stored-procedure language, so I'll let someone else talk about
keeping these extra fields synchronized. Yes, it -would- be nice for Firebird to support indexing
on fields of arbitrary length, but until then ...
Yes, this is equivalent to creating an index on a function, but Firebird doesn't support that
(yet?). I hear Yaffil had some or all of that working. Instead, you have to fake it. There are no
function-based indices in Firebird.
-Philip
---------- Original Message -----------
From: "robert_difalco" <rdifalco@...>
> Well, I have a UDF to perform regex searches on this column, so I------- End of Original Message -------
> can't really split them up as I have no idea where the wildcards may
> be in the search pattern. It's just a very problematic field.
>
> Also, there is a requirement that they be displayed -- fully
> qualified, sorted, and paged (i.e. 15-20 at a time).
>
> Can Firebird use a partial (i.e. substring) INDEX and have that
> index help it sort this table? I didn't think so, but maybe I'm not
> sure what you mean by "indexing just the first 250 or so chars".
>
> R.