Subject Re: [firebird-support] Can't index a 1k+ utf-8 column
Author Alexandre Benson Smith
woodsmailbox wrote:
> The question is: are there any plans for this limitation to go away?
>
> For max. page size of 16k => 4k index size => max. indexable utf8 size
> < 1k. Quite useless xcept for names and small titles and captions.
>
> Thanx.
>

well...

Of course there is exceptions.... But in general you could index a very
big varchar, if you create an expression index with just the "few" first
1k chars, I think in most cases the search would be by the beginning of
the filed value, not the entire value. Even if you need to search for
the entire value, you could use the index in the beginning to speedy the
query up, and another condition to tilter the entire value, like:

select * from foo Where BigBar = 'some big text' and Substring(BigBar
from 1 for 1000) = Substring('some big text' from 1 for 1000)

If you have an expression index on Substring(BigBar from 1 for 1000) it
would help.

An index on a hash of that column could even be more adequate like

select * from foo Where BigBar = 'some big text' and SHA1(BigBar) =
SHA1('some big text')

assuming SHA1 is an UDF to calculate the SHA-1 from a text field.

If you look for equality only, then a hash value for the column (SHA-1
for example) could be enough too.

Regarding your question... perhaps when bigger pages sizes is
implemented. But I think a very big index like this would not perform
very well.. too few entries per index page, the index will be too much deep.

see you !

--
Alexandre Benson Smith
Development
THOR Software e Comercial Ltda
Santo Andre - Sao Paulo - Brazil
www.thorsoftware.com.br