Subject | Re: [Firebird-Architect] UTF-8 Everywhere |
---|---|
Author | Jim Starkey |
Post date | 2014-01-17T17:49:51Z |
The 31 character identifiers were for
compatibility with the DEC Rdb products. The Rdb product
limitation, in turn, was based on the VAX/VMS calling standard
(both decisions were mine). Needless to say, this is hopelessly
obsolete.
But, rather than substituting one arbitrary restriction for another, wouldn't it make sense to change the ODS record encoding and eliminate the very concept of bounded string lengths? I believe this has been discussed at some length, but an "actual value" based encoding is about a third denser than the current run-length-compressed "declared type" encoding. There is a slight increase in memory usage per in-memory record to optimize field access, and a small number of additional cycles to find a particular field, but these are largely offset by a significant reduction in record size. And, of course, it allows a higher packing density on the page, reducing the number of page reads, etc.
Bottom line: If you're going to make an incompatible change to the ODS, switch to something you want rather than something only less bad than what is there now.
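To make the contrast concrete, here is a minimal sketch of the two approaches. Everything in it is an assumption for illustration: the field layout, the RLE control-byte scheme, and the varint length prefix are hypothetical, and none of it is Firebird's actual ODS format or the encoding proposed above. The function names (`declared_type_image`, `rle_compress`, `value_encode`, and so on) are invented for this example.

```python
# Hypothetical illustration only: NOT Firebird's real ODS record format.

def declared_type_image(values, declared_lengths):
    """Fixed layout: 2-byte length word, then the value padded to its declared size."""
    out = bytearray()
    for value, size in zip(values, declared_lengths):
        data = value.encode("utf-8")
        if len(data) > size:
            raise ValueError("value exceeds declared length")
        out += len(data).to_bytes(2, "little") + data.ljust(size, b"\x00")
    return bytes(out)


def _flush(out, literals):
    """Emit pending literal bytes as: count (1..127), then the bytes."""
    if literals:
        out.append(len(literals))
        out += literals
        literals.clear()


def rle_compress(image):
    """Toy RLE: control byte 1..127 = literal count, 128+n = next byte repeated n times."""
    out, literals, i = bytearray(), bytearray(), 0
    while i < len(image):
        run = 1
        while i + run < len(image) and image[i + run] == image[i] and run < 127:
            run += 1
        if run >= 3:                      # a run is only worth encoding from 3 bytes up
            _flush(out, literals)
            out += bytes([128 + run, image[i]])
            i += run
        else:
            literals.append(image[i])
            i += 1
            if len(literals) == 127:
                _flush(out, literals)
    _flush(out, literals)
    return bytes(out)


def rle_decompress(blob):
    """Inverse of rle_compress."""
    out, i = bytearray(), 0
    while i < len(blob):
        control = blob[i]
        if control > 128:                 # repeated byte
            out += bytes([blob[i + 1]]) * (control - 128)
            i += 2
        else:                             # literal bytes
            out += blob[i + 1:i + 1 + control]
            i += 1 + control
    return bytes(out)


def value_encode(values):
    """Value-based layout: varint byte length, then only the bytes actually present."""
    out = bytearray()
    for value in values:
        data = value.encode("utf-8")
        n = len(data)
        while n >= 0x80:
            out.append((n & 0x7F) | 0x80)
            n >>= 7
        out.append(n)
        out += data
    return bytes(out)


def value_decode(blob):
    """Inverse of value_encode; note that no declared maximum length is needed."""
    values, i = [], 0
    while i < len(blob):
        n = shift = 0
        while True:
            byte = blob[i]
            i += 1
            n |= (byte & 0x7F) << shift
            shift += 7
            if not byte & 0x80:
                break
        values.append(blob[i:i + n].decode("utf-8"))
        i += n
    return values


if __name__ == "__main__":
    row = ["Jim", "Starkey", ""]
    image = declared_type_image(row, [31, 31, 100])
    print(len(image), len(rle_compress(image)), len(value_encode(row)))
```

The sizes this prints come from one toy row and say nothing about the "about a third" figure, which depends on real data and the real formats. The sketch also shows the access cost mentioned above: in the value-based layout a field's offset is not known in advance, so a reader has to walk the length prefixes (or keep a per-record offset table in memory) to find it.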
On 1/17/2014 10:33 AM, Dalton Calford wrote:
I would agree in regards to UTF8, but I also think the length of SQL identifiers needs to be adjusted, i.e. 31-character identifiers only give 7 to 8 characters for some character sets.
Extending the maximum length of SQL identifiers, along with adding SQL schema support at the same time, is needed.
Universal UTF8 support just further highlights the need for these changes.
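The "7 to 8 characters" figure can be checked with a little arithmetic. The sketch below assumes the identifier limit is effectively 31 bytes and that the widest character in the character set takes 4 bytes (as in UTF-8); both are assumptions for illustration, and the helper names are hypothetical.

```python
def max_guaranteed_chars(byte_limit=31, max_bytes_per_char=4):
    """Characters that always fit when every character may use the maximum width."""
    return byte_limit // max_bytes_per_char


def fits(identifier, byte_limit=31, encoding="utf-8"):
    """True if the encoded identifier stays within the byte budget."""
    return len(identifier.encode(encoding)) <= byte_limit


if __name__ == "__main__":
    print(max_guaranteed_chars())            # worst case for 4-byte characters
    print(fits("CUSTOMER_ORDERS"))           # ASCII: one byte per character
    print(fits("заказы_клиентов_за_год"))    # Cyrillic: two bytes per letter in UTF-8
```

Under these assumptions only 7 characters are guaranteed to fit, and a 22-character Cyrillic name already exceeds the budget.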
On 17 January 2014 10:14, Jiri Cincura <diskuze@...> wrote:
I believe Firebird 3 should support only UTF8/UTF16. The rest is IMO obsolete.
--
Jiri {x2} Cincura (x2develop.com founder)
http://blog.cincura.net/ | http://www.ID3renamer.com