Subject | Re: [Firebird-Architect] Re: UTF-8 vs UTF-16 |
---|---|
Author | Nickolay Samofatov |
Post date | 2003-08-15T18:46:59Z |
Hello, Peter !

> It's a hackish solution, but it has some plusses.

This is a really hackish solution. I don't like it.

> Implementation would be:
> a) map U+D800..U+DFFF to U+FFFD
> we don't support astral planes anyway
> b) map U+0000..U+00FF to U+2000..U+20FF
> will map space to U+2020 for easy compression,
> will make ISQL connecting with charset NONE display
> ASCII (and ISO-8859-1) somewhat readable - connecting with
> charset NONE is nevertheless silly, IMHO
> c) map U+2000..U+20FF to U+D800..U+D8FF
> must make the place for b)
> d) map U+XX00 to U+D9XX (XX = 01..FF)
> will eliminate the other source of NUL bytes
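
The quoted steps a) to d) can be read as a single code point remapping applied before each character is stored as two big-endian bytes. The sketch below is one literal reading, not code from fbintl: the names `remap` and `encode` are hypothetical, and the precedence a, b, c, d where ranges overlap is an assumption the message does not state.

```python
REPLACEMENT = 0xFFFD  # U+FFFD REPLACEMENT CHARACTER


def remap(cp: int) -> int:
    """Map one code point to its stored 16-bit value (assumed order: a, b, c, d)."""
    if not 0 <= cp <= 0xFFFF:
        # Assumption: astral planes are unsupported, so treat them like (a).
        return REPLACEMENT
    if 0xD800 <= cp <= 0xDFFF:
        # (a) surrogate code points become U+FFFD
        return REPLACEMENT
    if cp <= 0x00FF:
        # (b) U+0000..U+00FF moves to U+2000..U+20FF (removes the zero high byte)
        return 0x2000 + cp
    if 0x2000 <= cp <= 0x20FF:
        # (c) U+2000..U+20FF moves to U+D800..U+D8FF to make room for (b)
        return 0xD800 + (cp - 0x2000)
    if cp & 0xFF == 0:
        # (d) U+XX00 becomes U+D9XX (removes the zero low byte)
        return 0xD900 | (cp >> 8)
    return cp


def encode(text: str) -> bytes:
    """Store each remapped code point as two big-endian bytes."""
    return b"".join(remap(ord(ch)).to_bytes(2, "big") for ch in text)
```

Under this reading a space is stored as bytes 20 20 and `A` as 20 41, so space padding becomes a run of identical bytes and ASCII text stays roughly legible with charset NONE. U+0000 and U+2000 still come out with a zero low byte (as U+2000 and U+D800), a case the quoted list leaves open.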

> OK, this hack aside, let me restate that I would
> be the first to welcome a correct UNICODE_FSS:
>> I think effort should be first directed to fixing UNICODE_FSS
>> implementation bugs namely:
>> 1) incorrect padding of CHAR(N) values
>> 2) lack of control for character string overfilling
> 3) prohibiting invalid UTF-8 sequences in UNICODE_FSS cols

As far as I remember, the source code for points 1 and 2 is
already present inside the engine, but not always applied (possibly
because of performance reasons valid at the time that code was
written). Point 3 is also possibly implemented, but not always enforced
(at least the INTL API has enough functionality).
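
To make the three points concrete, here is a minimal Python sketch of what enforcing them on a CHAR(N) value could look like. It is not the engine's code or the INTL API: `check_and_pad` and `StringOverflow` are hypothetical names, and it assumes that N counts characters rather than bytes and that padding uses U+0020.

```python
class StringOverflow(Exception):
    """Raised when a value holds more characters than the column allows."""


def check_and_pad(raw: bytes, n: int) -> bytes:
    """Validate UTF-8 input and pad it to exactly n characters."""
    # Point 3: a strict decode rejects malformed, overlong and surrogate sequences.
    text = raw.decode("utf-8")  # raises UnicodeDecodeError on invalid input
    # Point 2: count characters, not bytes, when checking for overfilling.
    if len(text) > n:
        raise StringOverflow(f"{len(text)} characters do not fit CHAR({n})")
    # Point 1: pad with spaces up to n characters, whatever the byte length.
    return text.ljust(n).encode("utf-8")
```

For example, `check_and_pad(b"ab", 4)` yields four one-byte characters, while a value containing a two-byte character padded to three characters occupies four bytes. The padding target is the character count, not a fixed byte length.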

> But whereas the UNICODE_FSS fixing costs valuable server
> developer time, a UTF16BE charset would only need a change
> in fbintl.dll (and can be tested in fbintl2.dll).

BTW, don't you remember that Firebird already implements a UCS2 charset
under the name UNICODE in the standard fbintl.dll? It should already have all
problems solved, including efficient on-page data compression.

> Peter Jacobi

Nickolay Samofatov