Subject | Re: [Firebird-devel] Need a Clue about Cyrillic |
---|---|
Author | Adriano dos Santos Fernandes |
Post date | 2004-09-10T01:04:38Z |
Jim Starkey wrote:
> I'm trying to get Vulcan past the Jaybird Junit test suite. I'm hanging
> up in the test FBEncodings doing an upcase to Cyrillic. The test stores
> a string consisting of the bytes from 0xE0 to 0xEF in a Cyrillic field
> then fetches them upcased. According to the test source
> (TestFBEncodings.java), the right answer is a string consisting of the
> bytes 0xC0 to 0xCF. Vulcan is returning the original string unchanged.

I think that this test is using the charset WIN1251 with the collation
PXW_CYRL. 0xE0 uppercases to 0xC0 in ToUpperConversionTbl in the file
pw1251cyrr.h.

> From the other end, the upcase operator does a lookup on the character
> set id 50, defined in intlnames.h as
>
> CHARSET("CYRL", CS_CYRL, 0, 1, 256, CS_cyrl, CYRL_c0_init)
> COLLATION("DB_RUS", CC_RUSSIA, CS_CYRL, 1, CYRL_c1_init)
> COLLATION("PDOX_CYRL", CC_RUSSIA, CS_CYRL, 2, CYRL_c2_init)
> END_CHARSET
>
> CYRL_c0_init, the character set initialization function, is defined in
> lc_ascii.cpp as:
>
> TEXTTYPE_ENTRY(CYRL_c0_init)
> {
> static const ASCII POSIX[] = "C.CYRL";
> FAMILY_ASCII(parm1, CYRL_c0_init, CS_CYRL, CC_C);
> TEXTTYPE_RETURN;
> }
>
> The cogent part of FAMILY_ASCII (a very large, very ugly macro) defining
> the string update function is:
>
> cache->texttype_fn_str_to_upper = (FPTR_short) famasc_str_to_upper; \
>
> And, finally, the key line of famasc_str_to_upper, also in lc_ascii.cpp, is:
>
> *pOutStr++ = ASCII7_UPPER(*pStr);
>
> where ASCII7_UPPER is (hold your breath):
>
> #define ASCII7_UPPER(ch) \
> ((((UCHAR) (ch) >= (UCHAR) ASCII_LOWER_A) && ((UCHAR) (ch) <= (UCHAR) ASCII_LOWER_Z)) \
> ? (UCHAR) ((ch)-ASCII_LOWER_A+ASCII_UPPER_A) \
> : (UCHAR) (ch))
>
> Now, counting on my fingers, 0xE0 is not between 'a' and 'z', suggesting
> strongly that it will be untouched by the upcase operation, suggesting,
> in turn, that the JUnit test is wrong. But Firebird 1.5 seems to pass it.

Adriano

> So I'm stumped. Is:
>
> 1. The test wrong, i.e. upcase of 0xE0 should in fact be 0xE0 and not
> 0xC0?
> 2. The internationalization module coded wrong?
> 3. The internationalization module built wrong?
> 4. My understanding of how this whole corner of the world all screwed up?
> 5. All or some of the above
> 6. None of the above.
>
> I need a clue. The first person to successfully throw me a line will
> have his or her name enshrined in the Vulcan international module for
> all of posterity.
>
> Help!
>
> --
>
> Jim Starkey
> Netfrastructure, Inc.
> 978 526-1376
>