Subject | Re: [IBO] Newbie - TIB_ColumnMemo, blobs and character sets |
---|---|
Author | ericthorniley |
Post date | 2003-04-04T19:43:50Z |
I have fixed my app. I am now converting to and from UNICODE_FSS at
the client, instead of getting FB to do it for me. Many thanks for
your patience and help, Helen.
I found a few issues that may be of interest:
Found on the Unicode site:
http://www.unicode.org/glossary/index.html
FSS-UTF.
Abbreviation for File System Safe UCS Transformation Format,
published by the X/Open Company Ltd., and intended for the UNIX
environment.
Now known as UTF-8.
So I used the Windows calls WideCharToMultiByte and MultiByteToWideChar
with the code page parameter set to CP_UTF8.
I have done some checks on what is written to disk by various
methods. The Windows CP_UTF8 conversion matches FB's conversion to
UNICODE_FSS on all characters I tried.
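
A minimal sketch of the same kind of client-side check, written in Python rather than Delphi. It uses Python's built-in codecs as a stand-in for WideCharToMultiByte / MultiByteToWideChar with CP_UTF8; the helper names (to_utf8, from_utf8, win1252_chars) are hypothetical and not part of IBO or the Windows API. It only confirms that every character WIN1252 defines survives a round trip through UTF-8; it says nothing about what FB itself writes.

```python
def to_utf8(text: str) -> bytes:
    """Encode text to UTF-8 bytes (the form stored in a UNICODE_FSS column)."""
    return text.encode("utf-8")


def from_utf8(data: bytes) -> str:
    """Decode UTF-8 bytes read back from the database into text."""
    return data.decode("utf-8")


def win1252_chars() -> list:
    """Return every character that Python's cp1252 codec defines."""
    chars = []
    for value in range(256):
        try:
            chars.append(bytes([value]).decode("cp1252"))
        except UnicodeDecodeError:
            # A few byte values have no character assigned in this codec.
            pass
    return chars


if __name__ == "__main__":
    # Collect any character that does not survive the round trip.
    failures = [ch for ch in win1252_chars() if from_utf8(to_utf8(ch)) != ch]
    print("round-trip failures:", failures)
```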
For instance, using the Left Single Quotation Mark, Unicode U+2018,
WIN1252 value 0x91, I have the following results:

Column - WIN1252
Connection - WIN1252
Write using AsString from Delphi string or wide string
On disk - 1 byte 91

Column - UNICODE_FSS
Connection - WIN1252
Write using AsString from Delphi string or wide string
On disk - 3 bytes E2 80 98 (VARCHAR)
On disk - 1 byte 91 (BLOB)

Column - UNICODE_FSS
Connection - UNICODE_FSS
Write using AsString from Delphi string or wide string
On disk - 1 byte 91

Column - UNICODE_FSS
Connection - UNICODE_FSS
Write using AsString having converted to UTF-8
On disk - 3 bytes E2 80 98

Column - UNICODE_FSS
Connection - UNICODE_FSS
Write using AsWideString (IBO only - no support in IBExpress) from
Delphi string or wide string
On disk - 2 bytes 18 20 (i.e. raw Windows little-endian WideChar)

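
The three byte patterns reported above can be reproduced independently of FB. The sketch below, in Python with hypothetical helper names (hex_bytes, is_valid_utf8), encodes U+2018 in WIN1252, UTF-8 and UTF-16 little-endian, and also checks whether a lone byte 91 is well-formed UTF-8. It illustrates the encodings only and makes no claim about how FB or IBO produce these bytes.

```python
def hex_bytes(text: str, codec: str) -> str:
    """Return the encoded bytes of text as space-separated uppercase hex."""
    return " ".join(f"{b:02X}" for b in text.encode(codec))


def is_valid_utf8(data: bytes) -> bool:
    """Report whether data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    quote = "\u2018"  # Left Single Quotation Mark
    print("WIN1252 :", hex_bytes(quote, "cp1252"))     # 91
    print("UTF-8   :", hex_bytes(quote, "utf-8"))      # E2 80 98
    print("UTF-16LE:", hex_bytes(quote, "utf-16-le"))  # 18 20
    # 0x91 is a UTF-8 continuation byte, so on its own it is not valid UTF-8.
    print("lone 91 valid UTF-8?", is_valid_utf8(b"\x91"))
```

Read this way, the cases that leave a single byte 91 in a UNICODE_FSS column hold the WIN1252 byte unchanged rather than a UTF-8 sequence.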
Eric