Subject | Firebird 2.0, Windows, and multibyte/unicode file paths. |
---|---|
Author | Greg At ACD |
Post date | 2007-03-08T21:51:57Z |
Hi all,
I've tried searching for info on this but no luck so far...
Environment: Windows XP, Local Firebird 2.0 Super Server, local
connection, Visual Studio 2005, using latest released IBPP interface,
plus some direct API calls as well.
Given that all interfaces to the Firebird API are single-byte
character interfaces (i.e. char*), how would one open (or create) a
local database if the path happens to contain multibyte characters?
For example, if I'm on a Japanese O/S and we want to land the database
in the private user area (e.g. C:\Documents and
Settings\<username>\Application Data\<mycompany>\<myprod>\<mydb.fdb>)
but the <username> is all Japanese characters...
Our application is a Windows Unicode application, so everything is
2-byte characters up to the point that it interfaces with the database.
For the actual VARCHAR/CHAR/BLOB TEXT data, we convert string to/from
UTF8 and make sure our database is using the UTF8 character set.
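For reference, the kind of conversion we do on every write looks roughly like the sketch below. It is a portable, hand-rolled UTF-16 to UTF-8 encoder; `utf16_to_utf8` is a hypothetical name, not our actual code and not anything from Firebird or IBPP. On Windows the same job can be done with WideCharToMultiByte() and CP_UTF8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: encode UTF-16 text as UTF-8.
// Unpaired surrogates are replaced with U+FFFD rather than rejected.
std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    out.reserve(in.size() * 3);  // a single UTF-16 unit needs at most 3 bytes
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine with the following low surrogate, if any.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // low surrogate with no preceding high surrogate
        }
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

The resulting std::string is what would be bound to a CHAR/VARCHAR parameter in a UTF8 database.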
My concern is more with the database path itself. Do I need to use the
WideCharToMultiByte() function to get the path in a form that the
server will work with? I presume I have to use the current codepage
for this... but what if I lose characters along the way if a path
character isn't part of my current codepage?
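To make the concern concrete, here is a minimal sketch of how the lossy case could at least be detected before the path is handed to a char* API. `narrow_path_lossless` is a hypothetical helper, and it is only an assumption that the server wants the path in the current ANSI code page. On Windows it relies on the lpUsedDefaultChar out-parameter of WideCharToMultiByte(); elsewhere it falls back to the C library's wcrtomb() under the active locale.

```cpp
#include <climits>
#include <cstddef>
#include <cwchar>
#include <string>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper (not part of Firebird or IBPP): converts a wide path to
// the narrow encoding a char* API would receive. Returns false if any
// character cannot be represented, instead of silently substituting one.
bool narrow_path_lossless(const std::wstring& wide, std::string& out) {
    out.clear();
    if (wide.empty()) return true;
#ifdef _WIN32
    // Assumes a legacy (non-UTF-8) ANSI code page is active.
    const int len = static_cast<int>(wide.size());
    // First call only asks how many bytes the converted path needs.
    int needed = WideCharToMultiByte(CP_ACP, WC_NO_BEST_FIT_CHARS,
                                     wide.data(), len, NULL, 0, NULL, NULL);
    if (needed <= 0) return false;
    out.resize(needed);
    // Second call converts; used_default is set if any character was replaced.
    BOOL used_default = FALSE;
    int written = WideCharToMultiByte(CP_ACP, WC_NO_BEST_FIT_CHARS,
                                      wide.data(), len, &out[0], needed,
                                      NULL, &used_default);
    if (written != needed || used_default) {
        out.clear();
        return false;
    }
    return true;
#else
    // wcrtomb returns (size_t)-1 when a character has no representation
    // in the current locale's encoding.
    std::mbstate_t state = std::mbstate_t();
    char buf[MB_LEN_MAX];
    for (std::size_t i = 0; i < wide.size(); ++i) {
        std::size_t n = std::wcrtomb(buf, wide[i], &state);
        if (n == static_cast<std::size_t>(-1)) {
            out.clear();
            return false;
        }
        out.append(buf, n);
    }
    return true;
#endif
}
```

If this returns false, the application could fall back to a different location for the database rather than pass a mangled path to the attach call. Whether that is the right fallback is exactly what I'm unsure about.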
Is this a problem I am overstressing about? We are at the planning and
prototyping stages right now, but this scenario was raised internally
as a 'risk' :(
Finally, is there any future consideration for making Firebird fully
Unicode compliant at some point? By this I mean interfaces having full
Unicode support (e.g. wchar_t and char flavours), and having the DB
store the data in true Unicode format? One thing we notice is that
converting between UTF-8 and Unicode (i.e. wchar_t) strings is
time-consuming, considering that every read/write has to go through
this. It would be more efficient just to leave it in 2 bytes. I know
that this means a bigger database and heavier network traffic, but our
design is for a local database anyway; at least we'd have the choice
of selecting a true Unicode character set when it makes sense.
thx!
greg