Subject Re: [firebird-support] FB Embedded: Table Field encryption questions
Author Geoff Worboys
Hi Chuck,

> I just finished reading Geoff Worboys' "Firebird File and
> Metadata Security" which raised a number of interesting
> points about my project and I really could use some help
> with these questions.

Sorry it is so long, I tend to be long winded once I get
started on a subject that interests me. :-)

> Now I'm working on another project in which the user will
> be making a number of searches of VarChar fields and even
> in BLOBS--the very same fields that should be encrypted to
> protect my intellectual property.
> I thought about writing a UDF that does the decryption for
> me so that the query results would be decrypted and then I
> could dataset.locate or something similar for BLOBs.

> i.e. SELECT FB_DECRYPT(hashed key, TEXT_FIELD),

You cannot unhash a value, so how does sending the hashed key

Will a UDF help? That depends. If your users are smart enough
to write their own UDF then what they would do is replace your
UDF with their own to output the key (or hashed key) sent by
your application (and optionally call your function to let
things work "as normal").

If you did anything with FB_DECRYPT in stored procedures then
it's even easier. I would update that stored procedure to output
the given key value before calling your UDF.

> But a UDF declared in the database would be pretty much of a
> red flag to any half way tech person looking at the metadata,
> plus the dll would be right there. The only thing that keeps
> everything secure at that point would be keeping the key safe.

And here you highlight the main point. Security by obscurity
is a relative term. To whom is it obscure? As described above
anyone capable of writing a UDF can soon work things out and,
depending on how you use the UDFs, possibly anyone that can
do a bit of SQL DDL fiddling.

> At this point a field search would mean looping through the
> dataset one row at a time decrypting and seeing if the item
> matches. This seems horrendously slow!

I imagine it could be slow, especially if the blobs are
potentially quite large, although we are talking about embedded
so it will not be as bad as trying to do it over a network.
For non-blob fields the lack of ability to gain advantage from
indexing may also prove a problem for tables with large numbers
of rows.

> I was thinking about encryption/decryption schemes (albeit
> not particularly secure) that have a character by character
> replacement. Thus a search word or phrase could just be
> converted then looked up in the "encrypted" dataset. As you
> can imagine, that seems like no security at all (it would
> not take long to create a text conversion key and apply
> it to the entire set of tables and fields.)

If only interested in very vague security - to stop users
from casually browsing the data - then your suggestion of doing
character-by-character encryption may be sufficient. Indeed if
you simply perform (char = char + 1) or similar then you may
even be able to use indexes to correctly sort the data. We all
know how easy this would be to break - but that presumes that
someone is interested, and educated, enough to try.

That is, if the volume is large then using pen and paper would
be impractical. Thus I would have to write a program or script
to automate it. Not difficult, but it would take some effort
and some knowledge that may (or may not) be beyond your
anticipated users.

(Although, given your interest in encryption at all, I would
presume that at least some of your users would be capable of
such a task. One of my projects was for users that had trouble
finding the power switch, so I really was not concerned about
encryption and what I described above would have been more than
adequate. :-)
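
To make the (char = char + 1) idea concrete, here is a minimal
sketch in Python. The function names and sample data are made up
for illustration. It only shows code point ordering inside Python;
whether a Firebird index sorts the same way would depend on the
character set and collation of the column, which I have not tested.
It also assumes plain text well below the top of the code point
range, so that adding the offset never overflows.

```python
def shift_encode(text: str, offset: int = 1) -> str:
    """Replace each character with the one `offset` code points later."""
    return "".join(chr(ord(ch) + offset) for ch in text)


def shift_decode(text: str, offset: int = 1) -> str:
    """Undo shift_encode by moving each character back by `offset`."""
    return "".join(chr(ord(ch) - offset) for ch in text)


# Hypothetical sample values standing in for a VarChar column.
names = ["Delta", "alpha", "Bravo", "charlie"]
stored = [shift_encode(n) for n in names]

# Every character moves by the same amount, so relative order is kept:
# sorting the stored values gives the same order as sorting the originals.
assert [shift_decode(s) for s in sorted(stored)] == sorted(names)

# A search term is encoded once and compared against the stored values,
# so nothing needs decrypting until a match is found.
needle = shift_encode("rav")
matches = [shift_decode(s) for s in stored if needle in s]
```

The same property that keeps sorting and searching cheap is what
makes the scheme trivial to undo once someone bothers to look.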

> Geoff suggests the use of an encrypted volume on the user's
> disk, but I'm not sure if I could control such a system during
> the install of my program or open the volume for the program
> to use without having the user become manually involved with
> the password.

This suggestion was really about users protecting their own
data. I doubt if it would be practical in your situation.
(The hassles of getting it mounted and discovering where it
was mounted and so on.)

However I guess that it may be possible to do if the software
has the necessary hooks. The commercial product BestCrypt
does have a development kit, but I do not know if it enables
this capability.

And anyway, once mounted the users could launch a different
program to interrogate the database on the now mounted drive.
(Maybe I am wrong here, with embedded the access is exclusive,
but I do wonder what would happen if I killed your program
with task mgr.)

> He also suggests the use of: "A simple XOR against some
> known string (the key) is sufficient - this will obscure,
> and while the key is not known to the thief it will be
> relatively difficult to break as long as some care is taken."

This is about the encryption algorithm, and the idea that
secure algorithms can be expensive (in performance terms).
Where only obscurity is required then a simpler algorithm may
be sufficient.

It does not help you with where and how to implement encryption
whether it is weak or strong. You will obviously need to
decide that first and then decide on the strength required.
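
For reference, a simple XOR against a key string might look
something like the sketch below (Python, with a made-up function
name and a placeholder key). This is only an illustration of the
algorithm, not a recommendation for where to run it or how to
store the key.

```python
from itertools import cycle


def xor_obscure(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with the key, repeating the key as needed."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


key = b"example-key"  # hypothetical key, for illustration only
plain = "Some text worth hiding".encode("utf-8")

# XOR is its own inverse: applying the same key twice restores the input.
hidden = xor_obscure(plain, key)
restored = xor_obscure(hidden, key).decode("utf-8")
```

Because the same call both hides and restores the data, whoever
holds the key (or can watch it being passed) can read everything,
which brings us straight back to key management.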

> What do you think? It still sounds like I'd need a UDF to
> implement this.

I am not convinced that a UDF will leave you in a much better
position than no encryption at all. It is really all about
key management. If you have some way of ensuring that the
UDF will get its key from some source that cannot be easily
or obviously copied along with everything else, then perhaps
it will be an option. Otherwise I doubt if it would be worth
the hassle.

> Is there another/better way? At this point I'll accept
> "security by obscurity," but I'm not even sure how to do that
> and still be able to search the dataset.

> Thank you for any thoughts or help,

I doubt if anything I have said so far really helps very much.

A wild card to throw into the pile...

Does the data need to be in the database? Being embedded you
could potentially have such data in a separate file in the
application directory. A file that you decrypt and load into
memory (and so also the OS swap file on disk!) all at once.

(Probably you want it in the database because it is high volume
and/or needs to be updatable and/or needs to be related to
other data in the database. If none of these are true then
perhaps a separate file might work.)

Presuming your key management is adequate (for your purposes)
at the client, then I suspect that the only mechanism to give
reasonable obscurity will be to bring the encrypted data back
to the client app for decryption - Firebird really is not the
place to try and install obscurity. Performance may be a
problem, but perhaps you can create various tricks to work
around the problems.

eg: Implement something like a full text search engine over
your secret data (where the index may be encrypted or not,
exist on the database or not). Appropriately implemented it
may allow you to map search requests into specific (primary key
based) row requests and so boost performance. A fair bit of
work I suspect, but it depends how important the performance
issues are likely to be. (And if full-text search is already
a desirable feature then it may indeed be worthwhile.)
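
A rough sketch of what I mean, in Python. Everything here is an
assumption for illustration: the function names, the sample rows,
and in particular the choice to store a keyed hash (HMAC) of each
word so that the index itself does not hold readable words. It
only handles whole-word matches, and the index is a plain in-memory
dictionary rather than a database table.

```python
import hashlib
import hmac
import re
from collections import defaultdict
from typing import Dict, Set


def blind(token: str, key: bytes) -> str:
    """Keyed hash of a lower-cased word, so the index holds no plain words."""
    return hmac.new(key, token.lower().encode("utf-8"), hashlib.sha256).hexdigest()


def build_index(rows: Dict[int, str], key: bytes) -> Dict[str, Set[int]]:
    """Map each blinded word to the primary keys of the rows containing it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for pk, text in rows.items():
        for word in re.findall(r"\w+", text):
            index[blind(word, key)].add(pk)
    return index


def lookup(index: Dict[str, Set[int]], word: str, key: bytes) -> Set[int]:
    """Return the primary keys of rows that contain the given whole word."""
    return set(index.get(blind(word, key), set()))


# Hypothetical plain text, as seen by the client before encryption.
rows = {
    1: "Secret recipe for sauce",
    2: "Sauce supplier list",
    3: "Holiday schedule",
}
index_key = b"example-index-key"  # hypothetical key, for illustration only
index = build_index(rows, index_key)

hits = lookup(index, "sauce", index_key)
```

The client would then fetch and decrypt only the rows whose primary
keys come back from the lookup, instead of walking the whole table.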


Geoff Worboys
Telesis Computing