Subject Re: [firebird-support] Advice requested on design pattern
Author Andrea Raimondi
Hi!

Here are my opinions; please feel free to point out any inaccuracies.

On Tue, May 19, 2015 at 11:35 PM, Mike Ro miker169@... [firebird-support] <firebird-support@yahoogroups.com> wrote:


I am creating a database that consists of 30 - 40 tables.

Each table stores a specific set of data which has a specific set of attributes. There are also attributes common to all tables for example:
  • UID (from generator)
  • Name
  • Description
  • DateTime Created (audit field)
  • DateTime Modified (audit field)
  • User Created (audit field)
  • User Modified (audit field)
Most tables will contain a few thousand records; some of the data may be largish blobs such as photos, but mostly it will be plain text and HTML.

Normally insertions and updates would be infrequent, but retrieval needs to be as fast as possible. The data is displayed in a relatively simple client written in C++ using IBPP.

Q1: I understand that Firebird does not support table inheritance. Is it therefore better to create a 'COMMON_FIELDS' table and join the 'specialised' tables to it, or to include the common fields (i.e. the list above) in each and every table?

I think it really depends on what you are trying to do and what your performance expectations are.
You might, for instance, only need the description or the creation date occasionally (for example, only when
checking the details of a photo), in which case having a base table and correlated specialised tables could be
beneficial if we are talking about thousands or millions of users.
The downside of this approach is that you partly lose the safety of a single row: you may end up with bugs that
insert the photo but not its common details. I do realise that is fairly uncommon, but it can still happen.
Would it be acceptable to run that risk?
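
To make the first option concrete, here is a minimal sketch of the base-table design. The table and column
names are purely illustrative (UID from a generator, as in your list; everything else is my invention):

    CREATE TABLE COMMON_FIELDS (
        UID          BIGINT NOT NULL PRIMARY KEY,  -- value taken from a generator
        NAME         VARCHAR(100),
        DESCRIPTION  VARCHAR(1000),
        CREATED_AT   TIMESTAMP,
        MODIFIED_AT  TIMESTAMP,
        CREATED_BY   VARCHAR(31),
        MODIFIED_BY  VARCHAR(31)
    );

    CREATE TABLE PHOTO (
        UID      BIGINT NOT NULL PRIMARY KEY
                 REFERENCES COMMON_FIELDS (UID),   -- shares its key with the base table
        IMAGE    BLOB SUB_TYPE 0,
        CAPTION  VARCHAR(200)
    );

    -- Retrieval always needs a join in this design:
    SELECT c.NAME, c.DESCRIPTION, p.IMAGE
      FROM PHOTO p
      JOIN COMMON_FIELDS c ON c.UID = p.UID
     WHERE p.UID = ?;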

On the other hand, having all fields in all tables means you do not have to do a join, which means you can get
a richer and more complete overview of something directly on your main screen without affecting performance too
much, especially if you have some form of caching enabled.
The downside is that you then really do need caching, especially if we are talking about blobs of considerable size.
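
For comparison, a sketch of the second option, with the common columns simply repeated in each table
(same illustrative names as above):

    CREATE TABLE PHOTO (
        UID          BIGINT NOT NULL PRIMARY KEY,
        NAME         VARCHAR(100),
        DESCRIPTION  VARCHAR(1000),
        CREATED_AT   TIMESTAMP,
        MODIFIED_AT  TIMESTAMP,
        CREATED_BY   VARCHAR(31),
        MODIFIED_BY  VARCHAR(31),
        IMAGE        BLOB SUB_TYPE 0,
        CAPTION      VARCHAR(200)
    );

    -- No join needed; one statement returns the complete record:
    SELECT NAME, DESCRIPTION, IMAGE FROM PHOTO WHERE UID = ?;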

Also, speaking about blobs: I do believe that you should really avoid putting them inside the tables as much as possible.
What you could do - at least for pictures; I am not entirely sure about other media - is to store a thumbnail
directly in the DB and keep the full-size image on disk. Retrieval is slightly slower, but this is mitigated by
the fact that networks still lag relatively a lot, so any delay in loading and transmission is not really
noticeable, especially if you leave enough spare bandwidth on the server (which you have to do anyway because
you are transmitting blobs).
This also complicates backups, because the database and the files on disk have to be backed up together, but
quite frankly I do not believe that to be a major problem.
Another alternative (and one I would highly endorse in this day and age) would be to be smart and store the
blobs in the users' Dropbox accounts, keeping a URL reference in the DB to pick each one up. The advantage of
this solution is that it moves the bandwidth requirements off your original server machine and also avoids the
backup problem. As an additional upside, users can be comforted in knowing that they still have full access to
their own data, which, if you ask me, is surely going to be a powerful advantage in the coming years as data
ownership and privacy issues slowly move to the front page of users' concerns.
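
Either variant of this idea fits in one table; a sketch, with hypothetical column names (FULL_REF could hold a
file path on the server's disk or a Dropbox share URL):

    CREATE TABLE PHOTO (
        UID        BIGINT NOT NULL PRIMARY KEY,
        THUMBNAIL  BLOB SUB_TYPE 0,   -- small preview kept inside the database
        FULL_REF   VARCHAR(500)       -- path on disk, or an external (Dropbox) URL
    );

The client shows THUMBNAIL on overview screens and only dereferences FULL_REF when the user actually asks for
the full-size image.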
 

The 'created' and 'modified' audit fields should be updated automatically, and obviously I will use a trigger to do this. One advantage I can see of using a single COMMON_FIELDS table is that I can write one stored procedure and call it from the 'before' trigger of each table. Alternatively, in the case where the common fields are repeated in each table, I could pass the table name as a parameter to the SP.

No. Just no. Give each table its own trigger. Do not do that - trust me, that is a *REALLY* bad idea. If anything, use numeric constants, which cannot be hijacked as easily as table-name strings.
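
The per-table trigger is short enough that repeating it 30 - 40 times costs little. A sketch for one table,
assuming the illustrative audit column names used above:

    SET TERM ^ ;

    CREATE TRIGGER PHOTO_AUDIT FOR PHOTO
    ACTIVE BEFORE INSERT OR UPDATE POSITION 0
    AS
    BEGIN
        IF (INSERTING) THEN
        BEGIN
            NEW.CREATED_AT = CURRENT_TIMESTAMP;
            NEW.CREATED_BY = CURRENT_USER;
        END
        IF (UPDATING) THEN
        BEGIN
            -- keep clients from rewriting the creation audit fields
            NEW.CREATED_AT = OLD.CREATED_AT;
            NEW.CREATED_BY = OLD.CREATED_BY;
        END
        NEW.MODIFIED_AT = CURRENT_TIMESTAMP;
        NEW.MODIFIED_BY = CURRENT_USER;
    END^

    SET TERM ; ^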


Q2: Are there any pitfalls (re-entrance) in calling a common stored procedure with a table name from a trigger and have the SP take care of the audit field updates (not exactly sure how to do this yet)?

Yes: you compromise security. Doing that means you would need to use dynamic SQL, and that is 99% evil. Do not do it unless you are really forced to for some reason.
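
To be explicit about why: a shared procedure taking a table name has to build its statement at run time with
EXECUTE STATEMENT, roughly like this (a sketch of the pattern being discouraged; all names are hypothetical):

    SET TERM ^ ;

    -- Discouraged: dynamic SQL built from a table name
    CREATE PROCEDURE SET_AUDIT (TABLE_NAME VARCHAR(31), REC_UID BIGINT)
    AS
    BEGIN
        EXECUTE STATEMENT
            'UPDATE ' || TABLE_NAME ||
            ' SET MODIFIED_AT = CURRENT_TIMESTAMP,' ||
            '     MODIFIED_BY = CURRENT_USER' ||
            ' WHERE UID = ' || REC_UID;
    END^

    SET TERM ; ^

Everything concatenated into that string is a potential injection point, the statement cannot be checked until
it runs, and if the procedure is called from a BEFORE trigger on the same table, its UPDATE re-fires that very
trigger - which is exactly the re-entrance pitfall you ask about.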

It would be good if the audit fields were not easy to fiddle with (i.e. by someone using FlameRobin or isql). The obvious way would be for them to be 'lightly' encrypted (i.e. obfuscated) during write in the SP and decrypted in the application.

In my opinion, the only way to do that is to have them backed up to a different - inaccessible - database. This would also open up a business opportunity if your
product is commercial, i.e. "retrieve the last audit details even when you cannot guarantee that the db has not been fiddled with" (such as after an intrusion of
some sort or suspicious internal activity).
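
Firebird 2.5 can do this from a trigger via EXECUTE STATEMENT ... ON EXTERNAL. A hedged sketch - the audit
database path, the account and the AUDIT_LOG table are all placeholders you would define yourself:

    SET TERM ^ ;

    CREATE TRIGGER PHOTO_AUDIT_MIRROR FOR PHOTO
    ACTIVE AFTER INSERT OR UPDATE POSITION 10
    AS
    BEGIN
        -- copy the audit values into a second, locked-down database
        EXECUTE STATEMENT
            ('INSERT INTO AUDIT_LOG (SRC_TABLE, REC_UID, MODIFIED_AT, MODIFIED_BY) ' ||
             'VALUES (''PHOTO'', :id, :ts, :usr)')
            (id := NEW.UID, ts := NEW.MODIFIED_AT, usr := NEW.MODIFIED_BY)
            ON EXTERNAL 'auditserver:/data/audit.fdb'
            AS USER 'AUDIT_WRITER' PASSWORD 'changeme';
    END^

    SET TERM ; ^

If AUDIT_WRITER is the only account with INSERT rights on the audit database, someone fiddling with the main
database through FlameRobin or isql still cannot rewrite the mirrored history.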
 

Q3: Is it possible to hide the SP from a casual observer or would it be better to write a UDF for the obfuscation?

UDFs are not an obfuscation solution. Consider the following scenario: you use an open source UDF for encryption.
Now a malicious user stops the FB service and substitutes your legitimate version of the UDF with one which also
contains a keyboard hook or, worse, a global hook registering all calls made on a Windows server. Now you are in serious trouble.

Does this help?

Kind Regards,

A