Subject Re: [firebird-support] Newbie question on how to hold an unusual data type
Author Ann Harrison
On Mar 7, 2015, at 4:12 AM, 'Louis van Alphen' louis@... [firebird-support] <firebird-support@yahoogroups.com> wrote:
>
> For storage purposes I would use normal column types, ints, whatever.

OK.

> For duplicate checking, I would in the app, calculate a unique hash
> from all the necessary fields taking part in the duplicate check.

Much less OK. Hash functions do not produce unique results - even hashing to a range much larger than the intitial string will produce duplicates. See the birthday problem. So by hashing you're taking what may well be a unique value and mashing it into something that's likely to have duplicates.

> That hash goes into a varchar column that has a unique constraint.

Why store everything twice? Firebird handles unique and primary key constraints on multiple columns reasonably efficiently. What this process does is double the storage while introducing accidental duplicates. At the same time, it leaves in place all the processing Firebird does to create an easy to compare key from the given data. Which is good, but all the work done in the application is wasted.


> The db engine then enforces uniqueness..
>

True and very important. Uniqeness can only be determined by the database engine. If two transactions attempt to store the same value in a column, neither can see the other's entry and each will see its action as preserving uniqueness. The Firebird server sees all and knows all, and it will prevent concurrent transactions from violating unique or primary key constraints.

For those of you who follow the dev list and saw a request for a READ UNCOMMITTED mode, don't leap up and down saying "There it is! There's the reason for READ UNCOMMITTED! If I only could read uncommitted records, I could (at great expense and inconvenience) maintain uniqueness in my application without relying on the Firebird engine."

A moment's calm reflection will reveal the flaw in that thinking. Let us assume that you want to enter "Arbuthnot" into a column that should not contain duplicates. When your transaction starts, there's already an Arbuthnot committed, but a concurrent transaction has deleted it without committing. You read the uncommitted deletion, conclude that your Arbuthnot is the only Arbuthnot, store it and commit. The concurrent transaction rolls back its deletion and now there are two Arbuthnots, both created by committed transactions.

Here endeth the lesson.

Cheers,

Ann