Subject | Re: [firebird-support] Re: Normalisation + performance |
---|---|
Author | David Johnson |
Post date | 2005-07-05T04:23:29Z |
Adam covered this nicely.
However, there is one caveat he missed - one that most databases have a
problem with. Many records require both current state and history.
The example of an employee who changes their name has this issue:
Last week, Miss Jones worked for me. Over the weekend she got married,
and now she is Mrs Smith.
Your design needs to be able to properly represent and reproduce that
Miss Jones worked last week, Mrs Smith worked this week, and that they
are, in fact, the same person. Which name is used depends on the time
frame of the query and what you need to do with it.
Every person may have one or more names depending on time, one or more
roles relative to the company both independent of and depending on time
(they may be employee, customer, and supplier all at once), and so on.
Ultimately, many entities end up being very abstract nexuses (nexii?)
that contain only keys for tying information from other tables together.
In the case of Adam's example, a "person" is simply a token on which you
can key for other information. Given a person token and a time frame,
you can establish name and other personal details. Normally, you use
current time, but for historical reporting you use the time of the
historical record. In some cases, you might want both the as-of and the
current information.
However, there is one caveat he missed - one that most databases have a
problem with. Many records require both current state and history.
The example of an employee who changes their name has this issue:
Last week, Miss Jones worked for me. Over the weekend she got married,
and now she is Mrs Smith.
Your design needs to be able to properly represent and reproduce that
Miss Jones worked last week, Mrs Smith worked this week, and that they
are, in fact, the same person. Which name is used depends on the time
frame of the query and what you need to do with it.
Every person may have one or more names depending on time, one or more
roles relative to the company both independent of and depending on time
(they may be employee, customer, and supplier all at once), and so on.
Ultimately, many entities end up being very abstract nexuses (nexii?)
that contain only keys for tying information from other tables together.
In the case of Adam's example, a "person" is simply a token on which you
can key for other information. Given a person token and a time frame,
you can establish name and other personal details. Normally, you use
current time, but for historical reporting you use the time of the
historical record. In some cases, you might want both the as-of and the
current information.
On Mon, 2005-07-04 at 23:40 +0000, Adam wrote:
> > I know that one should go for normalization.
>
> Absolutely!!!!
> The fact that many databases are not designed properly and are not
> normalised means that it is possible to create inconsistent data in
> the table. By this I mean that you can add data to the table that
> does not make sense in the real world, and if you allow the data to
> be added that way, it is likely to cause a headache for you in the
> future when you need to process that data.
>
> > But how does it affect
> > performance?
>
> Two ways. Generally speaking, you need to store less data. For
> example, in a system I worked on, the employees name was stored in
> the shift table. That means that every time the employee worked, it
> costed the database around an additional 50 bytes.
>
> On the other hand, with a normalised database, you may need to create
> additional key fields, but these are almost always worth the expense.
> For example, what if one of the employees changed their name? Well
> the poor old system had to then update 200 records in that table,
> yuck. It also meant that if someone was editing one of those 200
> shifts then the name could not be changed because of the record locks.
>
> So normalised databases make it easier because you
> a) Store less duplicate information.
> b) Do not have to cascade anywhere near as many updates.
> c) Help developers maintain the data.
> d) To a large degree it removes need for complex triggers to prevent
> illegal data from being added.
>
> When all of this comes together, you get a performance benefit in
> both the running of the system, the maintenance of the system, and
> the development of the system. If need be, you can create stored
> procedures and views to simultate the old tables and assist
> developers who may not immediately understand the new structure.
>
> Adam
>
>
>
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> Visit http://firebird.sourceforge.net and click the Resources item
> on the main (top) menu. Try Knowledgebase and FAQ links !
>
> Also search the knowledgebases at http://www.ibphoenix.com
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> Yahoo! Groups Links
>
>
>
>
>