Subject Re: Long table or many tables?
Author epedder
Hello Helen,

> In principle, there's nothing to gain by isolating
> each user's set in its own table.

That's what I want to know. But why is there nothing to gain? I don't doubt that it is true but it would be helpful if I could better understand this. If there are n users and each user has m messages, then surely each operation has to cover nm rows. Does that make a difference?

Now I realise that this project is not necessarily suited to a relational database (and most of our implementations use flat-file systems), but my job in this case is to implement the system for Firebird.

> The thing to focus on with regard to your
> problem is *why* you should be seeing slowdown
> as data increases.

You are of course correct, but I'm not surprised that operations become slower when the dataset becomes larger. We probably will get more of an improvement out of improving housekeeping as you suggest.

> > Currently we have a single table that
> > stores each message in a row, and a field
> > identifies the user. Most of the fields
> > are indexed too.
>
> In saying "most of the fields are indexed"
> should we infer that your records have more
> than just the UserID and a message?

My apologies for not being clear. Each row has a field for message subject, sender, recipient, a folder identifier, date, size, and several others. Each of these is indexed (ascending and descending) because users need to be able to page through the data in order. Not a good example of a relational database.

Each index is a combination of the user id and the target field. So, for example, the subject index is userid+subject, because each user only has access to their own data so all SELECT operations have a WHERE clause that includes the userid.

But getting back to my question. Given that every conceivable operation will be a subset of the data specific to a user, with users (and all their messages) being added and deleted from time to time: what are the technical problems with creating one table per user?

Is it that handling many tables imposes a performance penalty? Is it that the data is effectively stored the same way anyway? Or perhaps the cache system is less efficient with this design? Any of this information would help me in the early stages of planning a possible replacement design in the future.

Thanks for your help.

Also, I have a copy of your book and it has been a valuable help and I think an important contribution to the community.