Subject Re: [firebird-support] What the best big table or several small table
Author Ann Harrison
Dear Stephane,

> 1. ok, let say i m a real estate website and i have properties to sell
> in all the world.
> For this i must permit user to see all properties in one location (for
> exemple New York)
> but also in the location inside (Exemple Manhattan)
>
> For this i setup my location ID as a string like this :
>
> for US > NEW YORK(State) > NEW YORK(City) the id will be
> US-0001234-0001245
>
> for US > NEW YORK(State) > NEW YORK(City) > Manhattan the id will be
> US-0001234-0001245--0001298

Stupid question, maybe, but why not use postal codes? - they're predefined
and denser. At least in the US, codes tend to be well-localized, meaning
that 02139 is adjacent to 02138 and both are in Cambridge Mass.
>
> now in my query if an user want to see all the property in manhattan i
> simply do
> Where Location like 'US-0001234-0001245--0001298%';
>

OK. This query has one of two problems. The more likely is that you're
not using parameterized queries. Parameterized queries are queries where
you prepare the query without the input values then execute sending just
the values. That saves the cost of parsing, compiling, and optimizing the
query over and over again.

The second, and less likely, is that you're using LIKE with a parameter,
which means that the condition can't use an index because the parameter
could start with a '%'. Instead of LIKE use STARTING WITH (or is it
STARTS WITH?).


> and for query like
>
> where Location like 'US%' and nb_bedroom > 2 and nb_bedroom < 3
>
> i m in trouble because of the "like" i can not use the multiple column
> index on (LOCATION, NB_BEDROOM) ...
> i can use only the single column index on location + single column index
> on nb_bedroom (and believe me speed is not the same)

Right. But you could use an index on (NB_BEDROOM, LOCATION), if
you were prepared to build the query as NB_BEDROOM = x and LOCATION
starting with y OR NB_BEDROOM = x+1 and ....
>
> so for this i simply "duplicate" the row in the datase!
>

Arrggg!!! It's well known that if you've got two copies of the same data, at least one is wrong. Having four copies of every piece of data is ... well, not generally considered good design.


How about having separate fields with each part of the location, so if you know you're
looking for a specific block in Manhattan, you look for

LOCATION_COUNTRY = 'US' and
LOCATION_CITY = <n> and
LOCATION_ARRONDISMENT = <m> and
LOCATION_CARRE = <q>

If you need less detail, ask for fewer fields. You'd have more indexes, but if the read load is substantially heavier than the update load - which I'd guess it is by 100:1 or more - then more indexes aren't that much of a problem.


Good luck,


Ann