Subject Re: What the best big table or several small table
Author karolbieniaszewski
Hi

creating all combination of index is not good solution
maybe my tracker request will be good for you
if yes vote on it http://tracker.firebirdsql.org/browse/CORE-2795

but in your scenario will be good also shardering
simple way for simple shardering solution will be divide data between tables Region 1 , Region 2 ...

Karol Bieniaszewski

--- In firebird-support@yahoogroups.com, Vander Clock Stephane <svanderclock@...> wrote:
>
>
> > > For this i setup my location ID as a string like this :
> > >
> > > for US > NEW YORK(State) > NEW YORK(City) the id will be
> > > US-0001234-0001245
> > >
> > > for US > NEW YORK(State) > NEW YORK(City) > Manhattan the id will be
> > > US-0001234-0001245--0001298
> >
> > Stupid question, maybe, but why not use postal codes? - they're predefined
> > and denser. At least in the US, codes tend to be well-localized, meaning
> > that 02139 is adjacent to 02138 and both are in Cambridge Mass.
> >
>
> it's not postcode but just geographical ID
> every location have a unique integer ID
> and we add secondary string ID based on the id of their parent
>
> Manhattan iD is 0001298 (integer)
> manhattan secondary ID is US-0001234-0001245--0001298 (string)
>
>
> > >
> > > now in my query if an user want to see all the property in manhattan i
> > > simply do
> > > Where Location like 'US-0001234-0001245--0001298%';
> > >
> >
> > OK. This query has one of two problems. The more likely is that you're
> > not using parameterized queries. Parameterized queries are queries where
> > you prepare the query without the input values then execute sending just
> > the values. That saves the cost of parsing, compiling, and optimizing the
> > query over and over again.
> >
>
> yes but i think this is negligeable no ?
>
>
> >
> > The second, and less likely, is that you're using LIKE with a parameter,
> > which means that the condition can't use an index because the parameter
> > could start with a '%'. Instead of LIKE use STARTING WITH (or is it
> > STARTS WITH?).
> >
>
> it's always STARTNG WITH
>
>
> >
> > > and for query like
> > >
> > > where Location like 'US%' and nb_bedroom > 2 and nb_bedroom < 3
> > >
> > > i m in trouble because of the "like" i can not use the multiple column
> > > index on (LOCATION, NB_BEDROOM) ...
> > > i can use only the single column index on location + single column
> > index
> > > on nb_bedroom (and believe me speed is not the same)
> >
> > Right. But you could use an index on (NB_BEDROOM, LOCATION), if
> > you were prepared to build the query as NB_BEDROOM = x and LOCATION
> > starting with y OR NB_BEDROOM = x+1 and ....
> >
>
> yes but unfortunatly it's can work with nb bedroom but it's will not work
> with fields like price or surface for exemple because the delta min vs
> max is much
> more bigger :(
>
>
> > >
> > > so for this i simply "duplicate" the row in the datase!
> > >
> >
> > Arrggg!!! It's well known that if you've got two copies of the same
> > data, at least one is wrong. Having four copies of every piece of data
> > is ... well, not generally considered good design.
> >
>
> i know, i know ... but except "denormalyzing" the database i don't see
> any solution to my probleme :(
>
>
> > How about having separate fields with each part of the location, so if
> > you know you're
> > looking for a specific block in Manhattan, you look for
> >
> > LOCATION_COUNTRY = 'US' and
> > LOCATION_CITY = <n> and
> > LOCATION_ARRONDISMENT = <m> and
> > LOCATION_CARRE = <q>
> >
> > If you need less detail, ask for fewer fields. You'd have more
> > indexes, but if the read load is substantially heavier than the update
> > load - which I'd guess it is by 100:1 or more - then more indexes
> > aren't that much of a problem.
> >
>
> i actually think about this, but dedpendly of the country sometime their
> is more than 10 level !
> ex: United States > Alaska > Aleutian Islands > Aleutian Islands > Fox
> Islands > Krenitzin Islands > Akutan Island > Akutan
>
> so i will need to add at least all this field :
>
> location_level1 (country)
> location_level2
> location_level3
> location_level4
> location_level5
> location_level6
> location_level7
> location_level8
> location_level9
> location_level10
>
> but this will not help my query like
>
> where
> location_level3 = 0023954 and
> Price > 100000 and
> price < 200000
> order by
> location_level3, CREATION_date
>
> with index to use ?? multi column index on
> (location_level3, Price) OR
> (location_level3, Creation_date) ?
>
> that ok, i have actually the same probleme in my design
> but here i will need also to create too much of index to answer all the
> possible case !
>
> (location_level1, Price)
> (location_level1, Creation_date)
> (location_level2, Price)
> (location_level2, Creation_date)
> ...
> (location_level10, Price)
> (location_level10, Creation_date)
>
>
> in my design where i duplicate the row i can have query like
>
> where
> location = 0023954 and
> Price > 100000 and
> price < 200000
> order by
> CREATION_date
>
> and only need to create index like
>
> (location, Price)
> (location, Creation_date)
>
>
> [Non-text portions of this message have been removed]
>