Subject | Re: What the best big table or several small table |
---|---|
Author | karolbieniaszewski |
Post date | 2011-12-29T07:04:47Z |
Hi
creating all combination of index is not good solution
maybe my tracker request will be good for you
if yes vote on it http://tracker.firebirdsql.org/browse/CORE-2795
but in your scenario will be good also shardering
simple way for simple shardering solution will be divide data between tables Region 1 , Region 2 ...
Karol Bieniaszewski
creating all combination of index is not good solution
maybe my tracker request will be good for you
if yes vote on it http://tracker.firebirdsql.org/browse/CORE-2795
but in your scenario will be good also shardering
simple way for simple shardering solution will be divide data between tables Region 1 , Region 2 ...
Karol Bieniaszewski
--- In firebird-support@yahoogroups.com, Vander Clock Stephane <svanderclock@...> wrote:
>
>
> > > For this i setup my location ID as a string like this :
> > >
> > > for US > NEW YORK(State) > NEW YORK(City) the id will be
> > > US-0001234-0001245
> > >
> > > for US > NEW YORK(State) > NEW YORK(City) > Manhattan the id will be
> > > US-0001234-0001245--0001298
> >
> > Stupid question, maybe, but why not use postal codes? - they're predefined
> > and denser. At least in the US, codes tend to be well-localized, meaning
> > that 02139 is adjacent to 02138 and both are in Cambridge Mass.
> >
>
> it's not postcode but just geographical ID
> every location have a unique integer ID
> and we add secondary string ID based on the id of their parent
>
> Manhattan iD is 0001298 (integer)
> manhattan secondary ID is US-0001234-0001245--0001298 (string)
>
>
> > >
> > > now in my query if an user want to see all the property in manhattan i
> > > simply do
> > > Where Location like 'US-0001234-0001245--0001298%';
> > >
> >
> > OK. This query has one of two problems. The more likely is that you're
> > not using parameterized queries. Parameterized queries are queries where
> > you prepare the query without the input values then execute sending just
> > the values. That saves the cost of parsing, compiling, and optimizing the
> > query over and over again.
> >
>
> yes but i think this is negligeable no ?
>
>
> >
> > The second, and less likely, is that you're using LIKE with a parameter,
> > which means that the condition can't use an index because the parameter
> > could start with a '%'. Instead of LIKE use STARTING WITH (or is it
> > STARTS WITH?).
> >
>
> it's always STARTNG WITH
>
>
> >
> > > and for query like
> > >
> > > where Location like 'US%' and nb_bedroom > 2 and nb_bedroom < 3
> > >
> > > i m in trouble because of the "like" i can not use the multiple column
> > > index on (LOCATION, NB_BEDROOM) ...
> > > i can use only the single column index on location + single column
> > index
> > > on nb_bedroom (and believe me speed is not the same)
> >
> > Right. But you could use an index on (NB_BEDROOM, LOCATION), if
> > you were prepared to build the query as NB_BEDROOM = x and LOCATION
> > starting with y OR NB_BEDROOM = x+1 and ....
> >
>
> yes but unfortunatly it's can work with nb bedroom but it's will not work
> with fields like price or surface for exemple because the delta min vs
> max is much
> more bigger :(
>
>
> > >
> > > so for this i simply "duplicate" the row in the datase!
> > >
> >
> > Arrggg!!! It's well known that if you've got two copies of the same
> > data, at least one is wrong. Having four copies of every piece of data
> > is ... well, not generally considered good design.
> >
>
> i know, i know ... but except "denormalyzing" the database i don't see
> any solution to my probleme :(
>
>
> > How about having separate fields with each part of the location, so if
> > you know you're
> > looking for a specific block in Manhattan, you look for
> >
> > LOCATION_COUNTRY = 'US' and
> > LOCATION_CITY = <n> and
> > LOCATION_ARRONDISMENT = <m> and
> > LOCATION_CARRE = <q>
> >
> > If you need less detail, ask for fewer fields. You'd have more
> > indexes, but if the read load is substantially heavier than the update
> > load - which I'd guess it is by 100:1 or more - then more indexes
> > aren't that much of a problem.
> >
>
> i actually think about this, but dedpendly of the country sometime their
> is more than 10 level !
> ex: United States > Alaska > Aleutian Islands > Aleutian Islands > Fox
> Islands > Krenitzin Islands > Akutan Island > Akutan
>
> so i will need to add at least all this field :
>
> location_level1 (country)
> location_level2
> location_level3
> location_level4
> location_level5
> location_level6
> location_level7
> location_level8
> location_level9
> location_level10
>
> but this will not help my query like
>
> where
> location_level3 = 0023954 and
> Price > 100000 and
> price < 200000
> order by
> location_level3, CREATION_date
>
> with index to use ?? multi column index on
> (location_level3, Price) OR
> (location_level3, Creation_date) ?
>
> that ok, i have actually the same probleme in my design
> but here i will need also to create too much of index to answer all the
> possible case !
>
> (location_level1, Price)
> (location_level1, Creation_date)
> (location_level2, Price)
> (location_level2, Creation_date)
> ...
> (location_level10, Price)
> (location_level10, Creation_date)
>
>
> in my design where i duplicate the row i can have query like
>
> where
> location = 0023954 and
> Price > 100000 and
> price < 200000
> order by
> CREATION_date
>
> and only need to create index like
>
> (location, Price)
> (location, Creation_date)
>
>
> [Non-text portions of this message have been removed]
>