Subject | Re: [Firebird-Architect] IN LIST |
---|---|
Author | = m. Th = |
Post date | 2006-12-03T10:27:37Z |
Dmitry Yemanov wrote:
from the list. (But also, at the end of his message I didn't understand
very clear what he's saying...). On my understand the query:
select * from t where id in list (1, 3, 5-10, 12);
will be equivalent with
select * from t where id in(1, 3, 5,6,7,8,9,10, 12);
...personally I would prefer instead an extension of IN syntax, along to
making the number of IN members 'unlimited' (as you know there is a
easy, fast and safe way for this ;) ). IMHO, is very clear for the
engine what to do when it reads in (1, 3, 5-10, 12,...) IMHO, I don't
think that another keyword (ie. 'LIST') will help.
This would be a great advantage when a developer has already the members
on the client (as you know ;) ) and must pass many IN members in a 64k
query.
A great example here would be data synchronization/replication.
For example, we have two systems 'src' and 'dest' on which we must
replicate the data from 'src' to 'dest'. On our discussion, we have two
cases:
1. The actions/data changes must be replicated exactly in the order in
which they occurred in the source (due to constrains, generators aso.),
case in which a log of changes embedded in a EXECUTE BLOCK structure
will be, by far, IMHO the best solution, especially if you remove the
'oddity' explained in (CORE-1028) (no more than 255 contexts...) and
introduce the REPLACE command in 2.1
2. The actions/data changes can be replicated in any order (due to db
and/or application design, nature etc.). So in this case we have:
-for deleted items: DELETE FROM <table> WHERE PK IN (<the PK of
deleted items>) - which will execute in almost no-time
-for updated items: Create cursor MyCrsr As Select * from <table>
where PK IN (<the PK of updated items>) order by PK - and updating using
this cursor every line from stream. IMHO, this is very fast because the
table and the indexes are scanned only once for all the items which must
be updated not once for _evey_ item.
-for inserted items: Of course, here is a little bit harder to do a
'Select'... :) We must insert them in a 'normal' way.
not so rare today, especially in information management side of things,
in which the IN clause cannot be expressed or built efficiently (ie. in
a reasonable amount of time) on a server-side stream (table/view/join
etc.). Ie. the primary keys are not so many (buy more than 1500) and can
be processed with speed and ease on the client-side. For ex.:
non-relational structures (the most common ones: trees in which the user
checks a node and all the sub-nodes are selected), the replication
example above, other OLAP, (star and snow-flake db schemes) and
user-selection issues that you, most probably, know :)
my 2c,
m. th.
> Adriano dos Santos Fernandes wrote:IMHO, I think that he speaks about a way to 'compress' the IN members
>
>> Hence, a way to pass indeterminate number of intervals is very great:
>> select * from t where id in list '5-10,20-30,50'
>>
>
> Do we speak about lists or ranges or both? IMO, the syntax should be
> clear about that.
>
from the list. (But also, at the end of his message I didn't understand
very clear what he's saying...). On my understand the query:
select * from t where id in list (1, 3, 5-10, 12);
will be equivalent with
select * from t where id in(1, 3, 5,6,7,8,9,10, 12);
...personally I would prefer instead an extension of IN syntax, along to
making the number of IN members 'unlimited' (as you know there is a
easy, fast and safe way for this ;) ). IMHO, is very clear for the
engine what to do when it reads in (1, 3, 5-10, 12,...) IMHO, I don't
think that another keyword (ie. 'LIST') will help.
This would be a great advantage when a developer has already the members
on the client (as you know ;) ) and must pass many IN members in a 64k
query.
A great example here would be data synchronization/replication.
For example, we have two systems 'src' and 'dest' on which we must
replicate the data from 'src' to 'dest'. On our discussion, we have two
cases:
1. The actions/data changes must be replicated exactly in the order in
which they occurred in the source (due to constrains, generators aso.),
case in which a log of changes embedded in a EXECUTE BLOCK structure
will be, by far, IMHO the best solution, especially if you remove the
'oddity' explained in (CORE-1028) (no more than 255 contexts...) and
introduce the REPLACE command in 2.1
2. The actions/data changes can be replicated in any order (due to db
and/or application design, nature etc.). So in this case we have:
-for deleted items: DELETE FROM <table> WHERE PK IN (<the PK of
deleted items>) - which will execute in almost no-time
-for updated items: Create cursor MyCrsr As Select * from <table>
where PK IN (<the PK of updated items>) order by PK - and updating using
this cursor every line from stream. IMHO, this is very fast because the
table and the indexes are scanned only once for all the items which must
be updated not once for _evey_ item.
-for inserted items: Of course, here is a little bit harder to do a
'Select'... :) We must insert them in a 'normal' way.
>There are some very concrete situations, not in a very big number, but
> 2) Why is it better than a simple:
>
> for
> select * from t
> where id in ( select ... )
> ?
>
>
not so rare today, especially in information management side of things,
in which the IN clause cannot be expressed or built efficiently (ie. in
a reasonable amount of time) on a server-side stream (table/view/join
etc.). Ie. the primary keys are not so many (buy more than 1500) and can
be processed with speed and ease on the client-side. For ex.:
non-relational structures (the most common ones: trees in which the user
checks a node and all the sub-nodes are selected), the replication
example above, other OLAP, (star and snow-flake db schemes) and
user-selection issues that you, most probably, know :)
my 2c,
m. th.