Subject Re: [firebird-support] LIST gives inconsistent results
Author Mark Rotteveel
On 2017-09-28 16:40, 'Daniel Miller' dmiller@...
[firebird-support] wrote:
> But that doesn't make sense. Right from the docs you quoted - "If
> ordering is important,

It is no guarantee as documented: "The ordering of the list values is
**undefined**—the order in which the strings are concatenated is
determined by **read order from the source set** which, in tables, is
**not generally defined**.". The final sentence, "If ordering is
important, the source data can be pre-sorted using a derived table or
similar. ", is just a trick that works by arranging the desired 'read
order from the source set' with the implementation as it currently is,
it is not actually guaranteed and it can be tricky to get right.

> the source data can be pre-sorted using a derived table or similar." So
> if I create the derived table using ORDER BY - why is the LIST not
> sorted?

In the mail I replied to, you applied the ORDER BY in the subquery that
was then subject to a join, this can apply a different order because it
might apply an index to perform the joining. In my reply, I hoisted the
order by up out of the join, and before the group by.

> But...if somehow the join is messing up the LIST processing I can work
> around that - generate the LIST first and then join after. Here's the
> next issue:
>
> Having generated a table with LIST results - why can I not sort or
> compare on the LISTS? I first create a view:
> create view TEMP_ROUTE_HISTORY_LISTED (NODE, ROUTE_UPDATED,
> LISTED_ROUTE)
> as
> select H.NODE, H.ROUTE_UPDATED,
> list(H.ROUTE_INDEX||H.LINK_NODE||H.QUALITY) LISTED_ROUTE
> from ROUTE_HISTORY H
> group by H.NODE, H.ROUTE_UPDATED;
>
> This (seems) to work - I get exactly what I think I want now.
> Except...the sort order of the rows. So...
> select TRHL.LISTED_ROUTE
> from TEMP_ROUTE_HISTORY_LISTED TRHL
> where NODE=5003
> order by TRHL.LISTED_ROUTE
>
> This does...exactly the same as the view. LISTED_ROUTE is obviously not
> sorted. What am I doing wrong now?

The execution plan of a view is not fixed, it is influenced by how it is
used. And this execution plan dictates the read order, and thus the
order of values in LIST. The use of "where NODE=5003" can cause a
different execution plan, for example - assuming an index on
ROUTE_HISTORY.NODE - it could access the rows in index order instead of
natural (storage) order. Or it might apply a different optimization
based on "order by TRHL.LISTED_ROUTE" that leads to a different access
order.

In any case, you need to apply an explicit order by **before** the LIST,
and not rely on some accidental sort order. Try to create the view as:

select NODE, ROUTE_UPDATED,
list(ROUTE_INDEX||LINK_NODE||QUALITY) LISTED_ROUTE
from (
H.NODE, H.ROUTE_UPDATED, H.ROUTE_INDEX, H.LINK_NODE, H.QUALITY
from ROUTE_HISTORY H
order by H.NODE, H.ROUTE_UPDATED, H.ROUTE_INDEX
) a
group by NODE, ROUTE_UPDATED;

but again, this might still fail, as this trick relies on implementation
artefacts of list and execution plans: this behavior is not
guaranteed(!). Even worse, selecting from the view with a where might
still screw this up, a single query with the where pushed down as far as
possible into the subquery would probably have a greater chance of
success.

But the only real solution would be for Firebird to add something like
LIST(... ORDER BY ...), which it currently doesn't have.

Mark