Subject Re: [ib-support] Re: UDF Failure
Author Claudio Valderrama C.
""alex_vnru"" <ded@...> wrote in message
> Claudio, why not use the maximum possible buffer size at compilation
> time if the sum of the data declarations exceeds it? And check for
> overflow at execution time against the real data size?

I personally won't make the internal code a mess just to please you, sorry.
There would be too many places to mangle. But you may want to sponsor
someone that wants to do that, of course. Assignments are checked at
run-time, but maximum capacity is checked at compile time, or we would need
lots of internal tests.

> Note that FB allows a table to hold chars larger than the declared
> size (after altering a column to a smaller size).

Example, please, using normal DDL.

> If we concentrate on declarations only, what will occur at
> execution time? Will such a table become unusable?

If the person wants to change system tables directly or use one of those
funky tools that fiddle directly with system tables (here I applaud Martijn
that mostly doesn't do that with his tool), then almost anything can happen.
But if the person wants to make a table unusable for himself this way, let's
please him. There's no sillier person than the one sabotaging himself,
unless we are speaking about a beta tester or a security researcher.

> create table t (a varchar(10), b varchar(20))
> insert into t (a,b) values ('1234567890', '0987654321')
> update t set a = b
> shouldn't it work? Declaration of B is greater than declaration of A.

Is there some genetic reason why you can tackle complex SW design but can't
grasp a simple idea? The engine knows (or can potentially know) the
semantics of everything that's built-in, because it's in the design.
CHAR, VARCHAR, UPDATE, etc., are not plug-ins but internal code.
However, UDFs are the programmer's thing and they can do anything. How do you
know if a UDF that declares a return of cstring(100) will use all its space?
Your previous example
is fine if you know what the UDF does. For the engine to reach your level of
human knowledge and synthesis, it would have to infer behavior from the
UDF's name and otherwise throw error "can't guess behavior". What if I write
the name in Spanish? Oh, well, we put a translator in the engine and failing
that, the engine asks the user if it's okay to assume that such a function
can never return a string longer than its argument, just to optimize the
case. We could design a language to express the constraints a UDF has. Would
someone want to use or develop UDFs with such complex requirements?
Otherwise, FB would have to analyze the sources or run the compiled
code in a sandbox beforehand. Do you want the engine to include full
Artificial Intelligence or a sandbox like an antivirus? Furthermore, I can
name a function TRIM but have it do padding internally; it's all up to me to
give it a meaningful name. Hence, the only safe assumption is that if the
UDF says it returns cstring(100), this is the worst case. You should cater
for the worst case, not for the better one only to crash royally afterwards.

Your "update" example proves nothing regarding buffer limitations for me.
Let's not mix topics. Both your fields are withing sensible and legal
limits. The engine checks for assignments at run-time, but that's different
than exceeding a buffer. When you exceed the buffer, overrun already
happened. When counters already wrapped, you may not realize such fact,
because they are unsigned. Using your example, but modified:

create table t (a varchar(10), b varchar(32765));

This statement won't compile, because it's invalid syntax:
insert into t (a,b) values (a, a || b || b);

This one is rejected due to buffer limits, too:
insert into t (a,b) select a, a || b || b from t;

This statement is accepted:
insert into t (a,b) select a, b || b from t;

This statement is rejected due to buffer limits and it's not a bug:
select a, b || b from t;

It's not inconsistent WRT the previous example, because a SELECT that has to
return rows to DSQL uses a couple of flags (message parameters for NULL,
record version, dbkey and EOF) that add to the maximum row length, whereas
the insert from a select handles those conditions internally; it doesn't
return to DSQL to insert a table's records into itself (that would be an
infinite loop if my example table had some rows).

This example won't pass buffer limits check:
update t set a = a || b || b;

We were speaking about buffer limits. Once the UDF has returned, even if
it's declared cstring(100), it may produce a cstring(1), and it will be
handled dynamically the same as in your "update": if it fits in the target
variable, it will be assigned; otherwise a run-time "truncation error"
happens. There's a rational difference between the maximum size handled and
the dynamic sizes within the allowed limits (which has always worked). The
maximum static size is now 64K (and has been lifted for concatenation, too),
but that doesn't negate the fact that the engine should continue enforcing
this limit at compile time, because at this time the internal processing
buffers (called impure areas) are created. Almost everything was checked,
except concatenation. Then we discovered that a lot of users practice engine
abuse. Well, if you know that your concatenation will produce a value that's
not representable in the system tables and not representable in the external
DSQL API, then it's your responsibility to manage your Frankenstein
properly.

Claudio Valderrama C.
Independent developer
Owner of the Interbase® WebRing