Subject Right Trimming Again
Author Geoff Worboys
Hi Folks,

I just discovered something that perhaps everyone already knows, but
it adds a new dimension to the discussion about whether trailing
whitespace characters should be treated as significant or not. I had
Claudio help confirm this, so it is not my imagination. Apparently,
if you want different behaviour, you need to go back to Interbase v4.


Run this script...
- - - - - - -
CREATE TABLE MYDATA
(
TESTCHAR CHAR(31),
TESTVARCHAR VARCHAR(31)
);
COMMIT;
INSERT INTO MYDATA (TESTCHAR, TESTVARCHAR)
VALUES( 'ABC', 'DEF' );
INSERT INTO MYDATA (TESTCHAR, TESTVARCHAR)
VALUES( 'GHI', 'JKL' );
INSERT INTO MYDATA (TESTCHAR, TESTVARCHAR)
VALUES( 'MNO', 'PQR' );
INSERT INTO MYDATA (TESTCHAR, TESTVARCHAR)
VALUES( 'UV ', 'XYZ ' );
- - - - - - -

You will agree I hope that the table now contains...

TESTCHAR TESTVARCHAR
'ABC ', 'DEF'
'GHI ', 'JKL'
'MNO ', 'PQR'
'UV ', 'XYZ '


Now try...


SELECT * FROM MYDATA
WHERE TESTCHAR = 'ABC'

Success!!! This returns the record, ignoring the trailing spaces!

You could argue that this is an illogical thing to do anyway. Since
you know the length of the fixed length field is greater than your
test data, how can it possibly be equal. But then you try...


SELECT * FROM MYDATA
WHERE TESTVARCHAR = 'XYZ'

Success!!! This returns the record, ignoring the trailing spaces!

Now this is really interesting, since the three spaces on the end of
the 'XYZ ' record should really be considered significant since they
were obviously specifically specified and not just added as padding by
IB itself. But still they have been ignored.


The only way to get an exact comparison seems to be to use LIKE.

SELECT * FROM MYDATA
WHERE TESTCHAR LIKE 'ABC'

and

SELECT * FROM MYDATA
WHERE TESTVARCHAR LIKE 'XYZ'

both fail (note the lack of wildcards). As Claudio explained it, LIKE
is more paranoid about the underlying data and makes the trailing
spaces significant.


So if you are building an application in which spaces are significant
BE VERY CAREFUL! If trailing spaces are significant in your data,
there is a good chance that '_' and '%' will be significant as well,
which makes using LIKE for exact comparisons rather more complex.
(You need to parse for the wildcards and insert an escape character
where required.) Perhaps building your own UDF for string comparisons
on such fields may be simpler.


This lack of trailing space significance to equality tests was news to
me (probably because I almost never treat trailing spaces as
significant). Having found this out, I thought that there may be a
few here that were also unaware of this issue.

Although you may not agree, to me this adds weight to the argument
that IBO should default to removing trailing spaces. It also adds
weight to the idea that whereever possible you should avoid using
trailing spaces as significant characters.

If you need binary data in a (VAR)CHAR field, use CHARACTER SET
OCTETS.


Geoff Worboys
Telesis Computing