Subject Re: ODP: [firebird-support] substring similar - "Invalid SIMILAR TOpattern"
Author Mark Rotteveel
On 5-12-2018 18:00, Karol Bieniaszewski liviuslivius@...
[firebird-support] wrote:
> Is this sql standard concept that i must do this in this crap way?

Yes, it is specified in SQL:2016, section 6.32 <string value function>",
"<regular expression substring function>".

> SELECT substring('ab11c' similar '[[:ALPHA:]]+#"[0-9]+#"[[:ALPHA:]]+'
> escape '#') FROM RDB$DATABASE
>
> Also strange that i must consume whole string by reg expression not only
> part of it.

The syntax defines which part you want to obtain in terms of its
position in the rest of the string.

> This can be as an option but as default it is strange for me.
>
> Why not simply do:
>
> SELECT substring('ab11c12bcd' similar '[0-9]+' itemNumber 1) FROM
> RDB$DATABASE
>
> Will simply return „11”
>
> SELECT substring('ab11c22bcd' similar '[0-9]+' itemNumber 2) FROM
> RDB$DATABASE
>
> Will simply return 22

"Why not simply" because that is not the behavior defined by the
standard for this specific function.

The SQL:2016 standard also has "<regex substring function>"
(SUBSTRING_REGEX) and a number of related functions like LIKE_REGEX,
OCCURRENCES_REGEX, TRANSLATE_REGEX and POSITION_REGEX, which use the
XQuery fn:matches() regex syntax.

This function allows you to specify the occurrence and capturing group
to return (and some more things like start position in string), but
Firebird doesn't provide this yet.

> Now i have 2 udf like this:
>
> REG_MATCH
>
> REG_MATCH_COUNT
>
> and i supposed that i can replace it with built in one, but i see that
> this is really terrible.
>
> Above udfs i can use in this way
>
> SELECT REG_MATCH(‘ab11c22bcd’, ‘[0-9]+’, 1) FROM RDB$DATABASE
>
> Return 11
>
> SELECT REG_MATCH(‘ab11c22bcd’, ‘[0-9]+’, 2) FROM RDB$DATABASE
>
> Return 22
>
> SELECT REG_MATCH_COUNT(‘ab11c22bcd’, ‘[0-9]+’) FROM RDB$DATABASE
>
> Return 2
>
> I can use it in the where clause:
>
> SELECT * FROM MY_TABLE T WHERE REG_MATCH_COUNT(T.FIELD, ‘[0-9]+’)>2
>
> Or
>
> SELECT * FROM MY_TABLE T WHERE REG_MATCH(T.FIELD, ‘[0-9]+’, 1)=’11’

If I understand the SQL:2016 SUBSTRING_REGEX correctly, the equivalent
for that would be SUBSTRING_REGEX('[0-9]+' IN T.FIELD) or - explicitly
specifying the occurrence - SUBSTRING_REGEX('[0-9]+' IN T.FIELD
OCCURRENCE 1).

The equivalent of that REG_MATCH_COUNT would be
OCCURRENCES_REGEX('[0-9]+' IN T.FIELD)

Unfortunately we don't have that yet in Firebird.

Mark
--
Mark Rotteveel