Postgresql – using regexp substitution part in postgres string functions

postgresqlregular expression

I hit this question trying to answer one in SO. When I pass \1 as first argument to lpad(string text, length int [, fill text]) function, it looks like it interprets \1 as regexp substitution part looking for string to pad chars, but then interprets same \1 as text (it is literally – '\1') calculating its length to substract from second argument: lpad('aaa',5,'b') calculates 5-length('\1'), not 5-length('aaa').

So the question is – are we meant to use SQL string functions with regexp substitution part meta syntax? What are the restrictions?
Eg: concat, format, repeat, substr understand it, while length, initcap, ascii interpret it as just text.

Now I see that length, initcap, ascii accept string as argument, while format, repeat accept text, but then ltrim(text... and lpad(text... seem to be mix – they "understand" regexp metasyntax to find position and then fail to "convert string to text" to count input length?..

Here an example:

t=# with a as (select 'a 2 b'::text s)
t-# select
t-#   regexp_replace(s,' (\d){1} ','')
t-# , regexp_replace(s,' (\d){1} ','\1') _regular
t-# , regexp_replace(s,' (\d){1} ',concat('.','\1','.')) concat_regular
t-# , regexp_replace(s,' (\d){1} ',length('.\1')::text) length_string
t-# , regexp_replace(s,' (\d){1} ',reverse('\1')) reverse_string
t-# , regexp_replace(s,' (\d){1} ',initcap('\1')) initcap_broken
t-# , initcap(regexp_replace(s,' (\d){1} ','\1')) initcap_orig
t-# , regexp_replace(s,' (\d){1} ',ltrim('\1','a')) ltrim_broken
t-# , regexp_replace(s,' (\d){1} ',lpad('\1',5,'.')) lpad_broken
t-# , regexp_replace(s,' (\d){1} ',ascii('\1')::text) ascii_broken
t-# , regexp_replace(s,' (\d){1} ',format('\1')::text) format_regular
t-# , regexp_replace(s,' (\d){1} ',repeat('\1',3)) repeat_regular
t-# from a
t-# ;
 regexp_replace | _regular | concat_regular | length_string | reverse_string | initcap_broken | initcap_orig | ltrim_broken | lpad_broken | ascii_broken | format_regular | repe
at_regular
----------------+----------+----------------+---------------+----------------+----------------+--------------+--------------+-------------+--------------+----------------+-----
-----------
 ab             | a2b      | a.2.b          | a3b           | a1\b           | a2b            | A2b          | a2b          | a...2b      | a92b         | a2b            | a222
b
(1 row)

Time: 0.420 ms

Best Answer

You got it all wrong.
As in any other function execution, the regexp function parameters are evaluated before the execution and not during the execution.

This are the replace strings you are actually using:

with a as (select 'a 2 b'::text s)
select
     ''                     as empty_string
    ,'\1'                   as _regular
    ,concat('.','\1','.')   as concat_regular
    ,length('.\1')::text    as length_string
    ,reverse('\1')          as reverse_string
    ,initcap('\1')          as initcap_broken
    ,'\1'                   as initcap_orig
    ,ltrim('\1','a')        as ltrim_broken
    ,lpad('\1',5,'.')       as lpad_broken
    ,ascii('\1')::text      as ascii_broken
    ,format('\1')::text     as format_regular
    ,repeat('\1',3)         as repeat_regular
from a

+--------------+----------+----------------+---------------+----------------+----------------+--------------+--------------+-------------+--------------+----------------+----------------+
| empty_string | _regular | concat_regular | length_string | reverse_string | initcap_broken | initcap_orig | ltrim_broken | lpad_broken | ascii_broken | format_regular | repeat_regular |
+--------------+----------+----------------+---------------+----------------+----------------+--------------+--------------+-------------+--------------+----------------+----------------+
|              | \1       | .\1.           | 3             | 1\             | \1             | \1           | \1           | ...\1       | 92           | \1             | \1\1\1         |
+--------------+----------+----------------+---------------+----------------+----------------+--------------+--------------+-------------+--------------+----------------+----------------+