PostgreSQL Pattern Matching – regexp_replace() vs. replace()

pattern matchingpostgresqlreplace

By default, in PostgreSQL, regexp_replace() replaces the first occurrence of a substring in a string, while replace() replaces all occurrences of a substring in a string. Why is the default behavior different?

(I know that with regexp_replace one can use the g option to replace all occurrences of a substring in a string.)

Examples:

SELECT regexp_replace('hello world', 'o', 'z');  -- returns "hellz world"
SELECT regexp_replace('hello world', 'o', 'z', 'g'); -- returns "hellz wzrld"
SELECT replace('hello world', 'o', 'z'); -- returns: "hellz wzrld"

Best Answer

Why?

Because Postgres replace() is a standard SQL function that works the same as in other RDBMS. Example: replace() in SQL Server:

Replaces all occurrences of a specified string value with another string value.

While regexp_replace() is used to ...

Replace substring(s) matching a POSIX regular expression.

The handling of regular expressions is obviously guided by the POSIX standard and works the same as other tools implementing it. The manual:

POSIX regular expressions provide a more powerful means for pattern matching than the LIKE and SIMILAR TO operators. Many Unix tools such as egrep, sed, or awk use a pattern matching language that is similar to the one described here.

Related Solutions

PostgreSQL – String Replace Using Concatenated Columns

Depending on how you would want to deal with possible NULL values, concat_ws() is probably your safest and simplest way to go:

UPDATE tbl
SET filed1 = replace(field1, concat_ws(' ', field2, field3, field4), '')
WHERE filed1 IS DISTINCT FROM replace(field1, concat_ws(' ', field2, field3, field4), '')

concat_ws() ignores NULL values. With plain concatenation (||), one NULL field would make the whole pattern NULL. It was introduced with Postgres 9.1.

The added WHERE clause prevents empty updates. This enhances performance a lot if many rows wouldn't change anyway.

PostgreSQL Aggregate Functions – Longest Matching Suffix

What is wrong with the (maybe too obvious?):

select * from noun n, noun_inflection ni
where
  n.label = 'handlebar' and
  n.label ilike '%'||ni.label_singular
order by
  char_length(ni.label_singular) DESC
limit 1;

Best Answer

Why?

Related Solutions

PostgreSQL – String Replace Using Concatenated Columns

PostgreSQL Aggregate Functions – Longest Matching Suffix

Related Question