Postgresql – How to pattern match against Unicode characters by PostgreSQL `SIMILAR TO` expressions

pattern matchingpostgresqlunicode

In a PostgreSQL database, I have a column named metaid of type text whose value contains digits as specified in the third edition of XML 1.0 standard:

Digit ::= [#x0030-#x0039] | [#x0660-#x0669] | [#x06F0-#x06F9] | [#x0966-#x096F] | [#x09E6-#x09EF] | [#x0A66-#x0A6F] | [#x0AE6-#x0AEF] | [#x0B66-#x0B6F] | [#x0BE7-#x0BEF] | [#x0C66-#x0C6F] | [#x0CE6-#x0CEF] | [#x0D66-#x0D6F] | [#x0E50-#x0E59] | [#x0ED0-#x0ED9] | [#x0F20-#x0F29]

In the documentation of PostgreSQL for SIMILAR TO expressions, matching against such Unicode characters by their range is not mentioned.

I need to put a CHECK constraint on the column for ensuring the pattern of the entered value conforms to the required format.

How can I accomplish this?

Best Answer

Here is a check constraint:

CHECK (col ~ '^[0-9\u0660-\u0669\u06F0-\u06F9\u0966-\u096F]+$')

The pattern starts with "beginning of string" (^) and ends with "end of string" ($)and allows one or more "digits" in between.

^ and $ are necessary, otherwise the pattern would match abc123efg.