Data redaction with regex in Oracle

data-maskingoracleoracle-12c

I would like to redact data with regex in oracle database. I know the procedure but I get unpredictable results.

I use the following script to apply redaction to a certain column with card numbers.

BEGIN
DBMS_REDACT.ADD_POLICY ( 
   object_schema          => 'my_schema',
   object_name            => 'my_table',
   column_name            => 'card_number',
   policy_name            => 'redact_my_table', 
   function_type          => DBMS_REDACT.REGEXP,
   function_parameters    => null,
   regexp_pattern         => '(............)(.+)(........)',--DBMS_REDACT.RE_PATTERN_CC_L6_T4,
   regexp_replace_string  => '\1****\3', --dbms_redact.re_redact_ccn, --DBMS_REDACT.RE_REDACT_CC_MIDDLE_DIGITS,
   regexp_position        => DBMS_REDACT.RE_BEGINNING,
   regexp_occurrence      => DBMS_REDACT.RE_FIRST,
   regexp_match_parameter => 'i',
   policy_description     => 'Regular expressions to redact credit card numbers',
   expression             => 'SYS_CONTEXT(''USERENV'',''SESSION_USER'') != ''my_schema'''
   );
END;
/

Expected result would be first 6 and last 4 unmasked like 123456****1234, but I get this instead:

enter image description here

From the documentation I would suppose the below script would work by default, but it performs a full redaction instead of just masking some chars. I get a whitespace per row.

BEGIN
DBMS_REDACT.ADD_POLICY ( 
   object_schema          => 'my_schema',
   object_name            => 'my_table',
   column_name            => 'card_number',
   policy_name            => 'redact_my_table', 
   function_type          => DBMS_REDACT.REGEXP,
   function_parameters    => null,
   regexp_pattern         => DBMS_REDACT.RE_PATTERN_CC_L6_T4,
   regexp_replace_string  => DBMS_REDACT.RE_REDACT_CC_MIDDLE_DIGITS,
   regexp_position        => DBMS_REDACT.RE_BEGINNING,
   regexp_occurrence      => DBMS_REDACT.RE_FIRST,
   regexp_match_parameter => 'i',
   policy_description     => 'Regular expressions to redact credit card numbers',
   expression             => 'SYS_CONTEXT(''USERENV'',''SESSION_USER'') != ''my_schema'''
   );
END;
/

Database used is Oracle Database 12c Enterprise Edition Release 12.1.0.2.0

This are the constants in DBMS_REDACT package:

RE_PATTERN_CC_L6_T4        CONSTANT   VARCHAR2(33) := '(\d\d\d\d\d\d)(\d\d\d*)(\d\d\d\d)';
RE_REDACT_CC_MIDDLE_DIGITS CONSTANT   VARCHAR2(10) := '\1XXXXXX\3';

If i put the \d in regex patter i always get full redaction. Using dot i managed to get the scrambled data from the image.

If anyone could try here is also the drop statement:

exec dbms_redact.drop_policy('my_schema','my_table','redact_my_table');

Any ideas what am I doing wrong?

Best Answer

Thanks to @mustaccio's question got me thinking about datatype issues.

NVarchar2 seems to be the issue in this problem.

Creating a copy of the column in varchar2 fixed the issue. Data redaction on varchar2 is working as expected.