Postgresql – Postgres copy data with \xYY as plain string instead of interpreting as encoded string

bulkcopyencodingimportpostgresql

I have a log file full of URLs, generated by Bro IDS.

When Bro logs a URL with non-ascii characters in it, it inserts \xYY where YY is the hexadecimal character code. Also, some URLs contain "\x".

Is there a setting or flag I can use with the COPY or \copy command to stop postgres from trying to interpret these string sequences and just put them in the text field as is?

Best Answer

The backslash \ is the default escape character in the (default) text format of COPY. The format you describe is recognized as (quoting the manual here)

backslash sequences are recognized by COPY FROM
...
\xdigits | Backslash x followed by one or two hex digits specifies the character with that numeric code

.. which seems to be just as intended.
To avoid this effect you could use the CSV format instead, where the backslash has no special meaning:

COPY tbl FROM '/path/to/file/log.sql' (FORMAT csv);

Be sure to read the manual to learn about additional differences.

Alternatively you could replace every \ with \\ in your file with a sed script or something.

If the above doesn't solve your problem, please supply:

  • PostgreSQL version
  • Database encoding
  • Demo data
  • Definition of the table you COPY to
  • The exact COPY invocation you used