PostgreSQL – trim values on COPY from CSV

copy postgresql

I used csvsql to generate CREATE statements, and it looks like it assumed trimmed fields, though some values in the data are padded with spaces:

ERROR:  value too long for type character varying(11)
CONTEXT:  COPY example, line 1314, column state: " South Korea"

I'm not entirely sure why the data has padding. It definitely shouldn't, but about half the values in the state column have some padding around them.

I could use ALTER to make more space in the field, and then run UPDATE example SET state=trim(state); after I COPY the csv in, but I'm wondering if there's a clean way to just trim the padding as I import the data?

Best Answer

The --no-constraints option to csvsql, which generates the schema without length limits or null checks, may be the simplest way to avoid the error.

Otherwise, COPY has a PROGRAM clause that can pre-process the data. The command has to read the CSV file itself (or have it piped in as part of the shell command) and must write the filtered result to its standard output, which COPY then reads.

https://www.postgresql.org/docs/current/static/sql-copy.html

PROGRAM

A command to execute. In COPY FROM, the input is read from standard output of the command, and in COPY TO, the output is written to the standard input of the command.

When called from psql with \copy tablename FROM PROGRAM ..., the program is a client-side command launched by psql. When COPY is used, it's a server-side command run by the backend.

But the trimming itself is non-trivial, because a naive text substitution can mangle quoted fields that contain commas or spaces. For instance, see "How to manipulate a CSV file with sed or awk?" and "Trim leading and trailing spaces from a string in awk".
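
As an alternative to sed or awk, a small filter built on a real CSV parser avoids the quoting pitfalls. The following is only a sketch, not something from the PostgreSQL or csvkit documentation: the script name trim_csv.py, the function name trim_csv and the file name data.csv are hypothetical, and it assumes comma-delimited input with standard double-quote quoting.

```python
import csv
import sys


def trim_csv(src, dst):
    """Copy CSV rows from src to dst, stripping whitespace around every field."""
    reader = csv.reader(src)
    # Use "\n" line endings so the output is plain Unix-style CSV.
    writer = csv.writer(dst, lineterminator="\n")
    for row in reader:
        # The csv module re-quotes fields that contain commas or quotes.
        writer.writerow([field.strip() for field in row])


# Hypothetical command-line use: python3 trim_csv.py data.csv
# The file is only processed when a path is given as the first argument.
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], newline="") as f:
        trim_csv(f, sys.stdout)
```

From psql, this could then be wired in with something like \copy example FROM PROGRAM 'python3 trim_csv.py data.csv' WITH (FORMAT csv, HEADER). One caveat: Python's csv module does not distinguish a quoted empty string from an unquoted empty field, so if the import relies on that difference to separate empty strings from NULLs, this filter would need adjusting.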