Basics:
- The 1st row of the CSV file has column names of defined format.
- The
PROGRAM
clause of COPY
and GET DIAGNOSTICS
after COPY
require Postgres 9.3+.
format()
requires Postgres 9.1+
- This works with pure standard Postgres - except for the
head
command that the shell is expected to provide. For Windows versions consider:
Full automation
This function copies any table structure completely dynamically:
CREATE OR REPLACE FUNCTION f_dynamic_copy(_file text
, _tbl text = 'tmp1'
, _delim text = E'\t'
, _nodelim text = chr(127)) -- see below!
RETURNS text AS
$func$
DECLARE
row_ct int;
BEGIN
-- create staging table for 1st row as single text column
CREATE TEMP TABLE tmp0(cols text) ON COMMIT DROP;
-- fetch 1st row
EXECUTE format($$COPY tmp0 FROM PROGRAM 'head -n1 %I' WITH (DELIMITER %L)$$ -- impossible delimiter
, _file, _nodelim);
-- create actual temp table with all columns text
EXECUTE (
SELECT format('CREATE TEMP TABLE %I(', _tbl)
|| string_agg(quote_ident(col) || ' text', ',')
|| ')'
FROM (SELECT cols FROM tmp0 LIMIT 1) t
, unnest(string_to_array(t.cols, E'\t')) col
);
-- Import data
EXECUTE format($$COPY %I FROM %L WITH (FORMAT csv, HEADER, NULL '\N', DELIMITER %L)$$
, _tbl, _file, _delim);
GET DIAGNOSTICS row_ct = ROW_COUNT;
RETURN format('Created table %I with %s rows.', _tbl, row_ct);
END
$func$ LANGUAGE plpgsql;
Call variants:
SELECT f_dynamic_copy('/path/to/file.csv');
SELECT f_dynamic_copy('/path/to/file2.csv', 'tmp_file2');
SELECT f_dynamic_copy(_file => '/path/to/file2.csv'
, _tbl => 'tmp_file2');
, _delim => E'\t'); -- using assignment operator since pg 9.5
Answer:
Created table tmp_file2 with 123 rows.
Before the main COPY
, run a preliminary COPY ... TO tmp0
to fetch the first row with column names, which are expected to be unquoted, case-sensitive strings like COPY ... TO ... (FORMAT csv, HEADER)
would export them.
The structure of the actual target table is derived from it, all columns with data type text
. The default name of the resulting table is tmp1
- or provide your own as 2nd function parameter.
Then COPY
is executed. The default delimiter is a tab character - or provide your delimiter as 3rd function parameter.
Use any single-byte character for the non-delimiter _nodelim
which does not appear in the first line of your CSV file. I am arbitrarily picking the control character "Delete" (ASCII 127). That character would be swallowed here on SO, so I generate with chr(127)
instead, which is also valid. Assuming the character won't pop up - or provide your non-delimiter as 4th function parameter.
The function returns table name and number of imported rows.
Remember, a temporary table dies with the end of the session.
The manual:
Executing a command with PROGRAM
might be restricted by the operating
system's access control mechanisms, such as SELinux.
Related answer on SO:
Postgres 8.4
That version is too old, I am not going to back-port that far.
GET DIAGNOSTICS
is an optional feature. You can just leave it away or replace it with a full count on the table
A primitive (expensive) alternative for the PROGRAM
clause of COPY
in pg 9.3 would be to import the complete table instead:
EXECUTE format($$COPY tmp0 FROM %L WITH (DELIMITER %L)$$, _file, _delim);
Or you prepare a second input file, or you can make it work by piping from the shell: COPY tablename FROM STDIN
is available in pg 8.4.
format()
can be replaced with plain string concatenation. Be wary of SQL injection though!
Removing the explicit cast to tstrange
worked. The new CSV file looks like
t_range|field1
'[2014-06-02 07:00:00,infinity)'|'sometext1'
'[2014-06-02 07:00:00,infinity)'|'sometext2'
It seems to me that \copy
is fine with automatically attempting to cast to the appropriate datatypes, but the explicit cast causes it to choke.
One can also simplify the CSV file a bit in this case by removing the quotes:
t_range|field1
[2014-06-02 07:00:00,infinity)|sometext1
[2014-06-02 07:00:00,infinity)|sometext2
and modifying the import command:
\copy my_table FROM 'my.csv' DELIMITER '|' CSV HEADER;
Best Answer
If format is a string, the string is case sensitive,
But, the format doesn't have to be a string,
You're probably use to using the non-stringified version. The syntax for the
copy_options
is pretty hectic,A lot of them only accept strings, but some of them accept regular tokens which get lowercased.
This specific thing isn't documented explicitly but you can find more information about
COPY
here