COPY
runs as the PostgreSQL server, which usually runs as NETWORKSERVICE
on Windows. This is a local service account that does not have access to your user account's network login credentials for shared drives.
Use the psql
command \copy
, which reads the file from the client application, running under your user ID and with access to your shared drives, then streams it over the client/server network connection to the postgres server.
I think PgAdmin-III has a similar CSV import command that uses COPY ... FROM STDIN
like psql's \copy
does, too.
If you're on a Windows AD Domain and you really need PostgreSQL to have direct access to shares, you can run the PostgreSQL server under a domain account instead. But that's usually unnecessary.
What you're trying to do is normally referred as an ETL (Extract, Transform, Load) process.
You want to do the transformation within the database. This is normally possible by using an intermediate table, where you store (as text, most probably) the data from STDIN
via libpq
and COPY
.
COPY intermediate_table (orig_col1, orig_col2)
FROM stdin
WITH FORMAT CSV ; -- Choose your parameters
Then transfer it to the second table by means of
INSERT INTO
dest(col1, col2)
SELECT
f1(orig_col1, orig_col2), f2(orig_col1, orig_col2)
FROM
intermediate_table ;
That is, you have some transforming functions on the original data to produce the data you want to store. For instance f1
could just be your TO_TIMESTAMP(orig_col_1 ::double precision / 100000)
.
After transforming the data from the intermediate table(s), it is common to truncate
them (or drop
them, and recreate them next time).
Best Answer
As with all open source projects, the internals of the Postgres
COPY
statement can be easily discovered. As one can read in the source code comments, Postgres attempts to use batch inserts by default, except in the cases where it is not possible:BEFORE
orINSTEAD OF
triggers on the table.DEFAULT
expressions.WHERE
clause.If none of the above is true, insert will happen in batches of up to 1000 rows (the default can be changed by recompiling the Postgres server).
For partitioned tables the decision to use batch inserts is made for each partition individually.
While most of the
COPY
processing happens on the server, the client (psql
) responsibility is to send data from a local file one 8K buffer at a time, which is evident from thepsql
\copy
command handler source.