Importing CSV into multiple tables using PostgreSQL


I have a very big CSV file with the following format:

TAG,TIME,VALUE

as an example row:

footag,2014-06-25 08:00:00.0,3400.0

I used to import it easily into PostgreSQL 9.3 with:

COPY datapoints FROM '/home/foo/my.csv' DELIMITER ',' CSV;

where datapoints is a table with the three corresponding columns.

Now the database I need to import into has changed: it has two tables. Here are the \d+ outputs:

                                          Table "public.tags"
 Column |       Type        |                     Modifiers                     | Storage  | Stats target | Description
--------+-------------------+---------------------------------------------------+----------+--------------+-------------
 id     | integer           | not null default nextval('tags_id_seq'::regclass) | plain    |              |
 tag    | character varying | not null                                          | extended |              |
Indexes:
    "tags_pkey" PRIMARY KEY, btree (id)
    "tags_tag_idx" btree (tag)
Referenced by:
    TABLE "tag_values" CONSTRAINT "tag_values_tag_fkey" FOREIGN KEY (tag_id) REFERENCES tags(id)
Has OIDs: no

                                Table "public.tag_values"
 Column |            Type             | Modifiers | Storage | Stats target | Description
--------+-----------------------------+-----------+---------+--------------+-------------
 tag_id | integer                     | not null  | plain   |              |
 time   | timestamp without time zone | not null  | plain   |              |
 value  | double precision            |           | plain   |              |
Indexes:
    "tag_values_tag_time_idx" btree (tag_id, "time")
Foreign-key constraints:
    "tag_values_tag_fkey" FOREIGN KEY (tag_id) REFERENCES tags(id)
Has OIDs: no

So essentially, the original table has been split into two. As you can see, there is now a foreign key constraint between tags(id) and tag_values(tag_id). Also, the values in tags(tag) are supposed to be distinct. What is the easiest way to ingest the same .csv files into this structure?

Thanks!

Best Answer

I find the most reliable way to import CSV data into PostgreSQL is to create a staging table just for the import (you can drop it afterwards). There are often wonky values in some columns, so I make all the column types text and then convert them on insert into the "real" tables.
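A minimal sketch of that staging table for this CSV (csv_import is a hypothetical name):

-- All columns are text, so a malformed timestamp or number
-- cannot make the load itself fail.
CREATE TABLE csv_import (
    tag    text,
    "time" text,
    value  text
);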

The \copy command in psql (as distinct from the server-side COPY) makes it easy to load a local file into a remote server, and INSERT INTO ... SELECT makes it easy to move the rows into the final tables.
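Putting those together, a sketch of the full load, reusing the hypothetical csv_import table from above; the target table and column names come from the \d+ output:

-- \copy runs client-side, so the file only has to exist
-- on the machine where psql runs.
\copy csv_import FROM '/home/foo/my.csv' DELIMITER ',' CSV

-- Insert tags that are not present yet; the NOT EXISTS guard is
-- what keeps tags(tag) distinct, since tags_tag_idx is not unique.
INSERT INTO tags (tag)
SELECT DISTINCT i.tag
FROM csv_import i
WHERE NOT EXISTS (SELECT 1 FROM tags t WHERE t.tag = i.tag);

-- Resolve each row's tag to its id and cast the text columns
-- to the real types.
INSERT INTO tag_values (tag_id, "time", value)
SELECT t.id, i."time"::timestamp, i.value::double precision
FROM csv_import i
JOIN tags t ON t.tag = i.tag;

-- The staging table is no longer needed.
DROP TABLE csv_import;

On 9.5+ you could instead put a unique constraint on tags(tag) and use INSERT ... ON CONFLICT DO NOTHING, but the NOT EXISTS form works on 9.3.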