PostgreSQL insert into table (not origin) based on a condition on fields on different tables

Lets say you have 4 different tables:

table_1 is a table holding the original data:

table_1

id1  |  id2  |  score1  |  score2  |  name_1         |  name_2
-----+-------+----------+----------+-----------------+---------
123  | 89898 | 0.5      | 0.8      |Jack             |Joe
129  | 89999 | 0.55     | 0.3      |Chris            |Don
150  | 90098 | 0.8      | 0.1      |Susan            |Anna
170  | 99898 | 0.7      | 0.14     |Ben              |Lisa

table_2 is a table where we would like to insert fields from table_1 (id1, id2, score1,score2) plus a comment string based on a condition on table_3 and table_4 (where both fields name_1 and name_2 are empty)

table_2 (the result table) should look like:

id1  |  id2  |  score1  |  score2  |  comment_string
-----+-------+----------+----------+-----------------
129  | 89999 | 0.55     | 0.3      | 'removed_because_no_value_in_fields_name_1_and_name_2'

table_3 and table_4 are tables which are holding data of two colomns from table_1 and some rows that have been modified previously. The modified fields are empty. The amount of rows is the same like in the origin table table_1.

table_3:

name_1  |
--------+
Jack
[empty]
Susan
Ben

table_4:

name_2  |
--------+
Joe
[empty]
Anna
Lisa

Note:
[empty] means that the field is empty. ids are not representing primary keys. The result table (table_2) should only contain the fields id1, id2, score1,score2 plus a comment string if the fields with the same row number in table_3 and table_4 are empty.

How to achieve this using PostgresSQL 9.1?

INSERT INTO table2 (id1, id2, score1, score2, comment_string) SELECT t1.id1, t1.id2, t1.score1, t1.score2 ,CASE WHEN t3.name_1 IS NULL AND t4.name_2 IS NULL THEN 'removed_because ...' ELSE END AS comment_string FROM table_1 t1 LEFT JOIN table_3 t3 USING (name_1) LEFT JOIN table_4 t4 USING (name_2) ORDER BY id1; -- undeclared in Q

Best Answer

You seem to be under the impression that some kind of automatic "row number" would exist. That is not the case. Unlike rows in a spreadsheed, tables in a relational database have no natural order.

This query should do the job, but it relies on the the contents of name_1 and name_2 to make the connection. If you rely on a row number, you have to add an actual column for that.

Based on the assumption that table_3.name_1 and table_4.name_2 are unique. Else, the query could create a "proxy cross join", possibly multiplying rows, if there are several matches.
More about this caveat in this related answer on SO.

Best Answer

Related Solutions

Postgresql – How to insert binary data into a PostgreSQL BYTEA column using libpqxx

Postgresql – SPI postgresql – insert data into table from postgres.c

Related Question