I have 2 tables that looks like this:
Table A:
CREATE TEMP TABLE table_a (
Column_1 text,
ID_number int
);
INSERT INTO table_a VALUES
('foo,bar,baz', 123),
('qux,quux,quuz',456),
('corge,grault,garply',789),
('qux,bar,grault', 101);
Table B:
CREATE TEMP TABLE table_b (
Column_1 text,
Column_2 text,
ID_number int
);
INSERT INTO table_b VALUES
('foo','baz',null),
('qux','quzz',null),
('corge','garply',null);
I'm trying to copy across values from the ID_number column in Table A, where the values in Column 1 & 2 of table B can be found in the same row of Column 1 in Table A.
This is the kind of thing I was thinking of:
UPDATE table_b AS B
SET id_number = A.id_number
FROM table_a AS A
WHERE A.column_1 LIKE B.column_1
AND A.column_1 LIKE B.column_2
.. but obviously this doesn't work.
How can I translate this into a proper query?
Additional info
table_a.Column_1
contains UK addresses, for example:
'47 BOWERS PLACE, GREAT YARMOUTH, NORFOLK, NR20 4AN'
In table_b
I have the first line of the address in Column_1
(so, '47 BOWERS PLACE'
) and the postcode ('NR20 4AN'
) in Column_2
.
I thought it would be best to simplify things, but maybe the actual data has some relevance in this situation.
table_a
has about 30 million addresses. table_b
has around 60k rows.
Performance is relevant, the faster this runs the better, and it will likely be repeated in the future.
Best Answer
Assuming Postgres 9.6, performance is relevant, big tables, "words" composed of characters, no whitespace or punctuation, no stemming or stop words, no phrases, all columns
NOT NULL
.Full Text search backed by an index should be among the fastest solutions:
With a matching expression index on
a.column_1
: