I've got an odd db design in place that was here when I got on my new work space. We have a column named params
which contains some string data which appears that it belongs in more than one column, this is the example value of column params :
"---
:body: "Hello\r\nContacting you that I'm ok with your plan.\r\nPlease send me regular updates\r\nThanks"
:emp_id: 4818
"
So I was planning to separate body
and emp_id
in their own columns so that this table can behave like normal one so we can query the data like normal. Having the data in this format is very hard to query, yet they have silly queries to get/match values.
Anyways I wrote some of my regexes to extract those values and save them to these colums (body) :
select a[1]
from (
select regexp_matches(params, ':emp_id: ([0-9]+)', 'g') a
from exchange where id = 483193
) s
for emp_id
select b[1]
from (
select regexp_matches(params, 'body: (.+):emp', 'g') b
from exchange where id = 483193
) s
Where id is an id of one exchange. I want to do this for all rows, I tried something but was spinning in the circles, this looked to be most promising but doesn't work :
update exchange
set emp_id = (
select a[1]
from (
select regexp_matches(parameters, ':emp_id: ([0-9]+)', 'g') a
from exchange where id in (select id from exchange)
) s
) where
How can I update all rows? To fill the new columns emp_id
and body
from existing values in params
?
Update:
I've got a massive performance issue here. My db has 500k+ records. And doing update like this :
UPDATE exchange
SET emp_id=cast(subquery.a[1] as int)
FROM (select regexp_matches(params, ':emp_id: ([0-9]+)', 'g') a
from exchange) AS subquery
Takes really long time. Can this be optimized in some manner?
Best Answer
The solution is very simple. If you do
it will do exactly what you want: pick the
params
from the row to be updated, calculate the expression and updateemp_id
with its value. As there is noWHERE
clause, it will go through the whole table. You don't need the ID of the row or anything else.