I have a table with approximately 2.1 million tuples. Which has latitude and longitude columns. I am trying to convert this into a geographic type (point with SRID).
The function(procedure) I have written, works fine when I limit the entries (say : SELECT id,longitude,latitude FROM list_of_location limit 50
).
CREATE OR REPLACE FUNCTION convertlatlon() RETURNS VOID AS $$
DECLARE rec RECORD;
BEGIN
FOR rec IN SELECT id,longitude,latitude FROM list_of_location
LOOP
UPDATE list_of_location SET location= concat('SRID=4326;POINT(',rec.longitude,' ',rec.latitude,')') WHERE id=rec.id;
END LOOP;
END;
$$ LANGUAGE 'plpgsql' ;
- When I try to run it on the entire table, PostgreSQL seems to do nothing. Have waited for an hour and a half.
- Consumes 99% of CPU on the core it is running.
- Does not spring any other instance of PostgreSQL to utilize other cores(since the request is from a single user?).
- Is this because of locks(row level)?
- How to circumvent this?
Best Answer
Running a DML statement inside a loop is never a good idea. You are multiplying the amount of work to be done. Relational databases are best when operating on sets, when you do a loop you are operating on a single row at a time.
You can achieve the same by doing the update in a single statement:
I'm not 100% about the syntax for calculating the actual point, as I don't use geometric stuff, but I guess you'll get the idea.