I'm using Postgres 9.5 on Amazon RDS (2vCPU, 8 GB RAM).
I use pganalyze to monitor my performance.
I have around 200K records in the database.
In my Dashboard I see that the following queries take 28 and 11 seconds on average to execute:
UPDATE calls SET ... WHERE calls.uuid = ? telephonist 28035.41 0.01 100% 0.03%
UPDATE calls SET sip_error = ? WHERE calls.uuid = ? telephonist 11629.89 0.44 100% 0.69%
I already tried VACUUM; it found and cleaned 7,670 dead rows.
Any ideas how to improve UPDATE performance? This is the query:
UPDATE calls SET X=Y WHERE calls.uuid = 'Z'
How can I improve the query above? Can I add another field? Example:
UPDATE calls SET X=Y WHERE calls.uuid = 'Z' AND calls.campaign = 'W'
The column uuid is not indexed. https://www.tutorialspoint.com/postgresql/postgresql_indexes.htm suggests that indexes are not recommended for UPDATE operations.
CREATE TABLE public.calls (
id int4 NOT NULL DEFAULT nextval('calls_id_seq'::regclass),
callsid varchar(128),
call_start timestamp(6) NOT NULL,
call_end timestamp(6) NULL,
result int4 DEFAULT 0,
destination varchar(256),
campaign varchar(128),
request_data varchar(4096),
uuid varchar(128) NOT NULL,
status varchar(64),
duration int4,
recording_file varchar(256),
recording_url varchar(256),
recording_duration int4,
recording_text varchar(4096),
recording_download bool DEFAULT false,
description varchar(4096),
analysis varchar(4096),
is_fax bool DEFAULT false,
is_test bool,
hangup_cause varchar(128),
media_detected bool DEFAULT false,
sip_callid varchar(256),
hangup_cause_override varchar(256),
is_blacklisted bool DEFAULT false,
sip_error varchar(256),
hangup_cause_report varchar(128),
summary varchar(1024)
);
EXPLAIN ANALYZE
SELECT * FROM calls
WHERE calls.uuid='e2ce9eb4-v1lp-p14u-7kkk-lruy-e2ceaae46d';
Seq Scan on calls  (cost=0.00..16716.25 rows=1 width=3301) (actual time=81.637..81.637 rows=0 loops=1)
  Filter: ((uuid)::text = 'e2ce9eb4-v1lp-p14u-7kkk-lruy-e2ceaae46d'::text)
  Rows Removed by Filter: 99970
Planning time: 0.482 ms
Execution time: 81.683 ms
Best Answer
Assuming the column uuid is supposed to be UNIQUE, a revised table definition with a UNIQUE constraint on uuid should save some space and help performance. The most important feature here is the UNIQUE constraint, which is implemented with a unique index, and an index is what you need more than anything else (like @ypercube already commented).
If uuid is not unique, create a plain btree index on it.
If the values are not valid UUIDs, leave the column as a character type (varchar or text), but still create that index. There are also size and performance considerations for the data type varchar vs. uuid.
All my other suggested changes are minor improvements.
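Either option above can be applied to the existing table directly; the constraint and index names below are illustrative, not prescribed by the answer:

```sql
-- Option 1: if uuid is meant to be unique, the UNIQUE constraint
-- creates the supporting btree index automatically.
ALTER TABLE public.calls ADD CONSTRAINT calls_uuid_uni UNIQUE (uuid);

-- Option 2: if duplicates can occur, a plain btree index
-- still supports the WHERE calls.uuid = ? predicate.
CREATE INDEX calls_uuid_idx ON public.calls (uuid);
```

Only one of the two is needed: the constraint if the values must be unique, the plain index otherwise.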
If you don't need to enforce a particular maximum length, I would just use text for all your character columns. But that has hardly any immediate effect on performance. Some of the columns might be converted to a more fitting type (with an actual performance benefit).
Indexes and UPDATE
So why does that tutorial page say indexes are not recommended for UPDATE operations? That's misleading by omission. You desperately need the one index on uuid to support the predicate of your UPDATE. All other indexes slow the update down, because they require additional work to keep them current after each update. So if you have no use for the PRIMARY KEY on id (to allow FK constraints to it, for instance), you can drop it and make uuid the PK instead - which is also indexed automatically.
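A sketch of that primary-key swap, assuming a primary key named calls_pkey exists on id (the posted CREATE TABLE does not declare one) and that no foreign keys reference calls.id:

```sql
-- Sketch only: assumes the default PK name and no FK references to calls.id.
ALTER TABLE public.calls DROP CONSTRAINT calls_pkey;
ALTER TABLE public.calls ADD PRIMARY KEY (uuid);  -- also creates the unique index on uuid
```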