Made some progress with the optimization, I just scrapped the view and replaced it with a stored function instead. I applied increased statistics on answer, checkpoint, and homeprofile tables(inreased from 100-500) and I don't notice any execution time decrease unfortunately. :
CREATE OR REPLACE FUNCTION func_checkpoint_aggregated_data(cp_id int) RETURNS
TABLE(cp_date_modified timestamp, cp_status varchar, cp_answer_count int) AS $$
DECLARE
v_date_modified timestamp;
v_answer_count int;
v_cp_status varchar;
v_cp_is_mandatory boolean;
v_cp_acceptable_flag boolean;
ans RECORD;
BEGIN
SELECT is_mandatory FROM checkpoint WHERE id = cp_id INTO v_cp_is_mandatory;
v_answer_count := 0;
v_cp_status := '';
v_cp_acceptable_flag := true;
FOR ans IN SELECT answer.checkpoint_id, answer.date_modified, answer.answer_type, answer_file.id, answer_file.answer_id, answer_file.filename FROM answer LEFT JOIN answer_file ON answer_file.answer_id = answer.id WHERE answer.checkpoint_id = cp_id ORDER BY answer.date_modified LOOP
v_answer_count := v_answer_count + 1;
/* since it's sorted the date_modified will always be the first*/
IF v_answer_count = 1 THEN
v_date_modified := ans.date_modified;
END IF;
IF ans.answer_type IS NULL OR ans.answer_type != 'Acceptable' THEN
v_cp_acceptable_flag := false;
END IF;
END LOOP;
IF v_cp_is_mandatory = true AND v_answer_count = 0 THEN
v_cp_status := 'Needs Answers';
ELSEIF (v_cp_is_mandatory = false OR v_cp_is_mandatory IS NULL) AND v_answer_count = 0 THEN
v_cp_status := 'None';
ELSEIF v_answer_count > 0 AND v_cp_acceptable_flag = true THEN
v_cp_status := 'Acceptable';
ELSE
v_cp_status := 'Not Acceptable';
END IF;
RETURN QUERY SELECT v_date_modified, v_cp_status, v_answer_count;
END;
$$ LANGUAGE plpgsql;
Updated query
SELECT checkpoint.ID, checkpoint.DESCRIPTION,
CASE WHEN category.NAME IS NULL THEN 'NONE' ELSE category.NAME END AS category_name,
task.DESCRIPTION, checkpoint.PHOTOS_REQUIRED,
subcontracting_company.NAME_COMPANY, checkpoint.RISK_VALUE, checkpoint.RISK_FACTOR, checkpoint.RISK_LEGAL,
homeprofile.STREET_LINE1, homeprofile.STATE_CODE, homeprofile.POSTAL_CODE,
project.NAME, checkpoint.ID, func_checkpoint_aggregated_data(checkpoint.ID)
FROM checkpoint
INNER JOIN task ON (checkpoint.TASK_ID=task.ID)
LEFT JOIN category ON (task.SUBCONTRACTOR_CATEGORY_ID=category.ID)
LEFT JOIN subcontracting_company_category ON (checkpoint.SUBCONTRACTOR_CATEGORY_ID=subcontracting_company_category.ID)
LEFT JOIN subcontracting_company ON (subcontracting_company_category.SUBCONTRACTING_COMPANY_ID=subcontracting_company.ID)
LEFT JOIN homeprofile ON (task.HOMEPROFILE_ID=homeprofile.ID)
LEFT JOIN project ON (homeprofile.project_ID=project.ID)
LEFT JOIN (SELECT answer.checkpoint_id FROM answer LEFT JOIN answer_referral ON answer.id = answer_referral.answer_id WHERE answer.answer_type IN ('New','Not Acceptable','Corrected','Acceptable') GROUP BY checkpoint_id) ans_sub ON checkpoint.ID = ans_sub.checkpoint_id
WHERE homeprofile.BUILDING_COMPANY_ID=800 AND homeprofile.DIVISION_ID=1206 AND homeprofile.HOMEPROFILE_STATUS_TYPE_CODE<>'CLOSD' AND (homeprofile.ID IN (48206,45399,48205,48207,48412,47677,48113,48379,48343,47947,45400,48118,48119,47656,47657,47415,47658,47235,48414,48415)
OR homeprofile.project_ID IS NULL)
AND ans_sub.checkpoint_id IS NOT NULL
Query execution time reduced by approx. 80% but it's still not good enough because exec time is 6079 ms,
for the records I'm querying(approx. 50,000+ checkpoints)
Link to explain analyze results:
http://explain.depesz.com/s/lq8
Best Answer
You actually have two different questions here. With respect to the numbers shown in
pg_stat_database
, most of them are cumulative since the last time they were reset, e.g. bypg_stat_reset()
or friends. So, for example, if you frequently query a one-row table you will eventually accumulate high numbers oftup_fetched
andblk_read_time
.