Since I like replacing aggregate functions by old-fashioned self-joins and NOT EXISTS clauses, here is my attempt:
SET search_path='tmp';
DROP TABLE tmp.changes CASCADE;
CREATE TABLE tmp.changes
( id integer NOT NULL PRIMARY KEY
, fullname varchar
, issuer varchar
, rsymbol varchar
, industry varchar
, activity INTEGER NOT NULL
, shareschange FLOAT
, sharespchange FLOAT
, mfiled FLOAT
);
-- lacking information from the OP
-- I can only presume a flat distribution.
INSERT INTO tmp.changes(id, activity, shareschange,sharespchange,mfiled )
SELECT nm.*
, (random() *20)::integer -- mfiled
, random() *10000
, random() *100
, random() *100000
FROM generate_series(1,1000000) nm
;
ALTER TABLE tmp.changes
ALTER shareschange
SET STATISTICS 1000
;
ALTER TABLE tmp.changes
ALTER mfiled
SET STATISTICS 1000
;
VACUUM ANALYZE tmp.changes
;
CREATE INDEX changes_mfiled_shareschange
ON tmp.changes(mfiled,shareschange)
;
EXPLAIN ANALYZE
SELECT initcap(ch.fullname) AS some_name1
, initcap(ch.issuer) AS some_name2
, upper(ch.rsymbol) AS some_name3
, initcap(ch.industry) AS some_name4
, ch.activity
, to_char(ch.shareschange,'FM9,999,999,999,999,999') AS some_name5
, ch.sharespchange || '%' AS some_name6
FROM changes ch
WHERE ch.activity IN (4,5)
-- NOTE: the subquery is *not* correlated.
-- [I had expected a subselect of nx.activity IN (4,5)
-- like in the main query. ]
AND NOT EXISTS (SELECT * FROM changes nx
WHERE nx.mfiled > ch.mfiled
)
ORDER BY ch.shareschange ASC
LIMIT 15
;
It helps to format a query properly to see what's going on. I studied your query and found suspicious SQL:
CREATE VIEW "Swiss_client_wise_minutes_and_profit" AS
SELECT ROW_NUMBER() OVER (ORDER BY rp.country, rp.destination) AS id
, (SELECT company_id
FROM "Corporate_companyalias" AS cc
WHERE cc."Alias_name" = client_name_id) AS client_name
, (SELECT company_id
FROM "Corporate_companyalias" AS cc
WHERE cc."Alias_name" = vendor_name_id) AS vendor_name
, cs.c_prefix_id AS c_prefix
, cs.v_prefix_id AS v_prefix
, rp.country
, rp.destination
, cs.c_total_calls
, cs.v_total_calls
, cs.successful_calls
, cs.billed_duration
, cs.v_billed_amount AS cost
, cs.c_billed_amount AS revenue
, cs.c_pdd AS pdd
, cs.profit
, cs.start_time
, cs.end_time
, cs.switch_name
FROM "Swiss_buz_scdr" AS cs
LEFT JOIN "Corporate_companyalias" AS cc ON cs.client_name_id = cc."Alias_name"
LEFT JOIN "RateManagement_prefix_and_client_wise_destinationgroup" AS rp
ON rp.client_name = cc.company_id AND rp.prefix = cs.c_prefix_id
WHERE cs.customer_name = 'SSP Root';
Don't use the same table alias cc
in the outer and inner SELECT
. While that's not illegal, it helps to confuse you.
Without table qualification for the reference to the outer query I am not sure, where the columns client_name_id
and vendor_name_id
bind. Would need the table definitions to know, but I suspect it results in CROSS JOIN
s - which is probably not what you intended and the root of the problem.
I suspect the correlated subqueries can be rewritten as plain expressions. Maybe it needs another JOIN
. Here is my ...
Educated guess what you actually want:
CREATE VIEW "Swiss_client_wise_minutes_and_profit" AS
SELECT ROW_NUMBER() OVER (ORDER BY r.country, r.destination) AS id
, c.company_id AS client_name
, v.company_id AS vendor_name
, s.c_prefix_id AS c_prefix
, s.v_prefix_id AS v_prefix
, r.country
, r.destination
, s.c_total_calls
, s.v_total_calls
, s.successful_calls
, s.billed_duration
, s.v_billed_amount AS cost
, s.c_billed_amount AS revenue
, s.c_pdd AS pdd
, s.profit
, s.start_time
, s.end_time
, s.switch_name
FROM "Swiss_buz_scdr" s
LEFT JOIN "Corporate_companyalias" c ON c."Alias_name" = s.client_name_id
LEFT JOIN "RateManagement_prefix_and_client_wise_destinationgroup" r
ON r.client_name = c.company_id AND r.prefix = s.c_prefix_id
LEFT JOIN "Corporate_companyalias" v ON v."Alias_name" = s.vendor_name_id
WHERE s.customer_name = 'SSP Root';
Aside: I would aim for shorter names than "RateManagement_prefix_and_client_wise_destinationgroup"
. And preferably legal, lower-case names that don't need double-quotes.
Best Answer
At the key node, it is expecting 83 rows but finding 46,547. Unless your statistics are way out of date, that means there is a correlation between inventory_id and location_id. You could try to CREATE STATISTICS on dependencies between those two columns, but it not really clear what it would do instead if it had correct estimates. Maybe a hash join to the order_items table, rather than poorly cached nested loop, but that would depend on how big the table is.
A more direct route to address this would be to create an index on
(inventory_id, location_id, id DESC)
. This would allow it to jump in the index to the correct values of inventory_id and location_id, and walk it in "id" order until it collects 10 rows. The "DESC" is probably not necessary, you could try without it and see if it works. You could also add "reference_type" to the index just before "id", but it would only be a minor improvement and might also make the index less general for application to other queries.