Doing a query like this:
SELECT fruit, avg(cost) FROM fruit_table GROUP BY fruit
results in a query plan like this:
Sort (cost=3713873.41..3713874.46 rows=420 width=16)
Output: fruit, (avg(cost))
Sort Key: fruittable.fruit
-> HashAggregate (cost=3713849.87..3713855.12 rows=420 width=16)
Output: fruit, (avg(cost))
-> Seq Scan on public.fruit (cost=0.00..3677849.91 rows=7199991 width=16)
Output: fruit, weight, cost, ...
The result is correct, but ideally, for performance reasons I would like to have weight omitted in the output of index/seq scan – what am I doing wrong?
This is the output of EXPLAIN ANALYZE VERBOSE:
Sort (cost=3713873.41..3713874.46 rows=420 width=16) (actual time=31156.940..31156.958 rows=612 loops=1)
Output: fruit, (avg(cost))
Sort Key: fruittable.fruit
Sort Method: quicksort Memory: 53kB
-> HashAggregate (cost=3713849.87..3713855.12 rows=420 width=16) (actual time=31156.272..31156.729 rows=612 loops=1)
Output: fruit, avg(cost)
-> Seq Scan on public.fruittable (cost=0.00..3677849.91 rows=7199991 width=16) (actual time=0.004..15050.636 rows=7199991 loops=1)
Output: fruit, weight, cost, ...
Total runtime: 31157.097 ms
1. EDIT
I run the following queries based on Craig's comment:
SELECT
pg_stat_get_live_tuples('fruittable'::regclass),
pg_stat_get_dead_tuples('fruittable'::regclass),
pg_relation_size('fruittable'::regclass),
pg_total_relation_size('fruittable'::regclass);
-- OUTPUT:
pg_stat_get_live_tuples: 0
pg_stat_get_dead_tuples: 0
pg_relation_size: 29539123200
pg_total_relation_size: 32118743040
and
SELECT avg(pg_column_size(x)) FROM (SELECT * FROM fruittable LIMIT 1000) AS x;
-- OUTPUT
avg: 1354.68
2. EDIT
This is the table definition
CREATE TABLE x_2009
(
c_a character varying(70),
c_b character varying(70),
c_c character varying(70),
c_d character varying(70),
c_e character varying(70),
c_f double precision,
v_g character varying(70),
v_h character varying(70),
c_i double precision,
c_j double precision,
d_k character varying(70),
d_l character varying(70),
c_m character varying(70),
c_n character varying(70),
d_o character varying(70),
d_p character varying(70),
d_q character varying(70),
v_r character varying(70),
c_s character varying(70),
v_t double precision,
v_u double precision,
c_v character varying(70),
v_w double precision,
v_x double precision,
v_y double precision,
v_z double precision,
c_aa double precision,
v_ab double precision,
v_ac double precision,
v_ad double precision,
v_ae double precision,
v_af double precision,
v_ag double precision,
v_ah double precision,
v_ai double precision,
v_aj double precision,
v_ak double precision,
v_al double precision,
v_am double precision,
v_an double precision,
v_ao double precision,
v_ap double precision,
v_aq double precision,
v_ar double precision,
v_as double precision,
v_at double precision,
v_au double precision,
v_av double precision,
v_aw double precision,
v_ax double precision,
v_ay double precision,
v_az_cost double precision,
v_ba double precision,
v_bb double precision,
v_bc double precision,
v_bd double precision,
v_be double precision,
v_bf double precision,
v_bg double precision,
v_bh double precision,
v_bi double precision,
v_bj double precision,
v_bk double precision,
v_bl double precision,
v_bm double precision,
v_bn double precision,
v_bo double precision,
v_bp double precision,
v_bq double precision,
v_br double precision,
v_bs double precision,
v_bt double precision,
v_bu double precision,
v_bv double precision,
v_bw double precision,
v_bx double precision,
v_by double precision,
v_bz double precision,
v_ca double precision,
v_cb double precision,
v_cc double precision,
v_cd double precision,
v_ce double precision,
v_cf double precision,
v_cg double precision,
v_ch double precision,
v_ci double precision,
v_cj double precision,
v_ck double precision,
v_cl double precision,
v_cm double precision,
v_cn double precision,
v_co double precision,
v_cp double precision,
v_cq double precision,
v_cr double precision,
v_cs double precision,
v_ct double precision,
v_cu double precision,
v_cv double precision,
v_cw double precision,
v_cx double precision,
v_cy double precision,
v_cz double precision,
v_da double precision,
v_db double precision,
v_dc double precision,
v_dd double precision,
v_de double precision,
v_df double precision,
v_dg double precision,
v_dh double precision,
v_di double precision,
v_dj double precision,
v_dk double precision,
v_dl double precision,
c_dm character varying(70),
c_dn double precision,
v_do double precision,
v_dp double precision,
v_dq double precision,
v_dr double precision,
v_ds double precision,
v_dt double precision,
v_du double precision,
v_dv double precision,
v_dw double precision,
v_dx double precision,
v_dy double precision,
v_dz double precision,
v_ea double precision,
v_eb double precision,
v_ec double precision,
v_ed double precision,
v_ee double precision,
v_ef double precision,
v_eg double precision,
v_eh double precision,
v_ei double precision,
v_ej double precision,
v_ek double precision,
v_el double precision,
v_em double precision,
v_en double precision,
v_eo double precision,
v_ep double precision,
v_eq double precision,
v_er double precision,
v_es double precision,
v_et double precision,
v_eu double precision,
v_ev double precision,
v_ew double precision,
v_ex character varying(70),
c_ey character varying(70),
c_ez character varying(70),
c_fa character varying(70),
c_fb character varying(70),
d_fc character varying(70),
v_fd character varying(70),
v_fe double precision,
v_ff double precision,
v_fg double precision,
v_fh double precision,
v_fi double precision,
v_fj double precision,
c_fk double precision,
c_fl character varying(70),
v_fm character varying(70)
)
WITH (
OIDS=FALSE
);
ALTER TABLE x_2009
OWNER TO xxx;
GRANT ALL ON TABLE x_2009 TO xxx;
-- Index: x_2009_idx_c_a
CREATE INDEX x_2009_idx_c_a
ON x_2009
USING btree
(c_a);
-- Index: x_2009_idx_c_a_and_v_g
CREATE INDEX x_2009_idx_c_a_and_v_g
ON x_2009
USING btree
(v_g, c_a);
-- Index: x_2009_idx_v_g
CREATE INDEX x_2009_idx_v_g
ON x_2009
USING btree
(v_g);
Best Answer
"Seq scans" means full table scan. It's always going to read all of the columns on your table. It's a full table scan. PostgreSQL uses row-based storage, so it can't just read one column at a time.
The reason your query takes so long is that you're reading 7.2 million rows in the table and aggregating them. Apparently, on a fairly slow computer.