The default Postgresql configuration will be used.
The default PostgreSQL configuration is meant to be adequate for smallish servers. If you are worried about scalability, which you seem to be, then restricting yourself to using the default configuration is going to be a poor plan.
WHERE p.name ILIKE ANY(potential_parts)
You are probably going to want a trgm_ops index on p.name
, and similarly on T.name
You might also need to add user-space code which prevents people from searching on things of less than 3 or 4 characters.
We can consider that each table will have more than a million of row.
Make a million rows of dummy data, and see what happens. No use speculating when you can just do it.
My first question would be to ask why you're actually using incremental in the first place. Here's an answer that I posted regarding incremental statistics, a blog post by Erin Stellato that illuminates one of the primary complaints and pitfalls with incremental statistics (they're not used at the partition level by the optimizer), and two blog posts by me that work through evaluating any potential use case for incremental statistics.
Having said that, to know when a partition's stats have been sampled you can use an undocumented DMF (sys.dm_db_stats_properties_internal()
) to get partition-level information. I have a comment on this blog post that describes how to understand the hierarchy at a fairly high level.
select
sysdatetime(),
schema_name = sh.name,
table_name = t.name,
stat_name = s.name,
index_name = i.name,
leading_column = index_col(quotename(sh.name)+'.'+quotename(t.name),s.stats_id,1),
s.stats_id,
parition_number = isnull(sp.partition_number,1),
s.has_filter,
s.is_incremental,
s.auto_created,
sp.last_updated,
sp.rows,
sp.rows_sampled,
sp.unfiltered_rows,
modification_counter = coalesce(sp.modification_counter, n1.modification_counter)
from sys.stats s
join sys.tables t
on s.object_id = t.object_id
join sys.schemas sh
on t.schema_id = sh.schema_id
left join sys.indexes i
on s.object_id = i.object_id
and s.name = i.name
cross apply sys.dm_db_stats_properties_internal(s.object_id, s.stats_id) sp
outer apply sys.dm_db_stats_properties_internal(s.object_id, s.stats_id) n1
where n1.node_id = 1
and (
(is_incremental = 0)
or
(is_incremental = 1 and sp.partition_number is not null)
)
and t.name = '<<TABLENAME>>'
and s.name like '<<STATNAME>>%'
order by s.stats_id,isnull(sp.partition_number,1);
Best Answer
Yes, use the statistics collector. The manual:
Bold emphasis mine.
So this is off by default. If all relevant functions are
LANGUAGE plpgsql
, it's enough to set it topl
inpostgresql.conf
and reload. No need to restart the server for this setting.Reload with
pg_ctl reload
. Or on a typical Debian(-related) installation:Postgres starts to gather information. After an adequate period of time, check track records. Careful though, some important functions may not be called in weeks. Your call ...
How to access gathered statistics? The manual again:
Basically:
Or you can easily look it up with pgAdmin (or another GUI). In pgAdmin (3 or 4), select the function and look at the "Statistics" tab top right: "Number of Calls".
Aside:
There are no true "stored procedures" before Postgres 11, just functions. The term "procedure" is often used synonymously.