Q2: way to measure page size
PostgreSQL provides a number of Database Object Size Functions. I packed the most interesting ones into this query and added some Statistics Access Functions at the bottom. (The additional module pgstattuple provides even more useful functions; see the example at the end of this answer.)
This is going to show that different methods to measure the "size of a row" lead to very different results. It all depends on what you want to measure, exactly.
This query requires Postgres 9.3 or later. For older versions see below.
Using a VALUES expression in a LATERAL subquery, to avoid spelling out calculations for every row.
Replace public.tbl with your (optionally schema-qualified) table name to get a compact view of collected row size statistics. You could wrap this into a plpgsql function for repeated use, hand in the table name as parameter and use EXECUTE ... (see the sketch after the result below).
SELECT l.metric, l.nr AS bytes
, CASE WHEN is_size THEN pg_size_pretty(nr) END AS bytes_pretty
, CASE WHEN is_size THEN nr / NULLIF(x.ct, 0) END AS bytes_per_row
FROM (
SELECT min(tableoid) AS tbl -- = 'public.tbl'::regclass::oid
, count(*) AS ct
, sum(length(t::text)) AS txt_len -- length in characters
FROM public.tbl t -- provide table name *once*
) x
CROSS JOIN LATERAL (
VALUES
(true , 'core_relation_size' , pg_relation_size(tbl))
, (true , 'visibility_map' , pg_relation_size(tbl, 'vm'))
, (true , 'free_space_map' , pg_relation_size(tbl, 'fsm'))
, (true , 'table_size_incl_toast' , pg_table_size(tbl))
, (true , 'indexes_size' , pg_indexes_size(tbl))
, (true , 'total_size_incl_toast_and_indexes', pg_total_relation_size(tbl))
, (true , 'live_rows_in_text_representation' , txt_len)
, (false, '------------------------------' , NULL)
, (false, 'row_count' , ct)
, (false, 'live_tuples' , pg_stat_get_live_tuples(tbl))
, (false, 'dead_tuples' , pg_stat_get_dead_tuples(tbl))
) l(is_size, metric, nr);
Result:
metric | bytes | bytes_pretty | bytes_per_row
-----------------------------------+----------+--------------+---------------
core_relation_size | 44138496 | 42 MB | 91
visibility_map | 0 | 0 bytes | 0
free_space_map | 32768 | 32 kB | 0
table_size_incl_toast | 44179456 | 42 MB | 91
indexes_size | 33128448 | 32 MB | 68
total_size_incl_toast_and_indexes | 77307904 | 74 MB | 159
live_rows_in_text_representation | 29987360 | 29 MB | 62
------------------------------ | | |
row_count | 483424 | |
live_tuples | 483424 | |
dead_tuples | 2677 | |
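As mentioned above, a plpgsql wrapper for repeated use could look like the following. This is a minimal sketch: the function name f_row_size_stats and the parameter name _tbl are made up for the example, and the VALUES list is abridged; extend it with the remaining metrics from the query above.

CREATE OR REPLACE FUNCTION f_row_size_stats(_tbl regclass)
  RETURNS TABLE (metric text, bytes bigint, bytes_pretty text, bytes_per_row bigint)
  LANGUAGE plpgsql AS
$func$
BEGIN
   -- plpgsql does not substitute variables inside EXECUTE strings,
   -- so the table name is injected via format() and regclass,
   -- which emits a properly quoted, schema-qualified name
   RETURN QUERY EXECUTE format(
   $q$
   SELECT l.metric, l.nr AS bytes
        , CASE WHEN l.is_size THEN pg_size_pretty(l.nr) END
        , CASE WHEN l.is_size THEN l.nr / NULLIF(x.ct, 0) END
   FROM  (
      SELECT min(tableoid) AS tbl
           , count(*) AS ct
           , sum(length(t::text)) AS txt_len
      FROM   %s t
      ) x
   CROSS  JOIN LATERAL (
      VALUES
        (true , 'core_relation_size'               , pg_relation_size(x.tbl))
      , (true , 'table_size_incl_toast'            , pg_table_size(x.tbl))
      , (true , 'total_size_incl_toast_and_indexes', pg_total_relation_size(x.tbl))
      , (true , 'live_rows_in_text_representation' , x.txt_len)
      , (false, 'row_count'                        , x.ct)
      ) l(is_size, metric, nr)
   $q$, _tbl);
END
$func$;

Call it like:

SELECT * FROM f_row_size_stats('public.tbl');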
For older versions (Postgres 9.2 or older):
WITH x AS (
SELECT count(*) AS ct
, sum(length(t::text)) AS txt_len -- length in characters
, 'public.tbl'::regclass AS tbl -- provide table name as string
FROM public.tbl t -- provide table name as name
), y AS (
SELECT ARRAY [pg_relation_size(tbl)
, pg_relation_size(tbl, 'vm')
, pg_relation_size(tbl, 'fsm')
, pg_table_size(tbl)
, pg_indexes_size(tbl)
, pg_total_relation_size(tbl)
, txt_len
] AS val
, ARRAY ['core_relation_size'
, 'visibility_map'
, 'free_space_map'
, 'table_size_incl_toast'
, 'indexes_size'
, 'total_size_incl_toast_and_indexes'
, 'live_rows_in_text_representation'
] AS name
FROM x
)
SELECT unnest(name) AS metric
, unnest(val) AS bytes
, pg_size_pretty(unnest(val)) AS bytes_pretty
, unnest(val) / NULLIF(ct, 0) AS bytes_per_row
FROM x, y
UNION ALL SELECT '------------------------------', NULL, NULL, NULL
UNION ALL SELECT 'row_count', ct, NULL, NULL FROM x
UNION ALL SELECT 'live_tuples', pg_stat_get_live_tuples(tbl), NULL, NULL FROM x
UNION ALL SELECT 'dead_tuples', pg_stat_get_dead_tuples(tbl), NULL, NULL FROM x;
Same result.
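For comparison, the additional module pgstattuple mentioned at the top inspects the physical relation directly. A minimal sketch, assuming the extension can be installed and public.tbl again stands in for your table:

CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- reports physical table length, live / dead tuple count and length,
-- and free space for the given relation
SELECT * FROM pgstattuple('public.tbl');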
Q1: anything inefficient?
You could optimize the column order to save some bytes per row that are currently wasted on alignment padding:
integer                  | not null default nextval('core_page_id_seq'::regclass)
integer                  | not null default 0
character varying(255)   | not null
character varying(64)    | not null
text                     | default '{}'::text
character varying(255)   |
text                     | default '{}'::text
text                     |
timestamp with time zone |
timestamp with time zone |
integer                  |
integer                  |
This saves between 8 and 18 bytes per row. I call it "Column Tetris".
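To plan the reordering, you can look up each column's storage size and alignment requirement in the system catalogs; a small sketch, assuming the table is public.tbl:

SELECT a.attname, t.typname, t.typlen, t.typalign
FROM   pg_attribute a
JOIN   pg_type     t ON t.oid = a.atttypid
WHERE  a.attrelid = 'public.tbl'::regclass
AND    a.attnum > 0
AND    NOT a.attisdropped
ORDER  BY a.attnum;

-- typalign: 'd' = 8-byte, 'i' = 4-byte, 's' = 2-byte, 'c' = 1-byte alignment
-- typlen:   -1 marks variable-length types (text, varchar, ...)

As a rule of thumb, placing fixed-width 8-byte columns first, then 4-byte, 2-byte and 1-byte ones, with variable-length types at the end, minimizes padding.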
Best Answer
You can create a trigger that checks the number of rows in your destination table whenever an INSERT occurs.
The trigger function would look like this:
CREATE OR REPLACE FUNCTION check_number_of_row()
  RETURNS trigger AS
$body$
BEGIN
   -- replace 100 by the number of rows you want
   IF (SELECT count(*) FROM your_table) > 100 THEN
      RAISE EXCEPTION 'INSERT statement exceeding maximum number of rows for this table';
   END IF;

   RETURN NEW;  -- a BEFORE trigger must return NEW for the INSERT to proceed
END;
$body$ LANGUAGE plpgsql;
And the trigger would look like this:
CREATE TRIGGER tr_check_number_of_row
BEFORE INSERT ON your_table
FOR EACH ROW EXECUTE PROCEDURE check_number_of_row();
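To test it, re-inserting an existing row is schema-independent (assuming no unique constraints get in the way); once the row count exceeds the threshold, the trigger raises the exception:

INSERT INTO your_table
SELECT * FROM your_table
LIMIT  1;
-- ERROR:  INSERT statement exceeding maximum number of rows for this table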
As far as I know there is no other way to cap table size in PostgreSQL, but you can restrict the number of rows in a table this way.