I've never found a recommendation to run Postgresql on a privileged port in production. What type of port should be used in production regarding security and best practices?
PostgreSQL – Should Privileged Ports Be Used in Production?
linuxpostgresql
Related Solutions
Strictly speaking, the term "stored procedures" points to SQL procedures in Postgres, introduced with Postgres 11. Related:
There are also functions, doing almost but not quite the same, and those have been there from the beginning.
Functions with LANGUAGE sql
are basically just batch files with plain SQL commands in a function wrapper (and therefore atomic, always run inside a single transaction) accepting parameters. All statements in an SQL function are planned at once, which is subtly different from executing one statement after the other and may affect the order in which locks are taken.
For anything more, the most mature language is PL/pgSQL (LANGUAGE plpgsql
). It works well and has been improved with every release over the last decade, but it serves best as glue for SQL commands. It is not meant for heavy computations (other than with SQL commands).
PL/pgSQL functions execute queries like prepared statements. Re-using cached query plans cuts off some planning overhead and makes them a bit faster than equivalent SQL statements, which may be a noticeable effect depending on circumstances. It may also have side effects like in this related question:
This carries the advantages and disadvantages of prepared statements - as discussed in manual. For queries on tables with irregular data distribution and varying parameters dynamic SQL with EXECUTE
may perform better when the gain from an optimized execution plan for the given parameter(s) outweighs the cost of re-planning.
Since Postgres 9.2 generic execution plans are still cached for the session but, quoting the manual:
This occurs immediately for prepared statements with no parameters; otherwise it occurs only after five or more executions produce plans whose estimated cost average (including planning overhead) is more expensive than the generic plan cost estimate.
We get best of both worlds most of the time (less some added overhead) without (ab)using EXECUTE
. Details in What's new in PostgreSQL 9.2 of the PostgreSQL Wiki.
Postgres 12 introduces the additional server variable plan_cache_mode
to force generic or custom plans. For special cases, use with care.
You can win big with server side functions that prevent additional round-trips to the database server from your application. Have the server execute as much as possible at once and only return a well defined result.
Avoid nesting of complex functions, especially table functions (RETURNING SETOF record
or TABLE (...)
). Functions are black boxes posing as optimization barriers to the query planner. They are optimized separately, not in the context of the outer query, which makes planning simpler, but may result in less than perfect plans. Also, cost and result size of functions cannot be predicted reliably.
The exception to this rule are simple SQL functions (LANGUAGE sql
), which can be "inlined" - if some preconditions are met. Read more about how the query planner works in this presentation by Neil Conway (advanced stuff).
In PostgreSQL a function always automatically runs inside a single transaction. All of it succeeds or nothing. If an exception occurs, everything is rolled back. But there is error handling ...
That's also why functions are not exactly "stored procedures" (even though that term is used sometimes, misleadingly). Some commands like VACUUM
, CREATE INDEX CONCURRENTLY
or CREATE DATABASE
cannot run inside a transaction block, so they are not allowed in functions. (Neither in SQL procedures, yet, as of Postgres 11. That might be added later.)
I have written thousands of plpgsql functions over the years.
If you only have a few variables I would consider keeping separate boolean
columns.
- Indexing is easy. In particular, indexes on expressions are easy.
- Conditions for queries and partial indexing are easy to write and read and meaningful.
- A boolean column occupies 1 byte. For only a few variables this occupies the least space.
- Unlike the other options boolean columns allow
NULL
values for individual bits if you should need that. You can always define columnsNOT NULL
if you don't.
Optimizing storage
If you have more than a hand full variables but less than 33, an integer
column may serve you best. (Or a bigint
for up to 64 variables.)
- Occupies 4 bytes on disk.
- Very fast indexing for exact matches (
=
operator). - Handling individual values may be slower / less convenient than with
bit string
orboolean
.
With even more variables, or if you want to manipulate the values a lot, or if you don't have huge tables and disk space / RAM is no issue, or if you are not sure what to pick, I would consider bit(n)
or bit varying(n)
.
- Occupies at least 5 bytes (or 8 for very long strings) plus 1 byte for each group of 8 bits (rounded up).
- You can use bit string functions and operators directly.
Examples
For just 3 bits of information, individual boolean
columns get by with 3 bytes, an integer
needs 4 bytes and a bit string
6 bytes (5 + 1).
For 32 bits of information, an integer
still needs 4 bytes, a bit string
occupies 9 bytes for the same (5 + 4) and boolean
columns occupy 32 bytes.
Further reading
Related Question
- Postgresql – How should I tune Postgresql for 20 GB of RAM
- Linux kernel parameters for oracle
- PostgreSQL – Update Staging Data with Production Data
- PostgreSQL – Fastest Query for Selecting Arrays with Duplicates
- Postgresql – How to change Postgresql to be listening on port 5432
- Postgresql: Streaming Replication – determining wal sender process client port
Best Answer
Running PostgreSQL under 1024 requires some hacking. It's almost impossible outside of win32. From the
backend/main/main.c
,After which the backend calls
exit(1)
. It also doesn't run as asetuid
script. From the source,The only way to even set this up on Linux that I know of is
setcap
withcap_net_bind_service
capability (never tried it)However, if you are worried about server spoofing (another server taking over the non-privilege connection before the PG executable) you can still be secure if you take some pre-cautions, as explained in Preventing Server Spoofing:
requirepeer
to specify the required owner of the server process connected to the socket.