Actually, since there can be any number (n) of values for a single record of table first, the number of columns to return is not exactly arbitrary. There is a maximum number of columns, and this has a clean solution - unless you have more columns than Postgres allows for a table: 250 - 1600, depending on column types.
BTW, typically you would also have a third table listing all possible values of value, the whole of it implementing a classical many-to-many relationship.
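For illustration only, a minimal sketch of such a design, building on the setup below (the val table and the constraints are my additions, not part of the question):

CREATE TABLE val (value "char" PRIMARY KEY);   -- dictionary of all possible values

ALTER TABLE first  ADD PRIMARY KEY (id);       -- FK target must be unique
ALTER TABLE second ADD FOREIGN KEY (fid)   REFERENCES first(id);
ALTER TABLE second ADD FOREIGN KEY (value) REFERENCES val(value);  -- populate val with 'a'..'f' first

With that in place, second acts as the classical m:n link table between first and val.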
You can use CASE statements or, more elegantly, the crosstab() function of the additional module tablefunc. If you are unfamiliar with it, read the basic instructions here first:
Columns to the right of actual values are filled with NULL.
Assuming a maximum of 5 possible values and building on this setup:
CREATE TABLE first (id int, name text);
CREATE TABLE second (id int, value "char", fid int);
INSERT INTO first VALUES
(1,'Ahmad')
, (2,'Sami')
, (3,'Khan')
, (4,'Nobody'); -- Added to demonstrate difference
INSERT INTO second VALUES
(1,'a',1)
, (2,'b',1)
, (3,'c',2)
, (4,'d',1)
, (5,'e',2)
, (6,'f',3);
Either use crosstab(text) (1-parameter form) and join to table first another time:
SELECT id, f.name, value1, value2, value3, value4, value5
FROM crosstab(
'SELECT f.id, 1 AS dummy_category, s.value
FROM first f
JOIN second s ON s.fid = f.id
ORDER BY f.id, s.value'
) ct (id int
, value1 "char", value2 "char", value3 "char", value4 "char", value5 "char")
JOIN first f USING (id);
Or use crosstab(text, text) (2-parameter form) and generate dummy categories for your values:
SELECT *
FROM crosstab(
'SELECT f.id, f.name
, row_number() OVER (PARTITION BY f.id ORDER BY s.value) AS dummy_category
, s.value
FROM first f
JOIN second s ON s.fid = f.id
ORDER BY f.id, s.value'
, 'SELECT generate_series(1,5)'
) ct (id int, name text
, value1 "char", value2 "char", value3 "char", value4 "char", value5 "char");
Result is the same either way:
id | name | value1 | value2 | value3 | value4 | value5
----+-------+--------+--------+--------+--------+--------
1 | Ahmad | a | b | d | |
2 | Sami | c | e | | |
3 | Khan | f | | | |
If you want to include all rows from table first, make it a LEFT [OUTER] JOIN:
...
LEFT JOIN second s ON s.fid = f.id
...
Then we get one additional result row for the above example:
4 | Nobody | | | | |
I'd write the query like this, using LIMIT 3 instead of DISTINCT ON.

The generate_series(0, 9) is used to get all the distinct bins. You could use (SELECT DISTINCT line % 10 FROM foo) AS g (bin) instead, if the "bins" are not all the integers from 0 up to 9:
SELECT
g.bin,
ROW_NUMBER() OVER (PARTITION BY g.bin ORDER BY d.x) AS i,
d.*
FROM
generate_series(0, 9) AS g (bin),
LATERAL
( SELECT f.*, random() x
FROM foo AS f
WHERE f.line % 10 = g.bin
ORDER BY x
LIMIT 3
) AS d
ORDER BY
bin, x ;
Also, if you don't need the random() number in the output, you could use ORDER BY random() in the subquery and remove x from the SELECT and ORDER BY clauses - or replace ORDER BY d.x with ORDER BY d.line.
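For illustration, a sketch of that variant (same foo table with its line column as above):

SELECT
    g.bin,
    ROW_NUMBER() OVER (PARTITION BY g.bin ORDER BY d.line) AS i,
    d.*
FROM
    generate_series(0, 9) AS g (bin),
    LATERAL
    ( SELECT f.*
      FROM foo AS f
      WHERE f.line % 10 = g.bin
      ORDER BY random()     -- random order decides which 3 rows survive
      LIMIT 3
    ) AS d
ORDER BY
    bin, d.line ;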
Best Answer
Interesting question - which has many possibilities/permutations (this answer has been extensively revised).
Basically, this problem can be divided into two main streams.
A single random record
Multiple random records (not in the question - see reference and discussion at bottom)
Having researched this, I believe that the fastest solution to the single record problem is via the tsm_system_rows extension to PostgreSQL, provided by Evan Carroll's answer.

If you're using a binary distribution, I'm not sure, but I think that the contrib modules (of which tsm_system_rows is one) are available by default - at least they were for the EnterpriseDB Windows version I used for my Windows testing (see below). My main testing was done on 12.1 compiled from source on Linux (make world and make install-world).

The reason why I feel that it is best for the single record use case is that the only problem mentioned concerning this extension is that:
however, since you are only interested in selecting 1 row, the block-level clustering effect should not be an issue. This article from 2ndQuadrant shows why this shouldn't be a problem for a sample of one record! It is a major problem for small subsets (see end of post) - OR if you wish to generate a large sample of random records from one large table (again, see the discussion of tsm_system_rows and tsm_system_time below).

Then I created and populated a table like this:
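The original DDL isn't reproduced here; a minimal sketch consistent with the later description (an INTEGER seq plus a 16-byte UUID - the column name ruuid is my assumption):

CREATE TABLE rand AS
SELECT generate_series(1, 100000000) AS seq    -- 100 million records
     , md5(random()::text)::uuid AS ruuid;     -- 32 random hex chars cast to UUID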
So, I now have a table with 100,000,000 (100 million) records. Then I added a PRIMARY KEY:
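Presumably something like (a sketch, assuming the seq column above):

ALTER TABLE rand ADD PRIMARY KEY (seq);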
So, now to SELECT random records. Notice that I have used a slightly modified command so that I could "see" the randomness - I also set \timing so that I could get empirical measurements - and I used the LENGTH() function so that I could readily perceive the size of the PRIMARY KEY integer being returned:
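The exact command isn't shown in the original; a sketch of the sort of thing described (ruuid is my assumed column name from the setup sketch above):

CREATE EXTENSION IF NOT EXISTS tsm_system_rows;

\timing on

SELECT LENGTH(seq::text) AS len, seq, ruuid
FROM rand TABLESAMPLE SYSTEM_ROWS(1);   -- sample exactly 1 row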
Here is a sample of records returned:

So, as you can see, the LENGTH() function returns 6 most of the time - this is to be expected, as most records will be between 10,000,000 and 100,000,000, but there are a couple which show a value of 5 (I have also seen values of 3 & 4 - data not shown).

Now, notice the timings. The first is 30 milliseconds (ms) but the rest are sub-millisecond (approx. 0.6 - 0.7 ms). Most of the random samples are returned in this sub-millisecond range, but there are results returned in 25 - 30 ms (1 in 3 or 4 on average).
From time to time, this multi-millisecond result can occur twice or even three times in a row but, as I said, the majority of results (approx. 66 - 75%) are sub-millisecond. None of the response times for my solution that I have seen have been in excess of 75 ms.
During my research I also discovered the tsm_system_time extension, which is similar to tsm_system_rows. Now, I also benchmarked this extension as follows:
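The benchmark command isn't reproduced in the original; a sketch along the lines of the SYSTEM_ROWS one above (the parameter is in milliseconds):

CREATE EXTENSION IF NOT EXISTS tsm_system_time;

SELECT LENGTH(seq::text) AS len, seq, ruuid
FROM rand TABLESAMPLE SYSTEM_TIME(0.001);   -- 0.001 ms = 1 microsecond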
Note that the time quantum is 1/1000th of a millisecond, which is a microsecond - if any number lower than this is entered, no records are returned. However, interestingly, even this tiny quantum always returns 120 rows.
Quite why it's 120 is a bit above my pay grade - the PostgreSQL page size is 8192 (the default) and the file system block size is 4096.

A record should be (1 INTEGER (4 bytes) + 1 UUID (16 bytes)) (= 20 bytes) + the index on the seq field (size?). 4096/120 = 34.1333... - I hardly think that each index entry for this table takes 14 bytes - so where the 120 comes from, I'm not sure.

I'm not quite sure if the LIMIT clause will always return the first tuple of the page or block - thereby introducing an element of non-randomness into the equation.

The performance of the tsm_system_time query is identical (AFAICS - data not shown) to that of the tsm_system_rows extension. The same caveat about not being sure whether there is an element of non-randomness introduced by how these extensions choose their first record also applies to the tsm_system_rows queries. See the discussion and bench-testing of the (so-called) randomness of these two methods below.

With respect to performance, just for reference, I'm using a Dell Studio 1557 with a 1TB HDD (spinning rust) and 8GB of DDR3 RAM running Fedora 31. This is a 10-year-old machine!
I also did the same thing on a machine (Packard Bell, EasyNote TM - also 10 years old, 8GB DDR3 RAM running Windows 2019 Server) that I have with an SSD (SSD not top of the range by any means!) and the response times are typically (strangely enough) a bit higher (~ 1.3 ms), but there are fewer spikes and the values of these are lower (~ 5 - 7 ms).
There could well be a lot of stuff running in the background with 2019 Server - but if you have a modern laptop with a decent SSD, there's no reason that you can't expect sub-millisecond response times as a matter of course!
All tests were run using PostgreSQL 12.1.
To check out the true "randomness" of both methods, I created the following table:
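The table definition isn't shown in the original; presumably it mirrors the rand table, e.g. (a sketch):

CREATE TABLE rand_samp (seq int, ruuid uuid);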
and then ran (3 times each):
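The function isn't reproduced in the original; a sketch of a 10,000-iteration sampling loop consistent with the counts discussed below:

DO $$
BEGIN
  FOR i IN 1 .. 10000 LOOP
    INSERT INTO rand_samp (seq, ruuid)
    SELECT seq, ruuid
    FROM rand TABLESAMPLE SYSTEM_ROWS(1);   -- one random record per iteration
  END LOOP;
END
$$;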
and also using (in the inner loop of the above function)
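i.e. (again a sketch) the INSERT switched to the time-based sampling method:

    INSERT INTO rand_samp (seq, ruuid)
    SELECT seq, ruuid
    FROM rand TABLESAMPLE SYSTEM_TIME(0.001);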
Then after each run, I queried my rand_samp table and got the following counts:
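The counting query isn't reproduced in the original; a duplicate check along these lines (a sketch):

SELECT seq, COUNT(*)
FROM rand_samp
GROUP BY seq
HAVING COUNT(*) > 1          -- only show duplicated keys
ORDER BY COUNT(*) DESC;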
For TABLESAMPLE SYSTEM_ROWS, I got 258, 63 and 44 dupes, all with a count of 2. For TABLESAMPLE SYSTEM_TIME, I got 46, 54 and 62, again all with a count of 2.

Now, my stats are a bit rusty, but from a random sample of 10,000 taken from a table of 100M records (1 ten-thousandth of the number of records in the rand table), I'd expect a couple of duplicates from time to time, but nothing like the numbers I obtained. Furthermore, if there were true randomness, I'd expect (a small number of) 3's and 4's also.

I ran two tests with 100,000 runs using TABLESAMPLE SYSTEM_ROWS and obtained 5540 dupes (~ 200 with 3 dupes and 6 with 4 dupes) on the first run, and 5465 dupes on the second (~ 200 with 3 and 6 with 4). The most interesting query, however, was the one comparing the dupes in both runs of 100,000 with respect to each other: a whopping 11,250 (> 10%) are the same - which for a sample of 1 thousandth (1/1000) is WAY too much to be down to chance!
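The comparison query isn't reproduced in the original; a sketch, assuming the two runs were saved to rand_samp and a hypothetical rand_samp2:

SELECT COUNT(*)
FROM (SELECT DISTINCT seq FROM rand_samp)  AS r1
JOIN (SELECT DISTINCT seq FROM rand_samp2) AS r2 USING (seq);  -- keys seen in both runs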
Results of 100,000 runs for SYSTEM_TIME - 5467 dupes, 215 with 3 and 9 with 4 in the first group; 5472, 210 (3) and 12 (4) in the second. The number of matching records is 11,328 (again > 10%).

There's clearly (a LOT of) non-random behaviour going on. I'll leave it to the OP to decide if the speed/random trade-off is worth it or not!
Benchmark of other answers.
I decided to benchmark the other proposed solutions - using my 100 million record table from above. I ran all tests 5 times - ignoring any outliers at the beginning of any series of tests to eliminate cache/whatever effects. All the outlier values were higher than those reported below.
I'm using the machine with the HDD - I will test with the SSD machine later. The .mmm reported means milliseconds - not significant for any answer but my own.

Daniel Vérité's answer:
Ran 5 times - all times were over a minute - typically 01:00.mmm (1 at 01:05.mmm).
Typical run:
Swav's answer:
Ran 5 times - all times were over a minute - from 01:03 to 01:29
Typical run:
Colin 't Hart's answer:
Ran 5 times - times varied between 00:06.mmm and 00:14.mmm (Best of the Rest!)
Typical run:
Colin 't Hart's 2nd answer (adapted by me):
Response time is between ~ 30 - 45 ms, with the odd outlier on either side of those times - it can even drop to 1.xxx ms from time to time. All I can really say is that it appears to be more consistent than either of the SYSTEM_TIME and SYSTEM_ROWS methods.

There is a major problem with this method, however: if the underlying field that one is choosing for randomness is sparse, then this method won't return a value all of the time - which may or may not be acceptable to the OP. You can do something like this (end of query):
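A sketch of the idea (the original only shows the tail of the query; the ORDER BY is my addition to make "first row at or above the random key" explicit):

SELECT *
FROM rand
WHERE seq >= FLOOR(random() * 100000000)::INT   -- ::INT keeps the comparison an integer one
ORDER BY seq
LIMIT 1;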
(note >= and LIMIT 1). This can be very efficient (1.xxx ms), but seems to vary more than just the seq = ... formulation - but once the cache appears to be warmed up, it regularly gives response times of ~ 1.5 ms.

Another advantage of this solution is that it doesn't require any special extensions which, depending on the context (consultants not being allowed to install "special" tools, DBA rules...), may not be available.
One really WEIRD thing about the above solution is that if the ::INT CAST is removed, the query takes ~ 1 minute. This happens even though the FLOOR function should return an INTEGER. I only discovered that this was an issue by running EXPLAIN (ANALYZE, BUFFERS).

With ::INT
Without ::INT
Note the (without ::INT) Parallel Seq Scan (with a high cost), filter on (seq)::double precision (WHY double??).

And compared to (with ::INT):

Finally, my own answer again (same machine, time & cache):
(this is now redundant in the light of the benchmarking performed above).
Ran my own benchmark again 15 times - typically times were sub-millisecond with the occasional (approx. 1 in 3/4) run taking approx. 25 milliseconds.
Typical run:
So, it would appear that my solution's worst times are ~ 200 times faster than the fastest of the rest of the pack's answers (Colin 't Hart).
My analysis is that there is no perfect solution, but the best one appears to be the adaptation of Colin 't Hart's solution.
Finally, a GRAPHIC demonstration of the problem associated with using this solution for more than one record is shown below - taking a sample of 25 records (performed several times - typical run shown).
The tsm_system_rows method will produce 25 sequential records. This may be suitable for certain purposes where the fact that the random sample is a number of sequential records isn't a problem, but it's definitely worth keeping in mind.
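The demonstration query isn't reproduced here; a sketch of the sort of command used (table as sketched above):

SELECT seq
FROM rand TABLESAMPLE SYSTEM_ROWS(25);   -- 25 "random" rows - typically sequential seq values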
Time: 29.348 ms
A similar state of affairs pertains in the case of the SYSTEM_TIME method. As mentioned above, even with a minimum time of 1μs, it gives 120 records. Just as with SYSTEM_ROWS, these give sequential values of the PRIMARY KEY.
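Running, for example, a query along these lines (a sketch, at the minimum quantum of 1μs):

SELECT seq
FROM rand TABLESAMPLE SYSTEM_TIME(0.001);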
returns:
Our sister site, StackOverflow, treated this very issue here. Good answers are provided by (yet again) Erwin Brandstetter here and Evan Carroll here. That whole thread is worth reading in detail - since there are different definitions of random (monotonically increasing/decreasing, pseudorandom number generators...) and sampling (with or without replacement...).