Every row of the result depends on the previous row. A recursive CTE comes to mind, I tried that. But one would need to refer to the worktable in an OUTER JOIN
or a subquery expression which is not allowed. This does not work (building on the table layout in my fiddle):
WITH RECURSIVE
t0 AS (SELECT *, COALESCE(array_length(opp_log,1), 0) AS len FROM tbl)
, t1 AS (
SELECT t1.player_id AS pl, t2.player_id AS p2
,t1.len AS len1, t2.len AS len2
FROM t0 t1, t0 t2
WHERE t2.player_id <> t1.player_id
AND t2.player_id <> ALL (t1.opp_log)
)
, cte AS (
(
SELECT pl, p2
FROM t1
ORDER BY len1 DESC, len2 DESC
LIMIT 1
)
UNION ALL
(
SELECT pl, p2
FROM t1
LEFT JOIN cte c ON t1.p1 IN (c.p1, c.p2)
OR t1.p2 IN (c.p1, c.p2)
WHERE c.p1 IS NULL
ORDER BY len1 DESC, len2 DESC
LIMIT 1
)
)
SELECT *
FROM cte;
> ERROR: recursive reference to query "cte" must not appear within an outer join
I don't think there is a halfway decent way to solve this with pure SQL. I suggest:
Procedural solution with PL/pgSQL
CREATE OR REPLACE FUNCTION f_next_round()
RETURNS TABLE (player_id1 int, player_id2 int) AS
$func$
DECLARE
rows int := (SELECT count(*)/2 FROM tbl); -- expected number of resulting rows
ct int := 0; -- running count
BEGIN
CREATE TEMP TABLE t ON COMMIT DROP AS -- possible combinations
SELECT t1.player_id AS p1, t2.player_id AS p2
, COALESCE(array_length(t1.opp_log,1), 0) AS len1
, COALESCE(array_length(t2.opp_log,1), 0) AS len2
FROM tbl t1, tbl t2
WHERE t2.player_id <> t1.player_id
AND t2.player_id <> ALL (t1.opp_log)
AND t1.player_id <> ALL (t2.opp_log)
ORDER BY len1 DESC, len2 DESC; -- opportune sort order
LOOP
SELECT INTO player_id1, player_id2 p1, p2 FROM t LIMIT 1;
EXIT WHEN NOT FOUND;
RETURN NEXT;
ct := ct + 1; -- running count
DELETE FROM t -- remove obsolete pairs
WHERE p1 IN (player_id1, player_id2) OR
p2 IN (player_id1, player_id2);
END LOOP;
IF ct < rows THEN
RAISE EXCEPTION 'Could not find a solution';
ELSIF ct > rows THEN
RAISE EXCEPTION 'Impossible result!';
END IF;
END
$func$ LANGUAGE plpgsql VOLATILE;
How?
Build a temporary table with remaining possible pairs. This kind of cross join produces a lot of rows with big tables, but since we seem to be talking about tournaments, numbers should be reasonably low.
Players with the longest list of opponents are sorted first. This way, players that would be hard to match come first, increasing the chance for a solution.
Pick the first row and delete related pairings now obsolete. Do need to sort again. Logically any row is good, practically we get the player with the longest list of opponents first due to initial sort (which is not reliable without ORDER BY
, but good enough for the case).
Repeat until no match is left.
Keep count and raise an exception if the count is not as expected. PL/pgSQL conveniently allows to raise an exception after the fact, which cancels any previous return values. Details in the manual.
Call:
SELECT * FROM f_next_round();
Result:
player_id1 | player_id2
-----------+-----------
1 | 7
2 | 3
4 | 8
5 | 6
SQL Fiddle.
Note
This does not guarantee to calculate the perfect solution. I just returns a possible solution and uses some limited smarts to improve the chances to find one. The problem is a bit like solving a Sudoku, really and is not trivially solved perfectly.
According to the docs PL/pgSQL Under the Hood, you can use the configuration parameter plpgsql.variable_conflict
, either before creating the function or at the start of the function definition, declaring how you want such conflicts to be resolved.
The 3 possible settings are error
(the default), use_variable
and use_column
:
CREATE OR REPLACE FUNCTION f_merge_foobar()
RETURNS TABLE(ts int, foo text, bar text)
LANGUAGE plpgsql AS
$func$
#variable_conflict use_column -- how to resolve conflicts
BEGIN
FOR ts, foo, bar IN
SELECT ts, f.foo, b.bar
FROM foo f
FULL JOIN bar b USING (ts)
LOOP
-- do something
RETURN NEXT;
END LOOP;
END
$func$;
Best Answer
In other RDBMS (like SQL Server before 2008 - as per Paul's comment) one might cross join to a subquery with
UNION ALL SELECT
, but there are more convenient and efficient options in Postgres.And you don't need a CTE for this. You can use it, but it has no performance benefit.
Provide a set with
VALUES
:Provide an array and
unnest()
2a. with an array constructor:
2b. With an array literal:
Add
ORDER BY i, m.col1
if you need the sort order in your result.About row and array syntax: