Mysql – Fetch one value from random row where value is not x

innodbjoin;MySQLrandom

I have a games table in my db, which has id column(pk) and slug column(unique) – other columns are not relevant. Now, what I'd like to get is one row, or it's slug value only, which would be random row, but the slug must not match myslug. I've come up with:

SELECT slug, nameslug FROM games
JOIN
  (SELECT ROUND(MIN(id) + RAND() * (MAX(id) - MIN(id))) AS pk
   FROM games) AS rand
WHERE id >= rand.pk AND slug <> '20piaedfl6ah'
LIMIT 1;

but it sometimes returns false as it doesn't check if that row's slug column value is the same at the point where it searches for random row.

Best Answer

At a glance, I'm not convinced that your original query is too far from wrong.

Unless, I'm greatly mistaken, the only time it should be unable to return a results is when the target slug happens to have the MAX(id) as its id, and when that row is the only valid row to consider. If the only id greater than or equal to the randomized id value you've calculated is the one eliminated by the other WHERE condition, there are no rows to return.

If that's the case, then something like this may work:

SELECT slug, id, rand.pk FROM games
JOIN
  (SELECT ROUND(MIN(id) + RAND() * (MAX(id) - MIN(id))) AS pk
   FROM games WHERE slug <> '20piaedfl6ah') AS rand
WHERE slug <> '20piaedfl6ah'
  AND id >= rand.pk
LIMIT 1;

Here's a fork of @Vérace's DBFiddle with your original query and my modification of it. I made sure that the row to avoid has the highest ID

Here, we've ensured that neither end of our range matches our row to avoid. As long as that's true, then we should be OK with any value in our range. If the rand.pk matches the slug to avoid (or matches no rows at all - I assume there may be gaps in the ID sequence), then we know there's at least one row available that's OK - the one that matches our MAX(id)value (which cannot be either a gap row, or the row to avoid).

In numerous runs of the DBfiddle, I never saw the second query (which is the one protected against not finding a match) take longer than the first - however, I must note that the time to set-up the schema is included there.

And, of course, if there's only one row available, and its slug matches, you would still find no rows - however, I'd hold that in that case, finding no rows is the correct behavior.

UPDATE 2014-01-23 11:01 EST

The reason it works because of the goal I set for the query : Retrieve one value.

First look at the UNION

    SELECT id FROM tbl_play WHERE id = 1
    UNION
    (SELECT id FROM tbl_play WHERE gameid = 0 LIMIT 1)

The first SELECT can return at most one row.

The second SELECT can return at most one row because of the LIMIT 1.

In the worst case scenario, the UNION will have two rows.

Now look at the whole query:

SELECT id FROM
(
    SELECT id FROM tbl_play WHERE id = 1
    UNION
    (SELECT id FROM tbl_play WHERE gameid = 0 LIMIT 1)
) A ORDER BY id LIMIT 1;

The outer part of the query will receive either 0, 1, or 2 rows.

Then, the part ORDER BY id LIMIT 1 forces the outer query to choose the first value.

Postgresql – Self-join unique pairs

Every row of the result depends on the previous row. A ~~recursive CTE~~ comes to mind, I tried that. But one would need to refer to the worktable in an OUTER JOIN or a subquery expression which is not allowed. This does not work (building on the table layout in my fiddle):

WITH RECURSIVE
   t0 AS (SELECT *, COALESCE(array_length(opp_log,1), 0) AS len FROM tbl)
,  t1 AS (
   SELECT t1.player_id AS pl, t2.player_id AS p2
         ,t1.len AS len1, t2.len AS len2
   FROM   t0 t1, t0 t2 
   WHERE  t2.player_id <> t1.player_id
   AND    t2.player_id <> ALL (t1.opp_log)
   )
, cte AS (
   (
   SELECT pl, p2
   FROM   t1
   ORDER  BY len1 DESC, len2 DESC
   LIMIT  1
   )

   UNION ALL
   (
   SELECT pl, p2
   FROM   t1
   LEFT   JOIN cte c ON t1.p1 IN (c.p1, c.p2)
                        OR t1.p2 IN (c.p1, c.p2)
   WHERE  c.p1 IS NULL
   ORDER  BY len1 DESC, len2 DESC
   LIMIT  1
   )
   )
SELECT *
FROM   cte;

> ERROR:  recursive reference to query "cte" must not appear within an outer join

I don't think there is a halfway decent way to solve this with pure SQL. I suggest:

Procedural solution with PL/pgSQL

CREATE OR REPLACE FUNCTION f_next_round()
  RETURNS TABLE (player_id1 int, player_id2 int) AS
$func$
DECLARE
   rows int := (SELECT count(*)/2 FROM tbl);  -- expected number of resulting rows
   ct   int := 0;                             -- running count
BEGIN

CREATE TEMP TABLE t ON COMMIT DROP AS         -- possible combinations
SELECT t1.player_id AS p1, t2.player_id AS p2
     , COALESCE(array_length(t1.opp_log,1), 0) AS len1
     , COALESCE(array_length(t2.opp_log,1), 0) AS len2
FROM   tbl t1, tbl t2 
WHERE  t2.player_id <> t1.player_id
AND    t2.player_id <> ALL (t1.opp_log)
AND    t1.player_id <> ALL (t2.opp_log)
ORDER  BY len1 DESC, len2 DESC;               -- opportune sort order

LOOP
   SELECT INTO player_id1, player_id2  p1, p2 FROM t LIMIT 1;

   EXIT WHEN NOT FOUND;
   RETURN NEXT;
   ct := ct + 1;                              -- running count

   DELETE FROM t                              -- remove obsolete pairs
   WHERE  p1 IN (player_id1, player_id2) OR 
          p2 IN (player_id1, player_id2);
END LOOP;

IF ct < rows THEN
   RAISE EXCEPTION 'Could not find a solution';
ELSIF ct > rows THEN
   RAISE EXCEPTION 'Impossible result!';
END IF;

END
$func$  LANGUAGE plpgsql VOLATILE;

How?

Build a temporary table with remaining possible pairs. This kind of cross join produces a lot of rows with big tables, but since we seem to be talking about tournaments, numbers should be reasonably low.

Players with the longest list of opponents are sorted first. This way, players that would be hard to match come first, increasing the chance for a solution.

Pick the first row and delete related pairings now obsolete. Do need to sort again. Logically any row is good, practically we get the player with the longest list of opponents first due to initial sort (which is not reliable without ORDER BY, but good enough for the case).

Repeat until no match is left.
Keep count and raise an exception if the count is not as expected. PL/pgSQL conveniently allows to raise an exception after the fact, which cancels any previous return values. Details in the manual.

Call:

SELECT * FROM f_next_round();

Result:

player_id1 | player_id2
-----------+-----------
1          | 7
2          | 3
4          | 8
5          | 6

SQL Fiddle.

Note

This does not guarantee to calculate the perfect solution. I just returns a possible solution and uses some limited smarts to improve the chances to find one. The problem is a bit like solving a Sudoku, really and is not trivially solved perfectly.

Best Answer

Related Solutions

MySQL Subquery returns more than one row

UPDATE 2014-01-23 11:01 EST

Postgresql – Self-join unique pairs

Procedural solution with PL/pgSQL

How?

Note

Related Question