Sql-server – How to assign different random values to each row in a SELECT statement

sql serversql-server-2005

Please look at this code:

create table #t1(
  id int identity (1,1),
  val varchar(10)
);


insert into #t1 values ('a');
insert into #t1 values ('b');
insert into #t1 values ('c');
insert into #t1 values ('d');

Now, whenever you execute this

select *, 
    ( select top 1 val from #t1 order by NEWID()) rnd 
from #t1 order by 1;

you will get a result with where all rows have the same random value. e.g.

id          val        rnd
----------- ---------- ----------
1           a          b
2           b          b
3           c          b
4           d          b

I know a way using a cursor to loop throw the rows and get different random values, but that is not performant.

A clever solution to this is

select t1.id, t1.val, t2.val
from #t1 t1
    join (select *, ROW_NUMBER() over( order by NEWID()) lfd from #t1) as t2 on  t1.id = t2.lfd 

But I simplified the query. The real query looks more like

select *, 
    ( select top 1 val from t2 where t2.x <> t1.y order by NEWID()) rnd 
from t1 order by 1;

and the simple solution doesn't fit. I'm looking for a way to force repeated evaluation of

( select top 1 val from #t1 order by NEWID()) rnd 

without the use of cursors.

Edit:
Wanted output:

perhaps 1 call

id          val        rnd
----------- ---------- ----------
1           a          c
2           b          c
3           c          b
4           d          a

and a second call

id          val        rnd
----------- ---------- ----------
1           a          a
2           b          d
3           c          d
4           d          b

The value for each row just should be a random value independent from the other rows

Here is the cursor version of the code:

CREATE TABLE #res ( id INT, val VARCHAR(10), rnd VARCHAR(10));

DECLARE @id INT
DECLARE @val VARCHAR(10)
DECLARE c CURSOR FOR
SELECT id, val
FROM #t1
OPEN c
FETCH NEXT FROM c INTO @id, @val
WHILE @@FETCH_STATUS = 0
BEGIN
    INSERT INTO #res
    SELECT @id, @val, ( SELECT TOP 1 val FROM #t1 ORDER BY NEWID()) rnd 
    FETCH NEXT FROM c INTO @id, @val
END
CLOSE c
DEALLOCATE c

SELECT * FROM #res

Best Answer

A subquery is evaluated once if possible. I can't recall what the "feature" is called (folding?) sorry.

The same applies to GETDATE and RAND functions. NEWID is evaluated row by row because it in intrinsically a random value and should never generate the same value twice.

The usual techniques are to use use NEWID as input to CHECKSUM or as a seed to RAND

For random values per row:

SELECT
   co1l, col2,
   ABS(CHECKSUM(NEWID())) AS Random1,
   RAND(CHECKSUM(NEWID())) AS Random2
FROM
   MyTable

If you want random order:

SELECT
   co1l, col2
FROM
   MyTable
ORDER BY
   NEWID()

If you want random order with a row order too. ActualOrder order here is preserved regardless of the order of the resultset

SELECT
   id, val,
   ROWNUMBER() OVER (ORDER BY id) AS id
FROM
   #t1
ORDER BY
   NEWID()

Edit:

In this case, we can state the requirement as:

  1. return any random value from the set for each row in the set
  2. the random value will be different from the actual value in any row

This is different to what I offered above which simply re-orders rows in various ways

So, I'd consider CROSS APPLY. The WHERE clause force row by row evaluation and avoids the "folding" issue and ensures that val and rnd are always different. CROSS APPLY can scale quite well too

SELECT
   id, val, R.rnd
FROM
   #t1 t1
   CROSS APPLY
   (SELECT TOP 1 val as rnd FROM #t1 t2 WHERE t1.val <> t2.val ORDER BY NEWID()) R
ORDER BY
   id