Mysql – Update using the value from the closest row

MySQLmysql-5.6self-joinupdate

My table looks like this:

-----------------------------------------
   id  |  name  | housename    | houseno 
-----------------------------------------
   1   | John   | John's House |  2/22
-----------------------------------------
   2   | Rachel | John's House |  2/22
-----------------------------------------
   3   | Joy    | John's House |  
-----------------------------------------
   4   | Alex   | Alex's House |  
-----------------------------------------
   5   | Rachel | Alex's House |  3/22
-----------------------------------------
   6   | Ben    | Alex's House |  3/22
-----------------------------------------

I want to update missing houseno values by matching with previous or next rows with the same housename. Using the example above, for id 3 the houseno should be 2/22 (the houseno of John's house) and for id 4 it should be 3/22. The id column is the primary key and is not continuous.

Which of these two approaches – min(id) > id or id < id order by id desc limit 1 – would work fine for retrieving the previous row?

Update

Sorry for making the question not clear , DISTINCT cannot be used as there is a chance for the housename to recur . The table can be like the one below ……

-----------------------------------------
   id  |  name  | housename    | houseno 
-----------------------------------------
   1   | John   | John's House |  2/22
-----------------------------------------
   2   | Rachel | John's House |  2/22
-----------------------------------------
   3   | Joy    | John's House |  
-----------------------------------------
   4   | Alex   | Alex's House |  
-----------------------------------------
   5   | Rachel | Alex's House |  3/22
-----------------------------------------
   6   | Ben    | Alex's House |  3/22
-----------------------------------------
   7   | John Doe   | John's House |  
-----------------------------------------
   8   | Betty Doe | John's House |  4/22
-----------------------------------------
   9   | Sophy Doe    | John's House | 4/22  
-----------------------------------------

Is there a way to group rows without houseno with previous and next ?

Best Answer

The following approach will only work as expected if we can safely assume that rows with the same housename will have the same houseno where houseno is populated.

As the first step, you need to get the list of all distinct housename/houseno pairs where houseno has a value. If my assumption above is correct, you will get a result set where all housename values are unique, and each will have a corresponding houseno – a kind of reference table:

SELECT DISTINCT
  housename,
  houseno
FROM
  YourTable
WHERE
  houseno <> ''

Use the above as a derived table to join it back to the original table in the UPDATE statement, so that you can populate the missing values from the derived table:

UPDATE
  YourTable AS t
  INNER JOIN
  (
    SELECT DISTINCT
      housename,
      houseno
    FROM
      YourTable
    WHERE
      houseno <> ''
  ) AS ref ON t.housename = ref.housename
SET
  t.houseno = ref.houseno
WHERE
  t.houseno = '' OR t.houseno IS NULL
;

Note that rather than "fixing" your table like this, it would be better to normalise it by creating an actual reference table with columns housename and houseno that you would then join back to your data table whenever you need to retrieve houseno. Consequently, the houseno column in the original table would no longer be needed.

Related Solutions

MySQL 5.6 – Update Row Cell with 0 Value from Closest Non-Zero Row

SET @var :=0;
UPDATE records
SET time= (@var :=CASE WHEN time = '0' THEN @var ELSE time END)
ORDER BY id;

Fiddle

Add DESC after ORDER BY if you want to update time with the time value of the next id.

PostgreSQL – How to Access Row Value from UPDATE Subquery Function

When using a query of this form: UPDATE a SET columns=values FROM b WHERE ..., the b subquery is evaluated independently of a, and it's the WHERE block at the end of the query that is supposed to join the rows of a and b.

The query in the question attempts to join inside the subquery rather than in a WHERE clause at the uppermost level (in fact it doesn't have such a WHERE clause), but it's not allowed and that's what the error message indicates.

The SQL-standard way of updating each row with a correlated subquery would be:

UPDATE a SET (col1,col2) =
   (SELECT val1,val2 FROM b WHERE expression-needing-values-from-a)

but in older versions of PostgreSQL (9.4 or before), the manual explicitly tells that this form is not implemented. Starting with 9.5, it is implemented.

The typical way to emulate it with postgres 9.4 is to use a copy of table a in the subquery and join back to the updated table in the upper level's WHERE clause, like this:

 UPDATE a SET (col1,col2)=(s.col1,s.col2)
  FROM (SELECT a2.ctid,col1,col2 AS ctid_of_row FROM b, a AS a2 WHERE...) AS s
   WHERE s.ctid_of_row = a.ctid;

knowing that ctid is the system pseudo-column that indicates the physical location of the row version within its table. Alternatively, the primary key can be used if there is one.

In your case, the transformation is not obvious because your subquery is meant to produce only one row with its ORDER BY... LIMIT 1 at the end. It should be redesigned to produce all the target rows in a single resultset, presumably (verticeid,cost,gid) where cost and gid are determined per offer.verticeid.

Update

Best Answer

Related Solutions

MySQL 5.6 – Update Row Cell with 0 Value from Closest Non-Zero Row

PostgreSQL – How to Access Row Value from UPDATE Subquery Function

Related Question