SQL Server – Computed Column Update when Table Value Changes

database-design sql-server

I have environmental sensors collecting some raw data from environmental stations. In addition to the sensors, there is a physical gauge measuring water height (staff gauge).

I'm trying to set up the staff gauge calculation, so that it will take the most recent offset reading from Table 2 and apply it to any new measurements coming in, without changing the previously calculated values.

Table 1

station_time
Air_temp_celcius
Rainfall_mm
pressure_mH20
water_temp_celcius
sensor_depth_mH20
staff_gauge_height_m (calculated, sensor_depth_mH20 + Offset from Table2)

Table 2

Observed_time
Offset_m
comments

My thinking was to have a function to pull the offset

SELECT TOP (1) Offset_m
FROM Table2
ORDER BY Observed_time DESC

Is there a way to have it only update new values coming into Table 1 without updating previous values when a new offset is entered? New offsets are entered into Table 2 very sporadically as field crews visit the site.

Additional information from comments:

There is no direct relationship between the two tables, other than that the latest offset is used to calculate the staff gauge value. It should always use the most recent value.

I have station-specific tables and station-specific offset tables; I generalized the table structure here as table1_stream1 and table2_stream1_offset. As for storing the offset on the table: I wanted to avoid redundancy. I was always taught to normalize as much as possible and not to duplicate data in multiple tables if you can avoid it. I no longer work on coding and databases as much as I'd like, but some stuff sticks.

Are you saying that if you have a station called "Antarctica" and a station called "Africa", then you have the following tables: Antarctica_stream1, Antarctica_stream1_offset, Africa_stream1, Africa_stream1_offset, etc.?

Correct. Stations poll hourly, and not all locations carry the exact same sensor package (or the sensors are in a different order). Individual stations may also go offline due to technical or environmental issues.

Best Answer

Given the additional information, which Aaron Bertrand didn't have access to when he posted his answer, I would suggest a different tack.

Instead of putting logic/business significance in table names, I would use general table names and put that significance in the attributes/data in the tables. This should make it easier to expand functionality and maintain your data. It also makes extracting useful information much easier.

The following is a rough schema that captures the direction I recommend and will probably need to be adapted to your exact needs:

CREATE TABLE dbo.WeatherStation
(
    WeatherStationId INT NOT NULL PRIMARY KEY IDENTITY(1,1),
    Name NVARCHAR(50) NOT NULL -- This is where you put the name of the station instead of in the table.
)

CREATE TABLE dbo.SensorReading
(
    SensorReadingId INT NOT NULL PRIMARY KEY IDENTITY(1,1),
    WeatherStationId INT NOT NULL FOREIGN KEY REFERENCES dbo.WeatherStation(WeatherStationId), -- Match a reading to the station
    ReportedTime DATETIME2(2) NOT NULL DEFAULT SYSUTCDATETIME(), -- When the reading was reported to the database
    <Other columns like temp, pressure, etc.>
)

CREATE TABLE dbo.SensorOffset
(
    SensorOffsetId INT NOT NULL PRIMARY KEY IDENTITY(1,1),
    WeatherStationId INT NOT NULL FOREIGN KEY REFERENCES dbo.WeatherStation(WeatherStationId), -- Match an offset to the station
    Offset DECIMAL(20, 10) NOT NULL, -- Adjust precision/datatype as needed
    Comment NVARCHAR(500) NULL,
    Created DATETIME2(2) NOT NULL DEFAULT SYSUTCDATETIME() -- This would need to be unique per weather station
)

Now you can add a new station without duplicating table schema, you can easily compare data from related stations, etc.
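As a minimal sketch of what adding a station looks like under this schema: it becomes a row insert rather than a new pair of tables. The station name, offset value, comment, and timestamp below are made-up sample data, not values from the question.

```sql
-- Hypothetical sample data: register a new station as a row, not as new tables.
INSERT INTO dbo.WeatherStation (Name)
VALUES (N'Stream 1');

-- Capture the identity value generated for the station just inserted.
DECLARE @StationId INT = SCOPE_IDENTITY();

-- Record the first staff gauge offset observed for that station.
INSERT INTO dbo.SensorOffset (WeatherStationId, [Offset], Comment, Created)
VALUES (@StationId, 0.35, N'Initial staff gauge survey', '2020-01-01T00:00:00');
```

Later field visits would add further rows to dbo.SensorOffset for the same WeatherStationId, each with its own Created time.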

Even if you don't want to or can't change your schema, I would recommend putting the calculation in a view. In my opinion that is more obvious than a trigger, and I would find it easier to troubleshoot. Something like the following should work with my schema above:

;WITH CurrentOffset_CTE AS
(
    SELECT
        WeatherStationId
        , MAX(Created) AS Created
    FROM dbo.SensorOffset
    GROUP BY
        WeatherStationId
)
SELECT
    WS.Name
    , SR.ReportedTime
    , CASE WHEN SR.<reading> IS NOT NULL THEN SR.<reading> + SO.Offset ELSE NULL END AS <reading>
    , <repeat same pattern as above for the various readings>
FROM dbo.WeatherStation WS
    INNER JOIN dbo.SensorReading SR ON SR.WeatherStationId = WS.WeatherStationId
    INNER JOIN CurrentOffset_CTE CO ON CO.WeatherStationId = WS.WeatherStationId
    INNER JOIN dbo.SensorOffset SO ON SO.WeatherStationId = CO.WeatherStationId AND SO.Created = CO.Created

This will be easy to troubleshoot, hard to miss, and obvious to future maintainers. You could modify this code to work for your current schema too, but would have to duplicate it for each station. In that case I would still recommend this approach for the above stated reasons.
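Note that the query above applies the single most recent offset to every reading, so previously calculated values change whenever a new offset is entered. If older readings must keep the offset that was in effect when they were reported, as the question asks, one possible variant looks up, per reading, the latest offset created at or before that reading's ReportedTime. This is a sketch under assumptions: the view name dbo.StaffGaugeReading and the column SensorDepth (standing in for sensor_depth_mH20) are hypothetical, and it assumes Created and ReportedTime are recorded on the same clock.

```sql
-- Sketch only: dbo.StaffGaugeReading and SR.SensorDepth are hypothetical names;
-- SensorDepth stands in for the sensor_depth_mH20 column from the question.
CREATE VIEW dbo.StaffGaugeReading
AS
SELECT
    WS.Name
    , SR.ReportedTime
    , SR.SensorDepth
    , SO.[Offset]
    , SR.SensorDepth + SO.[Offset] AS StaffGaugeHeight -- NULL if either part is NULL
FROM dbo.SensorReading SR
    INNER JOIN dbo.WeatherStation WS ON WS.WeatherStationId = SR.WeatherStationId
    OUTER APPLY
    (
        -- Latest offset recorded at or before this reading was reported
        SELECT TOP (1) O.[Offset]
        FROM dbo.SensorOffset O
        WHERE O.WeatherStationId = SR.WeatherStationId
            AND O.Created <= SR.ReportedTime
        ORDER BY O.Created DESC
    ) SO;
```

OUTER APPLY keeps readings that predate the first offset; they simply get a NULL StaffGaugeHeight. An index on dbo.SensorOffset (WeatherStationId, Created) would likely help the per-reading lookup, though that depends on data volume.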

Related Solutions

How to create a train schedule in SQLite3

I think you need a track_segment table to track the segments between stations. The problem with your Station table is that you include distance there, but distance to/from what?

So let's try this:

Train
-----
  ID
  Train_number
  Days_of_operation
  other_details

Station
-------
  ID
  Name
  other_data

Track_Segment
-------------
  From_Station_ID
  To_Station_ID
  Length
  Line_ID

Line
----
  ID
  Name
  other_data

Train_Run
---------
  ID
  Train_ID
  from_Station_ID
  to_station_ID
  Depart_datetime  
  Arrive_datetime
  other_details

This allows a Train_Run to reference a specific train, as well as the two endpoints of the journey. Of course, there might be multiple segments of track between these two stations, so the total length of the journey could be found by looking at all the segments between from_Station_ID and to_station_ID in Track_Segment. I also included the Line table, since you had a reference to it in your tables. I assume that a "Line" is just a descriptive term for a specific train journey that might have multiple stops along the way, like "Orient Express" or "Rocky Mountain Route".

Bus time schedule database design

Since you really only have to keep track of the times between stations on each route, you only need to store the start time of each route. The rest can be calculated easily by storing a time delta for each line stop (the time between the current station and the previous station), instead of keeping time data for each route stop. You also need to maintain the order of the stops on the line. If it's a circular route, you simply put the stops into the chain twice with different ordering numbers (so each stop on a circular route is inserted twice into the line_stops relation table with a different order number).

You can of course keep the time for each stop if you want, but that seems redundant and makes it harder to add stops to a route, since you'd then have to recalculate all bus stop times, instead of simply adding the new stop and updating the time delta of the next stop.

I'd probably start with a data model like this (but of course this needs to be expanded if you want to add information about the buses, drivers, etc.):

lines (id, name, ...)

routes (id, name, line_id, ...)

stops (id, location)

line_stops (id, line_id, stop_id, order, time_delta)

route_start_times (id, route_id, start_time)
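To illustrate how the stop times fall out of this model, here is a rough sketch that derives each scheduled stop time as the route's start time plus the running total of time_delta along the line. It is written in T-SQL (SQL Server 2012 or later) and makes assumptions the answer does not state: time_delta is a whole number of minutes since the previous stop (0 for the first stop), start_time is a TIME or DATETIME value, and the order column is bracketed because ORDER is a reserved word.

```sql
-- Sketch only. Assumes time_delta is whole minutes since the previous stop
-- (0 for the first stop on the line) and start_time is a TIME or DATETIME value.
SELECT
    R.name AS route_name
    , RST.start_time
    , LS.[order] AS stop_order
    , S.location
    , DATEADD(
          MINUTE
          -- Running total of deltas from the first stop up to this one
          , SUM(LS.time_delta) OVER (
                PARTITION BY RST.id
                ORDER BY LS.[order]
                ROWS UNBOUNDED PRECEDING)
          , RST.start_time
      ) AS scheduled_time
FROM route_start_times RST
    INNER JOIN routes R ON R.id = RST.route_id
    INNER JOIN line_stops LS ON LS.line_id = R.line_id
    INNER JOIN stops S ON S.id = LS.stop_id
ORDER BY RST.id, LS.[order];
```

Adding a stop then only means inserting one line_stops row and adjusting the time_delta of the stop that follows it; the scheduled times are recomputed by the query.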