PostgreSQL JSON – Count Hours When Value Exceeds Threshold

json postgresql

I have JSON stored in a "data" column. The table also has an insert time, a primary key and a foreign key, as shown below:

PK  INSERT_TIME                     DATA   SENSOR
1   2020-01-04 15:39:44.19364+00    {}       7

The table is populated by an external application that pushes new data about once a minute. This is the JSON I typically have in the table:

{
   "dev":11,
   "kw":43.0,
   "lux":200.2,
   "temp":20.3
}

My goal is to build a query that, given all rows in the table, counts how many hours in total and how many hours per day the device was up and running and producing kW, that is, when the kW value is greater than zero (these measurements come from a solar inverter system).

This is the query I've built so far. It selects only the rows that have a kw value greater than 1 and restricts them to a given month, but I can't figure out how to calculate the production hours:

SELECT mt.insert_time, 
       date_part('month', mt.insert_time) as mon
FROM mytable mt 
WHERE sensor = 7 
  AND date_part('month', mt.insert_time) = 5 
  AND (mt.data->'kw')::int > 1

What I'm expecting:

I need the exact number of working hours of my solar panel production system, assuming that the solar panel "works" only when the kW value is greater than zero.

I want to see the working hours of my panel day by day and month by month.

Here is a sample of my data rows:

+-----------------------------------------------------------------------------------------------------+
|                                     pk,insert_time,data,sensor                                      |
+-----------------------------------------------------------------------------------------------------+
| 3003711,2020-10-03 09:55:54.271738+00,"{""dev"": 12, ""kw"": 0, ""temp"": 20.8, ""lux"": 2.0}",12   |
| 3003692,2020-10-03 09:54:54.289131+00,"{""dev"": 12, ""kw"": 0, ""temp"": 20.6, ""lux"": 2.0}",12   |
| 3003681,2020-10-03 09:53:54.287502+00,"{""dev"": 12, ""kw"": 9.0, ""temp"": 20.5, ""lux"": 2.0}",12 |
| 3003670,2020-10-03 09:52:54.284262+00,"{""dev"": 12, ""kw"": 9.0, ""temp"": 20.5, ""lux"": 2.0}",12 |
| 3003659,2020-10-03 09:51:56.382746+00,"{""dev"": 12, ""kw"": 12, ""temp"": 20.5, ""lux"": 2.0}",12  |
| 3003648,2020-10-03 09:50:54.279558+00,"{""dev"": 12, ""kw"": 9.0, ""temp"": 20.5, ""lux"": 2.0}",12 |
| 3003637,2020-10-03 09:49:56.377988+00,"{""dev"": 12, ""kw"": 9.0, ""temp"": 20.5, ""lux"": 2.0}",12 |
+-----------------------------------------------------------------------------------------------------+

The final result should be a query that, for a specific day, month, or year, gives me the number of minutes and hours during which the system was powered up and producing kW.

Like, for example:

  • Selecting the date 10/08/2020, I need the total production minutes and the corresponding hour count
  • Selecting the month of July (7), I need the total production minutes and the corresponding hour count
  • Selecting the year 2020, I need the total production minutes and the corresponding hour count

Best Answer

Q: This is the query I've built so far. It selects only the rows that have a kw value greater than 1 and restricts them to a given month, but I can't figure out how to calculate the production hours:

You need to get each measurement and the next measurement into a single row. This is typically what the lead window function does (lag is its counterpart for the previous row):

SELECT mt.insert_time,
       lead(mt.insert_time) OVER w AS next_time,
       (mt.data->>'kw')::numeric AS curr_value,  -- numeric, because text such as '9.0' cannot be cast to int
       ((lead(mt.data) OVER w)->>'kw')::numeric AS next_value
FROM mytable mt 
WHERE sensor = 7 
WINDOW w AS (ORDER BY pk)  -- or ORDER BY insert_time if you prefer

Once both the current and next measurements are in the same row, the duration between them is next_time - insert_time. You can then sum these durations, filtering out the rows where the kw value crossed the threshold (from above 1 to below 1 or the opposite) or where both values were below 1.
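As a rough sketch of that last step, the pairing query can be wrapped in a CTE and the intervals summed per day. This is only one way to do it and rests on assumptions: the table and column names are taken from the question, the threshold is kw > 0 (the question's stated goal, rather than the 1 used in its query), rows are ordered by insert_time, lead() is used to fetch the following row, and the output column names (production_time, production_minutes, production_hours) are made up for illustration.

```sql
-- Sketch only: mytable, insert_time, data and sensor come from the question;
-- the kw > 0 threshold and the output column names are assumptions.
WITH paired AS (
    SELECT mt.insert_time,
           lead(mt.insert_time) OVER w AS next_time,
           (mt.data->>'kw')::numeric AS curr_value,
           ((lead(mt.data) OVER w)->>'kw')::numeric AS next_value
    FROM mytable mt
    WHERE mt.sensor = 7
    WINDOW w AS (ORDER BY mt.insert_time)
)
SELECT date_trunc('day', insert_time) AS day,
       -- total producing time as an interval
       sum(next_time - insert_time) AS production_time,
       -- the same total expressed in whole minutes and in decimal hours
       round(extract(epoch FROM sum(next_time - insert_time)) / 60) AS production_minutes,
       round((extract(epoch FROM sum(next_time - insert_time)) / 3600)::numeric, 2) AS production_hours
FROM paired
WHERE curr_value > 0   -- producing at the start of the interval
  AND next_value > 0   -- and still producing at the end; also drops the last row, whose next_value is NULL
GROUP BY 1
ORDER BY 1;
```

Replacing 'day' with 'month' or 'year' in date_trunc gives the monthly or yearly totals, and a condition such as insert_time >= '2020-07-01' AND insert_time < '2020-08-01' inside the CTE limits the result to one period. Note that date_trunc on a timestamptz uses the session time zone, so day boundaries follow that setting.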