Sql-server – Storing Live Timeseries Data in SQL Server Azure Database

columnstoredata-warehouseindex-tuningsql server

I have an SQL Server Azure Database and am trying to store time-series data that I get from multiple sensors.

Data Source:- 5-minute data is obtained via an API.

Current table structure: –

 Timestamp | ComponentId | Parameter1 | Parameter2 | Parameter3

Each sensor has a unique ComponentId. I have a non-clustered index on Timestamp and ComponentId to eliminate duplicates and also a clustered column store index on the whole table (It compresses data and saves space. Also gives a performance boost for aggregate queries). A python script is used to fetch the data via the API and pyodbc library is used to push this data to the table. The script runs every 10 minutes and inserts data into the table.

Some queries like fetching data for a particular component for just one day seem to take 5 seconds. Is this normal?

QUERY:-

SELECT Timestamp,ComponentId,ParameterId FROM TABLE WHERE ComponentId=1 AND Timestamp BETWEEN '2021-05-01' AND '2021-05-02'

Execution Plan

IO Stats: Table 'RawData'. Scan count 2, logical reads 10164, physical reads 3, page server reads 0, read-ahead reads 10146, page server read-ahead reads 0, lob logical reads 2766, lob physical reads 12, lob page server reads 0, lob read-ahead reads 4877, lob page server read-ahead reads 0.
Table 'RawData'. Segment reads 2, segment skipped 10.

Is this way of fetching/pushing data fine? Please let me know if it can be improved and if there are any better methods to do the same.

Best Answer

With that design it is expected that the query requires a full scan of the columnstore. That should be very fast at that scale, but the IO stats show that you're reading from disk. If you look at the wait stats for the actual execution plan you should see that it's more IO waits than CPU utilization.

You could scale up the database to get more cache memory, or perhaps partition the columnstore either by componentId or timestamp. Whenever you partition a columnstore you want to ensure that partitions have at least 1M rows.