Mysql – Inserting 30k prices each minute, how can I skip an entry if the price did not change

database-designMySQLperformancequery-performance

I receive around 30k price updates for various articles every minute and I need to write the new prices into our database (Which is a MySQL Database running InnoDB)

The table containing the prices right now has only three columns: Article_id, InsertTime, and ArticlePrice

My Problem is: I do not want to insert duplicate prices, but I want to store historical data. So when the price goes up or down I insert it. When it stays the same I skip it. Since I receive a lot of data I can not query first and then decide if I can insert or not, as this approach will be too time consuming.

Can this be solved on a database level? For example by adding a time column and using it as a primary key?

I also thought of solving it in the application itself, for example by keeping a List of the previous prices and compare them there before inserting into the database, but I want to guarantee a consistent database and this seems complicated.

Any help is appreciated.

Best Answer

You have 30K new records for the 'current' minute; put them in a temp table; include PRIMARY KEY(ArticleId).

Note: History will need PRIMARY KEY(ArticleId, InsertTime).

You need to find the 'latest' existing price for each of the 30K articles in the existing history table. But first, you need to find the latest InsertPrice for each article in this table: (This is 'select#1')

SELECT ArticleId, MAX(InsertTime) AS InsertTime
    FROM History
    GROUP BY ArticleId

Now back to the latest price (select#2):

SELECT ArticleId, ArticlePrice
    FROM ( select#1 ) t1
    JOIN History  USING(ArticleId, InsertTime)

Now let's find the changed prices (select#3):

SELECT temp.ArticleId, temp.LastInsert, temp.ArticlePrice
    FROM ( select#2 ) t2
    LEFT JOIN temp  USING(ArticleId)
    WHERE temp.ArticlePrice != t2.ArticlePrice

I may have also allowed for 'new' Articles to arrive. (But I am not sure.)

Now, let's copy the new Prices over:

INSERT INTO History (ArticleId, InsertTime, ArticlePrice)
    select#3;