I have a messages
table with some fields like id
, version
, timestamp
I'm trying to insert and retrieve messages with real-time deduplication logic, implemented in stored procedures.
- On insert, if a message with given
id
already exists in the table, I invoke a complex deduplication logic (a number of IF/ELSE statements) and then update messages'sversion
,timestamp
, etc. Otherwise I insert a new row. - On retrieve, I search for the oldest message and not just select it, but also update a few fields (changing
state
, settingtransient_token
, etc)
The problem is, it does not work as fast as I need: ~ 200 rps on insert and the same on retrieve
Separately my stored procedures work good, but when I execute both of them concurrently, they start fighting for locks under load > 30 rps, performance dramatically decreases and deadlocks appear periodically
Are there some best practices to solve my problem? I feel that the whole approach sucks, not my concrete implementation.
May be I should use some temp buffer table for all incoming messages and perform dedup logic afterwords? I would appreciate any ideas.
* I can't use nosql DB in my case (management decision), I need to gain max performance with Mysql/MariaDb
Best Answer
Good news. There is a SQL command usually called MERGE or UPSERT that basically does this.
See this page for more information: https://dev.mysql.com/doc/refman/5.7/en/insert-on-duplicate.html
You probably want something like: