Sql-server – SQL Server Merge vs MySQL Insert On Duplicate Key performance comparison

mergeMySQLperformancesql server

I've been searching a lot for this kind of comparison or some sort of performance balance.

What I do know so far is this:

  1. MERGE is T-SQL command that is already implemented on SQL Server. As far as I know, it does perform very well, since it uses "kind of like" INNER JOIN hash mapping for inserting and updating, but I've been having some issues when deleting on the same MERGE statement as well when using CLUSTERED INDEX. Besides that, it's very fast and make let me do some comparison clauses before updating or inserting something, so it's very flexible for me.

  2. In SQL Server some specific scenarios would rather use UPDATE and INSERT as separate statements and I wouldn't argue that. In my short experience, I would use MERGE by default as a standard for my code.

  3. Now as a DBA, I'm facing a new challenge, I have to manage MySQL servers as well, so I ended up looking for similar MERGE behavior in MySQL to improve performance of the queries. So far, I found nothing alike but INSERT … ON DUPLICATE KEY … UPDATE. Still, I have some performance questions, since I'm not sure how it behaves with the server, how does it works and if optional pair of statements would work better or faster than that.

  4. Looking around in MySQL, I found UPDATE + INSERT IGNORE, UPDATE + INSERT, INSERT … REPLACE, and so on..

  5. MySQL documentation is a bit confusing when trying to determinate if I could use some other clauses ex. in MERGE statement I could use AND (TARGET.COLUMN_X > 'VALUE'):

    MERGE _TABLE A_ AS TARGET
    WITH _TABLE B_ AS SOURCE
    ON (TARGET.KEY = SOURCE.KEY)
    WHEN MATCHED AND (TARGET.COLUMN_X > 'VALUE')
      UPDATE
        TARGET.COLUMN_A = SOURCE.COLUMN_A
    ...
    

    I don't find how to do this on MySQL.

*CONSIDERATIONS

I have to manage this to achieve better time result and performance friendly.

What I have as setting:

  • PERCONA MySQL

    • txt file that need to be uploaded to a table every moth with new data and some changes (this is why I'm looking for a MERGE like statement)
    • InnoDB MySQL Engine
    • Relational Database tables so I can't delete or truncate the target table because all of them are related.

Best Answer

IODKU is the best of the options in MySQL. It works something like this:

  1. Use some UNIQUE key (possibly the clustered, unique, PRIMARY KEY) to locate the row to modify.
  2. If no such row, perform an INSERT.
  3. If such a row, perform an UPDATE.

You can't get faster than that.

Note that step 1 will cache everything that is needed for #2 or #3 in the buffer_pool. (Well, OK, secondary indexes are handled in a 'delayed' way via the "Change Buffer".)

Further note that the statement is atomic, whereas your 2-statement alternatives need to be in a transaction.

Keep in mind that REPLACE is DELETE (0, 1, or possibly multiple rows, if you have multiple unique keys), then INSERT. Note that the AUTO_INCREMENT (if used) value is thrown away and a new one is created.

As for that messy query, it can probably be done with GREATER() and/or IF() and/or VALUES(). Assuming that you are trying to merge multiple rows in, you need the IODKU+SELECT syntax:

INSERT INTO Target (key, a)
    ON DUPLICATE KEY UPDATE
        SET a = VALUES(a) 
    SELECT Target.key, Target.a
        FROM Target
        JOIN Source ON Source.key = Target.key
        WHERE Target.x > 'value';

Since I don't know what MERGE does, I can't give you all the details.