There are situations where an InnoDB deadlock comes up when you least expect it. For example, SELECT queries can acquire locks on the gen_clust_index, a.k.a. the clustered index.
Here are three past questions I aggressively looked over with @RedBlueThing, the person who asked them. @RedBlueThing found workarounds for each of these questions.
Just to keep your question in perspective, when you look over these answers (don't look too deeply; even I get dizzy looking at my own convoluted answers), it should be quickly apparent that even SELECT queries can lock InnoDB data. Although autocommit may be enabled (rendering each query its own transaction), a single SQL statement can still fall victim to a deadlock.
You also have special cases of SELECT where you can lock specific rows on demand.
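MySQL's locking-read syntax is what makes this possible: a plain SELECT can take row locks explicitly inside a transaction. A minimal sketch (the `accounts` table, database, and credentials here are hypothetical):

```shell
# FOR UPDATE takes exclusive (X) locks on the matched rows;
# LOCK IN SHARE MODE takes shared (S) locks instead.
mysql -u app -p mydb <<'SQL'
START TRANSACTION;
SELECT balance FROM accounts WHERE id = 1 FOR UPDATE;
SELECT balance FROM accounts WHERE id = 2 LOCK IN SHARE MODE;
COMMIT;
SQL
```

The locks are held until COMMIT or ROLLBACK, so two sessions taking locking reads on the same rows in opposite order can deadlock each other.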
Based on the InnoDB Deadlocking link, the sequence of events causing this situation could theoretically be as follows:
- Your SQL UPDATEs a single row but generates an error
- The UPDATE causes a rollback of the one row
- The row has a lingering lock
Personally, that last statement scares me. It would have been nice for MySQL to warn everyone about this quirk prominently instead of just documenting it and hoping you Google for it. Yet that statement is from the MySQL documentation. (Oh yeah, Oracle owns InnoDB.)
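When you suspect a deadlock or a lingering lock, the usual first stop is the InnoDB monitor output. A diagnostic sketch (assumes a 5.x server where the information_schema lock tables are available):

```shell
# Dump the most recent deadlock recorded by InnoDB:
mysql -e "SHOW ENGINE INNODB STATUS\G" \
  | sed -n '/LATEST DETECTED DEADLOCK/,/^TRANSACTIONS/p'

# Who is blocking whom right now:
mysql -e "SELECT * FROM information_schema.INNODB_LOCK_WAITS\G"
```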
I have developed software with similar functionality, the ability to use the MySQL replication stream (binary log, binlog) to capture events in near-real-time in response to inserts/updates/deletes in the database.
Here are some observations I've made regarding performance. Luckily, the potential hot spots are largely independent of each other.
I will assume, since I was unfamiliar with the Node package you cited and have only just now given their code a cursory review, that they are not actually "tailing" the binlog via polling, but are actually emulating a slave/replica server and connecting to the master and requesting the replication stream.
The first potential bottleneck is the master's ability to write the amount of binlog data required (I/O throughput is the primary constraint). If your master is already logging in ROW format, then this problem is already solved. If not, then switch your binlog format and see. I prefer ROW format anyway, because it's very useful for data recovery when queries go bad or the application does something to the data it shouldn't have. It is possible (using 3rd-party tools) to capture what happened and reverse it -- in the default configuration, when a delete occurs (for example) the deleted data is actually written to the binary log.
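Checking and switching the format looks like this (SET GLOBAL requires the SUPER privilege and only affects sessions started afterward; put it in my.cnf to survive restarts):

```shell
mysql -e "SHOW GLOBAL VARIABLES LIKE 'binlog_format';"
mysql -e "SET GLOBAL binlog_format = 'ROW';"

# To make it permanent, add to my.cnf:
#   [mysqld]
#   binlog_format = ROW
```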
The next point of resource consumption is the slave connection made by such a tool to the master, where the master pushes the data. A common misconception is that a slave "polls" the master. In fact, the slave initiates the connection, but the master pushes the data. This is actually a load on the master that has very little impact on performance when the number of slaves connected is small (say, 5 or fewer). This load can be eliminated from the master entirely by connecting the "binlog tailer" not to the master, but to an existing slave of the master, with log_slave_updates configured.
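A sketch of the relevant configuration on that intermediate slave (the server_id value is an example; changing these settings requires a mysqld restart):

```shell
cat <<'EOF' >> /etc/my.cnf   # append on the intermediate slave
[mysqld]
server_id         = 3          # must be unique within the replication topology
log_bin           = mysql-bin  # the slave writes its own binlog ...
log_slave_updates = 1          # ... including events it replicated from the master
EOF
```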
Transport of the data from the master to the pseudo-slave can eat significant network bandwidth, so your external utility should support the MySQL client/server compression protocol to reduce this bandwidth. Enabling this capability can achieve compression ratios of 10:1 depending on the payload.
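With the stock tools, compression is negotiated as a capability flag in the connection handshake; a custom binlog tailer would set the same flag. For the built-in clients and replication threads (host and user below are examples):

```shell
# Any mysql client connection, compressed:
mysql --compress -h master.example.com -u repl -p

# Built-in replication: compress the master->slave stream (set on the slave):
mysql -e "SET GLOBAL slave_compressed_protocol = 1;"
```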
The final pain point is the external utility itself. The MySQL binlog format is a very tightly packed binary format (hence "binary log") that must be parsed and decoded. The efficiency with which the external utility can unpack and manipulate this data stream will determine how close to real time the detected events can be emitted, since inefficient code will cause your decoded event stream to lag further and further behind the master, though this factor won't have any performance impact on the master server itself.
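You can observe exactly this kind of pseudo-slave connection using mysqlbinlog itself, which (in MySQL 5.6+) can request the stream remotely instead of reading a local file (host, user, and binlog file name here are examples):

```shell
# Connect as a pseudo-slave and follow the master's binlog in near-real-time:
mysqlbinlog --read-from-remote-server \
    --host=master.example.com --user=repl --password \
    --stop-never mysql-bin.000001
```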
In short, if your master can handle the workload of generating row-format binlogs for the volume of traffic you expect, the rest of the potential issues are still potential issues, but they should have no meaningful performance implications on the master server itself.
Best Answer
After some research and further digging into the MySQL documentation, it turned out to be an error on my part.
I was using the

mysqlbinlog -s

option to view the binlog. Because the replication statements come over as row-based replication, it is not possible to see the actual statements. See Advantages and Disadvantages of Statement-Based and Row-Based Replication.
So when trying to view the binlog using the

mysqlbinlog -s

utility, it doesn't decode this blob of information, and hence it shows up as a blank/empty commit. Hope this helps someone else who runs into a similar situation.
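For the record, the row events are still in the binlog; -s (--short-form) simply suppresses them. To see them decoded as commented pseudo-SQL, use --verbose instead (the binlog file name is an example):

```shell
mysqlbinlog --verbose --base64-output=DECODE-ROWS mysql-bin.000001
# Row events print as commented pseudo-SQL with positional column
# references, e.g. lines beginning "### INSERT INTO ... SET @1=..."
```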