MySQL – Replication – Binary log parsing (table_id generation, ON DELETE CASCADE handling, changed columns for UPDATE events)

binlog, MySQL, mysql-5.5, mysqlbinlog, replication

I am using MySQL 5.5 with InnoDB engine. I have set binlog format as row-based replication.

I have some doubts regarding replication:

1) How is the table_id generated? Is it stored somewhere?

2) Say I have Table T1(parent), T2(child), T3(child). T1 has a FK(ON DELETE CASCADE) relationship with its children: T2 & T3. If a row is deleted from T1, then will a DELETE event be generated for all T1, T2 & T3?

3) Is there any way to get info about the changed columns in case of an UPDATE event?

Best Answer

These are interesting questions with answers that are not too easy to come by. The internal structures are sometimes not very well documented or the documentation is difficult to find and more than just a little bit frustrating.

Part 1: table_id

The table_id in a row-based replication TABLE_MAP_EVENT is not a value ordinarily exposed to the user, though you'll see it in the output of mysqlbinlog with --verbose and --base64-output=decode-rows.

#120820 13:42:23 server id 10  end_log_pos 681420677    Table_map: `xifc`.`action_log` mapped to number 4025219
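To illustrate, here is a minimal Python sketch (not part of the original answer) that scrapes those Table_map lines out of decoded mysqlbinlog output; the function name and regex are my own:

```python
import re

# Matches the Table_map lines emitted by
# `mysqlbinlog --verbose --base64-output=decode-rows <binlog>`.
TABLE_MAP_RE = re.compile(
    r"Table_map: `(?P<schema>[^`]+)`\.`(?P<table>[^`]+)` "
    r"mapped to number (?P<table_id>\d+)"
)

def collect_table_maps(lines):
    """Return {table_id: (schema, table)} for every Table_map line seen."""
    mappings = {}
    for line in lines:
        m = TABLE_MAP_RE.search(line)
        if m:
            mappings[int(m.group("table_id"))] = (m.group("schema"),
                                                  m.group("table"))
    return mappings
```

Feeding it the sample line above yields `{4025219: ("xifc", "action_log")}`, which is a handy way to watch the counter move.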

This value comes from an incrementing global counter in sql/sql_base.cc. It increments every time a table is added to the cache of open tables, whose size is limited by the global variable table_open_cache... so if that value is smaller than the number of tables being periodically accessed, you should see it increment more frequently. All of the tables will have a new id in the binlog events generated after a FLUSH TABLES.

Since a table appears to be closed and reopened every time it is altered, the table_id has no choice but to increment every time a table is altered, but this does not mean that an increment in the table_id for a given table necessarily means the structure has changed. My testing shows that the table_id changes even when a column is merely renamed, with all of the data types for the table left identical.

In summary: the table_id for a table always changes when the table is altered, and sometimes changes even when it has not been altered. The counter resets if the server is restarted. The TABLE_MAP event contains the table_schema and table_name.

Part 2: ON DELETE CASCADE ...not in the binlog?

I was surprised to find that, unlike row changes caused by triggers, cascaded deletes in child tables caused by foreign key constraints don't seem to show up in the binary log. Then, I found this explanation and a moment's reflection makes it apparent why this might be true.

I'll give you a hint: it's the same reason MyISAM doesn't do foreign key constraints... namely, that MySQL itself doesn't do foreign key constraints... InnoDB does. They're handled internally by InnoDB, apparently outside the visibility of the part of the MySQL server that handles binary logging, and they rely on the slave having the same constraints defined. When the parent rows are deleted on the slave, InnoDB on the slave cascades the deletes itself, so from a replication perspective the child-row deletes aren't "needed" in the binary log.

Part 3: Row Changes in UPDATE events and identifying columns

When a row is updated, an UPDATE_ROWS_EVENT is written, containing before and after images of the row in question. (Inserts contain only an after image; deletes contain only a before image.) Prior to MySQL 5.6.2, each event contains the full before and after row image, which is what I prefer. Arguably, the space saved by logging only what replication strictly needs doesn't justify the loss of potential recovery data, but a new variable, binlog_row_image, was introduced in 5.6.2 to let you control how many columns' worth of information are stored in the log.
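Since an UPDATE_ROWS_EVENT carries both images, identifying which columns changed (question #3) reduces to diffing the two images by ordinal position. A sketch, assuming you have already decoded both images into value sequences (`changed_columns` is an illustrative helper of mine, not a MySQL API):

```python
def changed_columns(before, after):
    """Given the before and after row images of an UPDATE_ROWS_EVENT as
    sequences of column values (ordered by ordinal position within the
    table), return the ordinals of the columns whose value changed."""
    return [i for i, (b, a) in enumerate(zip(before, after)) if b != a]
```

So for a row whose second column went from "x" to "y", this returns `[1]`; remember these are ordinals, not names, for the reason explained below.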

Assuming this is set to FULL, which matches the old behavior, the row image gives you the value of every column that isn't NULL (with a null bitmap flagging the columns that are), and the TABLE_MAP event sent at the beginning of every transaction, for each of the tables that will be involved, gives you the column type definitions. From this you can extract the column values, ordered by their ordinal position within the table, but you don't get column names. They are not in the binary log.
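The null flags are stored as a bitmap, one bit per column, least-significant bit first within each byte. Assuming you've already located the bitmap bytes in the row image, decoding it might look like this (`null_columns` is my own name for the helper):

```python
def null_columns(bitmap: bytes, num_columns: int):
    """Return the set of column ordinals flagged as NULL in a row
    image's null bitmap. Bit i of the bitmap (byte i//8, bit i%8,
    least-significant bit first) corresponds to column ordinal i."""
    return {i for i in range(num_columns)
            if (bitmap[i // 8] >> (i % 8)) & 1}
```

For example, a one-byte bitmap of 0b00000101 over 8 columns marks columns 0 and 2 as NULL.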

I have been working on a project that generates live events for background processes by emulating a slave server, connecting to the master (or a downstream slave with log_slave_updates enabled), requesting a binlog stream, decoding it, and generating a JSON representation of the event to be sent into a message queue (SQS or Stomp) or topic (SNS or Stomp). I had to make an architectural decision: provide an array of the column values by ordinal position in the table and require the processes consuming the events to sort them out... or try to figure out the column names for inclusion in the event as an associative array (hash).
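A sketch of the associative-array approach, assuming the column names have already been fetched separately (the function name, field names, and message shape here are illustrative, not the project's actual format):

```python
import json

def make_row_event(event_type, schema, table, column_names, values):
    """Build a JSON message for downstream consumers, pairing column
    names (obtained out-of-band, e.g. via DESCRIBE) with the binlog
    row values, which arrive ordered by ordinal position only."""
    return json.dumps({
        "type": event_type,
        "schema": schema,
        "table": table,
        "row": dict(zip(column_names, values)),
    }, default=str)
```

The alternative, shipping a bare array of values, pushes the name-resolution burden onto every consumer of the queue.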

I decided to go with the column names, and I accomplished this by asking the server -- on a second connection that I keep open for the purpose -- to DESCRIBE table_schema.table_name every time I see a table_id that I had not seen before, and I cache it in memory until I see the same table mapped to a different id. I also clear my cache if I see a ROTATE_EVENT in the binlog, since MySQL rotates the binlog if the server crashes or shuts down, and the table_id values restart.
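That caching strategy might be sketched like this, with `describe` standing in for the DESCRIBE call made on the second connection (all names here are hypothetical):

```python
class TableDefCache:
    """Cache table definitions keyed by table_id. A definition is
    fetched via the user-supplied describe(schema, table) callable the
    first time an id is seen, and the whole cache is dropped on a
    ROTATE_EVENT, since table_id values restart after a server restart."""

    def __init__(self, describe):
        self._describe = describe
        self._cache = {}

    def columns(self, table_id, schema, table):
        # A new table_id (e.g. after ALTER TABLE or FLUSH TABLES)
        # triggers a fresh DESCRIBE; a known id reuses the cached copy.
        if table_id not in self._cache:
            self._cache[table_id] = self._describe(schema, table)
        return self._cache[table_id]

    def on_rotate(self):
        self._cache.clear()
```

Keying by table_id rather than by name is what makes the single-ALTER case below work: the old id keeps serving the old definition until a new TABLE_MAP event arrives.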

The answer here ties back in with the answer to item #1. Every row event that occurs after a table is altered shows the table under a new id, so if you fetch the table definition each time you see a new table_id, but keep the cached definition against the old id until it changes, you're pretty safe, though not precisely so: if a table is altered more than once in rapid succession and rows are changed between the alters, it's possible for you to see a table definition that is newer than the one that was active when the binlog event was generated. This becomes "more true" if your parser can't keep up with the rate at which the server is generating events.

Of course, among the binlog events you'll also see the ALTER TABLE itself replicated as a query event, but it is replicated exactly as the client entered it, so you'd have to parse, tokenize, and understand it in order to make use of it... and this condition is very difficult to actually produce and unlikely to be a major concern. In my implementation, I handle it by using the lesser of the number of columns in the binlog and the number of columns in the table definition, on the assumption that the primary key columns are the most interesting, usually at the left, and unlikely to be decoded incorrectly even if this condition is encountered. A single ALTER on an active table works perfectly, because I've already cached the definition under the table_id and am still using it to decode row events until I see the new TABLE_MAP event, which prompts me to fetch a new definition.
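That lesser-of-the-two-counts safeguard can be sketched as follows (`name_values` is an illustrative helper, not from the project):

```python
def name_values(column_names, values):
    """Pair binlog row values with cached column names, truncating to
    the lesser count so a stale definition seen during rapid ALTERs
    cannot misalign or overrun; the leftmost columns (usually the
    primary key) stay correct either way."""
    n = min(len(column_names), len(values))
    return dict(zip(column_names[:n], values[:n]))
```

If the cached definition has three columns but the row image only carries two (or vice versa), you still get a well-formed, if partial, row.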


While working on my project -- which I am using on the fringes of production and plan to release under an open source license if it's ever truly "finished" -- something I found very helpful in figuring out some of the internals (even though I don't know Ruby) was the source code to Jeremy Cole's mysql_binlog. Especially decoding of packed decimals. :(
