Monday, December 2, 2024

MySQL GTID tags and binlog events

MySQL 8.4 and newer have extended the Global Transaction ID (GTID) functionality with a new “tag” option.

Refresher on GTID

A GTID is a unique ID that is assigned to a transaction. This is used if gtid_mode is set to ON. The benefit of this is that a transaction can be uniquely identified in a MySQL replication setup with multiple levels. Among others this makes it easier to refactor a replication tree as a MySQL replica knows which transactions it has seen and can use this to find the right position to start replicating from a new source.

The format of GTIDs is documented here.

Before GTID was used replication worked based on a file and offset (e.g. file=binlog.000001,offset=4), which is unique to every server.

A GTID without tag looks like this:

896e7882-18fe-11ef-ab88-22222d34d411:1

This is in the format of <server_uuid>:<txid>. The UUID of the server is in the server_uuid global variable and the txid is the transaction id, which is an increasing number.

And for a range of transactions it looks like this:

896e7882-18fe-11ef-ab88-22222d34d411:1-12

Here there is a range of transactions, starting at 1 and ending at 12.

There can also be multiple ranges:

896e7882-18fe-11ef-ab88-22222d34d411:1-12,14-20

Here transaction 13 is missing in the range from 1 to 20.

And then there can be transactions from multiple servers:

896e7882-18fe-11ef-ab88-22222d34d411:1-12,14-20,fb0b950b-acb3-11ef-8f70-3674a2d2348e:1-10

GTID tags

If you run SET gtid_next='AUTOMATIC:abc then the next transaction will have a GTID that looks like this:

896e7882-18fe-11ef-ab88-22222d34d411:abc:1

There can be multiple tags in a GTID set and there can be multiple tags, for example:

896e7882-18fe-11ef-ab88-22222d34d411:1-3:abc:1:foo:1-3,5-7

This value is a compact way of representing this:

896e7882-18fe-11ef-ab88-22222d34d411:1
896e7882-18fe-11ef-ab88-22222d34d411:2
896e7882-18fe-11ef-ab88-22222d34d411:3
896e7882-18fe-11ef-ab88-22222d34d411:abc:1
896e7882-18fe-11ef-ab88-22222d34d411:foo:1
896e7882-18fe-11ef-ab88-22222d34d411:foo:2
896e7882-18fe-11ef-ab88-22222d34d411:foo:3
896e7882-18fe-11ef-ab88-22222d34d411:foo:5
896e7882-18fe-11ef-ab88-22222d34d411:foo:6
896e7882-18fe-11ef-ab88-22222d34d411:foo:7

So what are reason why you want to use GTID tags? One senario I can think of is heartbeats. So if you create a heartbeat table with one row and update that every second then you can use this to test replication delay and write availability.

CREATE TABLE heartbeat (
    id INT PRIMARY KEY,
    t TIMESTAMP(6)
);

-- Monitoring service updates heartbeat, if writes fail: alert, use for write availability metric
SET gtid_next='AUTOMATIC:heartbeat';
REPLACE INTO heartbeat VALUES (1, CURRENT_TIMESTAMP(6));

-- On replicas, check the difference between the heartbeat and the current timestamp to measure replicaton delay
SELECT TIMESTAMPDIFF(MICROSECOND, t, CURRENT_TIMESTAMP(6)) FROM heartbeat;

There are tools like pt-heartbeat from Percona Toolkit that can help you with this.

Why having a heartbeat tag can help here is that losing the data of a heartbeat transaction is fine, while losing a transaction from an application isn’t.

Other senarios are data imports and data archival.

Binlog events

Binlogs (short for binary logs) are logs of changes made to the database. These can be used for replication between a primary database server and replicas and be used together with backups to do point-in-time recovery. Another usecase is sending events to other systems via a Change Data Capture (CDC) tool.

Adding tags impacts a few of the types of events that are sent in the binlog stream.

PREVIOUS_GTIDS_LOG_EVENT

This event is used to send which GTIDs were in previous binlog files after switching to a new file.

With mysqlbinlog ... --read-from-remote-server --hexdump -vv $binlogfile you can see what these events look like. Here is an example:

# at 127
#241126 17:24:31 server id 1  end_log_pos 158 CRC32 0x1b54f2b2
# Position  Timestamp   Type   Source ID        Size      Source Pos    Flags
# 0000007f 3f f6 45 67   23   01 00 00 00   1f 00 00 00   9e 00 00 00   80 00
# 00000092 00 00 00 00 00 00 00 00  b2 f2 54 1b             |..........T.|
#   Previous-GTIDs
# [empty]

The first line is a header with a timestamp, type, server_id, size, position and flags. The second line ends with the CRC32 checksum. The actual body for this event is 00 00 00 00 00 00 00 00, which means that there were no previous GTIDs. If there are GTIDs it would change into 01 00 00 00 00 00 00 00, etc. and then for each GTID set it has the UUID and then one or more ranges of transactions.

When MySQL sees the first tagged GTID it changes this event and keeps using a new format until the server is restarted.

Then the events look like this:

# at 127
#241126 17:57:35 server id 1  end_log_pos 158 CRC32 0x3b12f980
# Position  Timestamp   Type   Source ID        Size      Source Pos    Flags
# 0000007f ff fd 45 67   23   01 00 00 00   1f 00 00 00   9e 00 00 00   80 00
# 00000092 01 00 00 00 00 00 00 01  80 f9 12 3b             |............|
#   Previous-GTIDs
# [empty]

Here the body is 01 00 00 00 00 00 00 01. here the last 01 means the GTID format where 00 is classic and 01 means tagged. For tagged GTIDs the number of GTIDs is encoded slightly differently. First a mask of 0xFF000000000000FF is used and then the resulting value is shifted by 1 byte. And then the GTIDs are encoded with a field for the tag. If there are multiple tags, then these are send as separate GTID sets in this event.

Here is an example:

# at 127
#241129  9:39:26 server id 1  end_log_pos 291 CRC32 0x248c7bd2
# Position  Timestamp   Type   Source ID        Size      Source Pos    Flags
# 0000007f ae 6f 49 67   23   01 00 00 00   a4 00 00 00   23 01 00 00   80 00
# 00000092 01 03 00 00 00 00 00 01  89 6e 78 82 18 fe 11 ef |.........nx.....|
# 000000a2 ab 88 22 22 2d 34 d4 11  00 01 00 00 00 00 00 00 |.....4..........|
# 000000b2 00 01 00 00 00 00 00 00  00 04 00 00 00 00 00 00 |................|
# 000000c2 00 89 6e 78 82 18 fe 11  ef ab 88 22 22 2d 34 d4 |..nx..........4.|
# 000000d2 11 08 64 65 6d 6f 01 00  00 00 00 00 00 00 01 00 |..demo..........|
# 000000e2 00 00 00 00 00 00 04 00  00 00 00 00 00 00 89 6e |...............n|
# 000000f2 78 82 18 fe 11 ef ab 88  22 22 2d 34 d4 11 0c 66 |x..........4...f|
# 00000102 6f 6f 62 61 72 01 00 00  00 00 00 00 00 01 00 00 |oobar...........|
# 00000112 00 00 00 00 00 03 00 00  00 00 00 00 00 d2 7b 8c |................|
# 00000122 24                                               |.|
#   Previous-GTIDs
# 896e7882-18fe-11ef-ab88-22222d34d411:1-3:demo:1-3:foobar:1-2

So 01 03 00 00 00 00 00 01 means it is a tagged GTID format (ends with 01). And that there are 3 GTIDs.

The first one:

# 00000092                          89 6e 78 82 18 fe 11 ef |.........nx.....|
# 000000a2 ab 88 22 22 2d 34 d4 11  00 01 00 00 00 00 00 00 |.....4..........|
# 000000b2 00 01 00 00 00 00 00 00  00 04 00 00 00 00 00 00 |................|
# 000000c2 00                                               |..nx..........4.|
Description Value Decoded
UUID 89 6e 78 82 18 fe 11 ef ab 88 22 22 2d 34 d4 11 896e7882-18fe-11ef-ab88-22222d34d411
Tag 00 01 00 00 00 00 00 00 empty
Start 01 00 00 00 00 00 00 1
End 00 04 00 00 00 00 00 00 4

The second one:

# 000000c2    89 6e 78 82 18 fe 11  ef ab 88 22 22 2d 34 d4 |..nx..........4.|
# 000000d2 11 08 64 65 6d 6f 01 00  00 00 00 00 00 00 01 00 |..demo..........|
# 000000e2 00 00 00 00 00 00 04 00  00 00 00 00 00 00       |...............n|

GTID_TAGGED_LOG_EVENT

There is a new event called GTID_TAGGED_LOG_EVENT (type 2a). I expected this to be basically the same as the GTID_LOG_EVENT (type 21), but with tags.

This is what a GTID_LOG_EVENT looks like:

# at 158
#241202 20:02:25 server id 1  end_log_pos 235 CRC32 0xd3cf038f
# Position  Timestamp   Type   Source ID        Size      Source Pos    Flags
# 0000009e 41 04 4e 67   21   01 00 00 00   4d 00 00 00   eb 00 00 00   00 00
# 000000b1 01 89 6e 78 82 18 fe 11  ef ab 88 22 22 2d 34 d4 |..nx..........4.|
# 000000c1 11 01 00 00 00 00 00 00  00 02 00 00 00 00 00 00 |................|
# 000000d1 00 00 01 00 00 00 00 00  00 00 fb 20 6c 30 4e 28 |............l0N.|
# 000000e1 06 ca f4 5f 01 00 8f 03  cf d3                   |..........|
#   GTID    last_committed=0    sequence_number=1   rbr_only=no original_committed_timestamp=1733166145216763   immediate_commit_timestamp=1733166145216763 transaction_length=202
# original_commit_timestamp=1733166145216763 (2024-12-02 20:02:25.216763 CET)
# immediate_commit_timestamp=1733166145216763 (2024-12-02 20:02:25.216763 CET)
/*!80001 SET @@session.original_commit_timestamp=1733166145216763*//*!*/;
/*!80014 SET @@session.original_server_version=90100*//*!*/;
/*!80014 SET @@session.immediate_server_version=90100*//*!*/;
SET @@SESSION.GTID_NEXT= '896e7882-18fe-11ef-ab88-22222d34d411:1'/*!*/;

This is what a GTID_TAGGED_LOG_EVENT looks like:

# at 158
#241202 19:53:13 server id 1  end_log_pos 240 CRC32 0x71f11620
# Position  Timestamp   Type   Source ID        Size      Source Pos    Flags
# 0000009e 19 02 4e 67   2a   01 00 00 00   52 00 00 00   f0 00 00 00   00 00
# 000000b1 02 76 00 00 02 02 25 02  dc f0 09 02 30 f9 03 22 |.v..........0...|
# 000000c1 bd 03 ad 02 21 02 44 44  5a 68 51 03 22 04 04 06 |......DDZhQ.....|
# 000000d1 0c 66 6f 6f 62 61 72 08  00 0a 04 0c 7f e2 74 90 |.foobar.......t.|
# 000000e1 0f 4e 28 06 10 3d 03 12  a3 ff 0a 20 16 f1 71    |.N............q|
#   GTID    last_committed=0    sequence_number=1   rbr_only=no original_committed_timestamp=1733165593949410   immediate_commit_timestamp=1733165593949410 transaction_length=207
# original_commit_timestamp=1733165593949410 (2024-12-02 19:53:13.949410 CET)
# immediate_commit_timestamp=1733165593949410 (2024-12-02 19:53:13.949410 CET)
/*!80001 SET @@session.original_commit_timestamp=1733165593949410*//*!*/;
/*!80014 SET @@session.original_server_version=90100*//*!*/;
/*!80014 SET @@session.immediate_server_version=90100*//*!*/;
SET @@SESSION.GTID_NEXT= '896e7882-18fe-11ef-ab88-22222d34d411:foobar:1'/*!*/;

The output looks almost identical, but if you look at the hexdump you can see that there are quite some differences, for example you don’t see the 89 6e 78 82 that is the start of the UUID of the server. It looks like a new interal library (mysql::serialization) is used for serialization. Someone wrote took the time to write some documentation for this new library, which is great. However with this documentation I haven’t been able to decode this event yet.

Note that replication protocol is documented here.

Impact

I have been doing some work on the MySQL protocol dissector in Wireshark. As wireshark only decodes the binlog into events and doesn’t decode the events themselve it was enough to just add the event name and code for the GTID_TAGGED_LOG_EVENT.

For go-mysql, which is a Go library that amongst other things parses the binlog stream, the work is still ongoing. The code needs to handle the new format of the PREVIOUS_GTIDS_LOG_EVENT, the new GTID_TAGGED_LOG_EVENT event and also be able to parse the text format of the tagged GTIDs.

There are many other tools that might use the binlog stream and face the same issues.

Also, using tagged GTIDs is a sure way to break replication between 8.4 (or 8.3?) and earlier versions of MySQL.

These are the bugs that are related to tagged GTID that I have filed: