Sunday, November 24, 2013

The importance of multi source replication

One of the latest labs releases of Oracle MySQL brings multi source replication. This lifts the limitation found in earlier releases that a MySQL slave can only have one master.

To be fair, there were other ways of doing this already:
There are many good uses of multi source replication. You could use it to combine data from multiple shards or applications.

If MySQL is used with a loadbalancer the most easy to build setup is a 2-way multi master. This makes it possible to use the InnoDB storage engine. Using MySQL Cluster is another alternative, but MySQL Cluster uses the NDB storage engine, and might not be a supported option for your application. A MySQL Cluster setup also needs at least 4 machines to be fully redundant and MySQL Multi Master only needs two machines.

There is little intelligence required in the loadbalancer. It should write to one server and read from both servers. If the first server is unavailable then it should write to the second one. The requirement to only write to one server has to do with the fact that replication is not fully synchronous (MySQL Cluster is synchronous, and there it is supported to write to all nodes). While the might seem like a disadvantage, it can actually be helpfull to do online schema changes.

One of the drawbacks of multi master is that multi master with more than 2 nodes can be a nightmare to maintain and will probably not help you to get more performance or reliability.

Another drawback is that it is not easy to add a disaster recovery setup to a multi master setup. If you have 2 locations with 2 servers in each location and create a multi-master setup from each pair than it's not possible to get one pair to slave from another pair as each server already has a master. You could create a multi-master on one site and then have a slave-with-a-slave in the other location, but then you'll have to change that setup during or after the site failover to get to the same setup as you had in the primary location.

With multi source replication you can now create a multimaster setup which is slave of another multi-master setup.

I did do some basic tests with the labs release for multi source replication and it looks great.

The "FOR CHANNEL='chan1'" syntax works quite well, although I would have gone for "FOR CHANNEL 'chan1'" (without the equal sign). I would have been nice if MariaDB and MySQL would have used the same syntax, but unfortunately this isn't the case. (MariaDB uses "CHANGE MASTER 'master1' TO...")

For multi source replication to work you have to set both master_info_repository and relay_log_info_repository to TABLE. I only had one of these set, and the error message was not really clear about which setting was wrong.
2013-11-23T14:37:52.972108Z 1 [ERROR] Slave: Cannot create new master info structure when  repositories are of type FILE. Convert slave  repositories  to TABLE to replicate from Multiple sources.

I build a tree-node multi-master with multi source replication. Each server was a slave of the other two servers. The advantages of this setup to a regular 3 node circular setup is that you don't need to enable log-slave-updates, which can save quite some I/O on the binlog files. Also if one node breaks then the remaining two nodes will still receive updates from each other instead of only one node receiving all updates.

If you use sharding you can use multi source replication to combine two (or  more) shards into one. This is similar to how I like to do major version upgrades: first make the new setup a slave of the old setup. This gives you a window for testing the new setup.

So multi source replication is very usefull in many setups.

2 comments:

  1. Only theoretically you can skip log-slave-updates. Monitoring multi-source replication using P_S tables requires GTID, which in turn requires log-slave-updates.

    ReplyDelete
    Replies
    1. For this test I was using "SHOW SLAVE STATUS FOR CHANNEL ='c1'", which for me seems to behave as expected. I wasn't using GTIDs. The RECEIVED_TRANSACTION_SET column in performance_schema.replication_connection_status remained empty. So this indeed P_S monitoring seems to require GTIDs.
      mysql> SELECT * FROM performance_schema.replication_connection_status\G
      *************************** 1. row ***************************
      CHANNEL_NAME: c1
      SOURCE_UUID: d65f823f-544b-11e3-a88a-f0def1356547
      THREAD_ID: 23
      SERVICE_STATE: ON
      RECEIVED_TRANSACTION_SET:
      LAST_ERROR_NUMBER: 0
      LAST_ERROR_MESSAGE:
      LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00
      *************************** 2. row ***************************
      CHANNEL_NAME: c2
      SOURCE_UUID: e0f99c98-544b-11e3-a88a-f0def1356547
      THREAD_ID: 25
      SERVICE_STATE: ON
      RECEIVED_TRANSACTION_SET:
      LAST_ERROR_NUMBER: 0
      LAST_ERROR_MESSAGE:
      LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00
      2 rows in set (0.01 sec)

      mysql> SELECT * FROM performance_schema.replication_execute_status\G
      *************************** 1. row ***************************
      CHANNEL_NAME: c1
      SERVICE_STATE: ON
      REMAINING_DELAY: NULL
      *************************** 2. row ***************************
      CHANNEL_NAME: c2
      SERVICE_STATE: ON
      REMAINING_DELAY: NULL
      2 rows in set (0.00 sec)

      mysql> SELECT USER,COMMAND,STATE FROM information_schema.processlist ORDER BY STATE\G
      *************************** 1. row ***************************
      USER: msandbox
      COMMAND: Query
      STATE: executing
      *************************** 2. row ***************************
      USER: rsandbox
      COMMAND: Binlog Dump
      STATE: Master has sent all binlog to slave; waiting for binlog to be up
      *************************** 3. row ***************************
      USER: rsandbox
      COMMAND: Binlog Dump
      STATE: Master has sent all binlog to slave; waiting for binlog to be up
      *************************** 4. row ***************************
      USER: system user
      COMMAND: Connect
      STATE: Slave has read all relay log; waiting for the slave I/O thread t
      *************************** 5. row ***************************
      USER: system user
      COMMAND: Connect
      STATE: Slave has read all relay log; waiting for the slave I/O thread t
      *************************** 6. row ***************************
      USER: system user
      COMMAND: Connect
      STATE: Waiting for master to send event
      *************************** 7. row ***************************
      USER: system user
      COMMAND: Connect
      STATE: Waiting for master to send event
      7 rows in set (0.01 sec)




      Delete