Sunday, August 31, 2014

Putting MySQL Cluster in a container

To get more familiar with Docker, and to create a test setup for MySQL Cluster, I created Docker images for the various components of MySQL Cluster (a.k.a. NDB Cluster).

At first I created a Fedora 20 container and ran all components in one container. That worked and is quite easy to set up, but that's not how one is supposed to use Docker.

So I created Dockerfiles for all components, plus one base image.

The base image:
  • Contains the MySQL Cluster software
  • Has libaio installed
  • Has a mysql user and group
  • Serves as a base for the other images
The management node (ndb_mgmd) image:
  • Has ndb_mgmd as entrypoint
  • Has a config.ini for the cluster config
  • Should be started with "--name=mymgm01" so the other containers can link to it (see the startup example below)
The data node (ndbmtd) image:
  • Has ndbmtd as entrypoint
  • Uses the connect string: "host=${MGM01_PORT_1186_TCP_ADDR}:1186"
  • Should be started with "--link mymgm01:mgm01" to allow it to connect to the management node.
  • You should create two containers of this type to form a node group of two nodes.
The API node (mysqld) image:
  • has a my.cnf
  • Runs mysqld_safe
  • Should be started with "--link mymgm01:mgm01" to allow it to connect to the management node.
  • The ndb-connectstring is passed as a parameter to mysqld_safe because it comes from an environment variable; it's not possible to use environment variables from within my.cnf. Docker is supposed to also update /etc/hosts, but that didn't work for me.
  • You should expose port 3306 for your application
The management client (ndb_mgm) image:
  • Runs ndb_mgm as entrypoint
  • Should be started with "--link mymgm01:mgm01" to allow it to connect to the management node.
  • Running ndb_mgm in a container removes the need to publish port 1186 on the management server.
  • You can override the entrypoint to run other NDB utilities like ndb_desc or ndb_select_all
The images can be found on https://registry.hub.docker.com/u/dveeden/mysqlcluster72/
The Dockerfiles can be found on https://github.com/dveeden/dve-docker  
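
Putting it all together, starting a complete cluster could look like this. This is a minimal sketch: the image tags (mgmd, ndbmtd, mysqld, mgm) and the container names other than mymgm01 are assumptions, so check the repositories above for the actual names.

# the management node first, so the other containers can link to it
docker run -d --name=mymgm01 dveeden/mysqlcluster72:mgmd
# two data nodes, forming one node group
docker run -d --name=myndb01 --link mymgm01:mgm01 dveeden/mysqlcluster72:ndbmtd
docker run -d --name=myndb02 --link mymgm01:mgm01 dveeden/mysqlcluster72:ndbmtd
# the API node, publishing port 3306 for applications
docker run -d --name=mysql01 --link mymgm01:mgm01 -p 3306:3306 dveeden/mysqlcluster72:mysqld
# the management client, interactively
docker run -it --rm --link mymgm01:mgm01 dveeden/mysqlcluster72:mgm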

Possible improvements
  • Use hostnames in the config.ini instead of IPv4 addresses. This makes it more dynamic, but it means updating /etc/hosts or fiddling with DNS.
  • Using VOLUMES in the Dockerfiles to make working with data easier.

Sunday, August 17, 2014

The new cloud backup option of MySQL Enterprise Backup

MySQL Enterprise Backup 3.10 supports backups to the cloud. The only supported cloud service is Amazon S3.

When the cloud destination is used, mysqlbackup uploads the backup as a single image file.

You can specify all options on the command line:
mysqlbackup --cloud-service=s3 --cloud-aws-region=eu-west-1 \
--cloud-access-key-id=AKIAJLGCPXEGVHCQD27B \
--cloud-secret-access-key=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 \
--cloud-bucket=meb_myserver --cloud-object-key=firstbackup --cloud-trace=0 \
--backup-dir=/tmp/firstbackup --backup-image=- --with-timestamp backup-to-image

But you can also put the settings in the my.cnf:
[mysqlbackup_cloud]
cloud-service=s3
cloud-aws-region=eu-west-1
cloud-access-key-id=AKIAJLGCPXEGVHCQD27B
cloud-secret-access-key=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
cloud-bucket=meb_myserver
cloud-trace=0
backup-dir=/data/cloudbackup
backup-image=-
with-timestamp

The with-timestamp option is important, as the backup won't start if the backup-dir already exists. This is because mysqlbackup leaves the backup directory in place after uploading the backup. The backup directory will only hold meta information and the log file, not the actual backup.

By using a group suffix like _cloud you can put settings for multiple types of backups in one cnf file.

mysqlbackup --defaults-group-suffix='_cloud' \
--cloud-object-key=backup_2014081701 backup-to-image
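
Restoring works in the opposite direction: with the same cloud options and --backup-image=-, mysqlbackup reads the image back from S3. A minimal sketch, assuming the single-step copy-back-and-apply-log operation and hypothetical paths; check the MEB manual for the exact restore syntax of your version:

mysqlbackup --defaults-group-suffix='_cloud' \
--cloud-object-key=backup_2014081701 \
--datadir=/var/lib/mysql --backup-dir=/tmp/cloudrestore copy-back-and-apply-log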

The account you're using should have this policy to be allowed to read and write to the S3 bucket:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Stmt1408302840000",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket", 
        "s3:ListBucketMultipartUploads",
        "s3:PutObject"
      ],
      "Resource": [
        "arn:aws:s3:::meb_myserver/*"
      ]
    }
  ]
}

This looks like a good option to me if you're already using mysqlbackup and Amazon. It would be nice if the next version supported other cloud providers (e.g. OpenStack Swift, Ceph). Implementing this should be easy for those with an S3 compatibility layer, but will probably take more time for others.

I did find some bugs (just search for tag=cloud on http://bugs.mysql.com if you're interested).

Thursday, August 7, 2014

MySQL User Group Meetup in Amsterdam

This Tuesday Markus Winand will talk at the MySQL User Group NL meetup about "Indexes: The neglected performance all-rounder".
Markus is known for the http://use-the-index-luke.com website and the SQL Performance Explained book.

Date: Tuesday August 12
Location: Marktplaats/eBay Office Amsterdam
Registration: http://www.meetup.com/MySQL-User-Group-NL/events/196440532/

Sunday, July 20, 2014

Decoding (encrypted) MySQL traffic with Wireshark

In a comment on my post about Using SSL with MySQL xiaochong zhang asked if it is possible to decode SSL/TLS encrypted MySQL traffic. The short answer is: It depends.

To test this we need a MySQL server which is SSL enabled. I used MySQL Sandbox to create a sandboxed 5.6.19 server. Then I used mysslgen to create the config and the certificates.

$ make_sandbox 5.6.19
$ ./mysslgen.py --config=sandboxes/msb_5_6_19/my.sandbox.cnf --ssldir=sandboxes/msb_5_6_19/ssl

This assumes there already is an extracted tarball of MySQL 5.6.19 in ~/mysql/5.6.19.

The mysslgen.py script will print the changes you should make in the mysqld and client sections of the my.sandbox.cnf file. Then restart the server to make them active.

For SSL to work we need to connect using TCP/IP instead of over a UNIX socket. So we connect with "./my sql -h 127.0.0.1". Now execute "\s" or "status" to see if we're indeed using SSL.

It probably looks like this:

mysql [127.0.0.1] {msandbox} ((none)) > \s
--------------
/home/dveeden/opt/mysql/5.6.19/bin/mysql  Ver 14.14 Distrib 5.6.19, for linux-glibc2.5 (x86_64) using  EditLine wrapper

Connection id:  3
Current database: 
Current user:  msandbox@localhost
SSL:   Cipher in use is DHE-RSA-AES256-SHA
Current pager:  stdout
Using outfile:  ''
Using delimiter: ;
Server version:  5.6.19 MySQL Community Server (GPL)
Protocol version: 10
Connection:  127.0.0.1 via TCP/IP
Server characterset: latin1
Db     characterset: latin1
Client characterset: utf8
Conn.  characterset: utf8
TCP port:  5619
Uptime:   1 hour 32 min 48 sec

Threads: 1  Questions: 18  Slow queries: 0  Opens: 67  Flush tables: 1  Open tables: 60  Queries per second avg: 0.003
--------------

Now disconnect and start the trace.

sudo tcpdump -i lo -s 65535 port 5619 -w /tmp/mysql.pcap

First connect without SSL: "./my sql -h 127.0.0.1 --skip-ssl", and then with SSL: "./my sql -h 127.0.0.1".
Stop the tcpdump session, then start Wireshark and open the mysql.pcap file. Now we can inspect the protocol. If MySQL were using the default port (3306) Wireshark would automatically decode the traffic, but since we're on port 5619 we have to use 'Decode as...' to tell Wireshark this is MySQL traffic. The server greeting packet and the login request should now be visible. In the login request there are client capability flags; one of these flags indicates 'Switch to SSL after handshake' and should be set for the SSL session.
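
If you prefer the command line, tshark can do the same as 'Decode as...' with its -d option. A sketch:

# decode traffic on port 5619 as the MySQL protocol
tshark -r /tmp/mysql.pcap -d tcp.port==5619,mysql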



Both SSL and non-SSL sessions will use the same port and start an unencrypted session. The encrypted session will switch to SSL after the handshake. This is a bit like STARTTLS for IMAP. The current version of the MySQL protocol dissector is not (yet) aware of some of the new information in the initial handshake. So the information for the authentication plugins and connection attributes is not decoded yet. The documentation about the protocol can be found in the MySQL Internals manual.

So that's the plaintext part. Now we get to the SSL part. In my setup the default cipher suite used for SSL is DHE-RSA-AES256-SHA. With OpenSSL's ciphers command we can get some more details:

$ openssl ciphers -v 'DHE-RSA-AES256-SHA'
DHE-RSA-AES256-SHA      SSLv3 Kx=DH       Au=RSA  Enc=AES(256)  Mac=SHA1

This means that SHA-1 is used for the MAC, AES-256 is used for encryption, and the key exchange is done with DH (Diffie-Hellman). This poses a problem: DH generates a session key, and we don't have that key in the traffic dump as it's never sent over the network. We could use gdb (and maybe a debug trace?) to get the DH keys out, but for now there is an easier solution: use a different cipher suite.

So start tcpdump again and run "./my sql -h 127.0.0.1 --ssl-cipher=AES256-SHA". This cipher uses RSA for the key exchange instead of DH, which means everything we need is either sent over the network or present in the SSL certificate and/or key.

Now start Wireshark again, use 'Decode as...' and choose SSL. Then go to Edit→Preferences→Protocols→SSL→'RSA keys list' and add the server's SSL key. Now you should see the decoded traffic.
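
This can be scripted with tshark as well. A sketch, assuming the ssl.keys_list preference format of the Wireshark 1.x series and a hypothetical path for the server key:

# decode port 5619 as SSL, decrypt with the server's RSA key,
# and dissect the decrypted payload as MySQL
tshark -r /tmp/mysql.pcap -d tcp.port==5619,ssl \
  -o "ssl.keys_list:127.0.0.1,5619,mysql,/path/to/server-key.pem"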


So decoding SSL/TLS encrypted MySQL traffic is possible. But you need to have:
  • All traffic since the beginning of the TCP/IP connection
  • The server's SSL key
  • The DH session key if DH is used (you might want to read about Forward secrecy (PFS) if you're interested in the details).

Wednesday, July 16, 2014

Oracle Critical Patch Update for MySQL

Oracle has released the Critical Patch Update (CPU) for July 2014. The Oracle MySQL Risk Matrix lists 10 security fixes.

It took me some time to understand the subcomponent names. So here is the list with the full name of each subcomponent:

Subcomponent   Full name
SRFTS          Server: Full Text Search
SRINFOSC       Server: INFORMATION_SCHEMA
SRCHAR         Server: Character sets
ENARC          Engine: Archive
SROPTZR        Server: Optimizer
SRREP          Server: Replication
SRSP           Server: Stored Procedure
ENFED          Engine: Federated

I don't think there is anything really important in the list, but it might be a good trigger to update to the latest release.

Upgrading should be easy especially if you're using the APT or YUM repositories from Oracle. If you're upgrading to a new major release (e.g. from 5.5 to 5.6) then you should read the instructions in the reference manual.

Sunday, July 6, 2014

The MySQL 6.0 goodybag

After MySQL 5.1 was released, work started on MySQL 5.2, which was later renamed to MySQL 6.0. Many new features were introduced in 6.0, but stabilizing that branch became a huge task. Eventually the decision was made to start off from a stable branch and backport the new features from 6.0. This is how many of the 6.0 features landed in 5.5 and 5.6.

So let's see which features 6.0 brought and where they landed. I'll use the What Is New in MySQL 6.0 section of the MySQL 6.0 Reference Manual for this.

  • The Falcon storage engine. This never landed anywhere as far as I know. It's not even included in the list of storage engines in the MariaDB knowledge base. As both InnoDB and MySQL are now part of Oracle, I don't see any reason for Falcon to exist anymore.
  • 4-byte utf8 and support for utf16 and utf32. This is included in MySQL 5.5 together with many other Unicode enhancements.
  • Database backup with SQL. This allows you to make backups by executing 'BACKUP DATABASE' SQL statements. This has not landed anywhere as far as I know, but some of the code might have made it into MySQL Enterprise Backup (both use backup_history and backup_progress tables in the mysql database). This might be an interesting thing to have, but with MEB there is no real need for it.
  • Subquery enhancements, BKA and MRR. This all made it into MySQL 5.6.
  • LOAD XML. This made it into MySQL 5.5, but I don't think it is used often.
  • The Maria storage engine. This is a transactional version of the MyISAM storage engine. It is not used in Oracle MySQL. It is included in MariaDB, but renamed to Aria as the name might otherwise cause confusion. As far as I know there is not much development being done on Aria.
  • Foreign Keys. Of course MySQL supports foreign keys (now even with NDB!), but those are implemented in the storage engine, not in the server. This was on the roadmap for 6.x but I don't know if it was actually implemented. Implementing this in the server could make writing storage engines easier and would probably also make it easier to combine partitioning and foreign keys.

Did I forget any 6.x features? Did you ever try 6.x? Let me know in the comments!



Wednesday, July 2, 2014

Single database backup and restore with MEB

I was recently asked whether MySQL Enterprise Backup is able to restore single databases.

My initial answer was that this is complicated, but might be doable with the Transportable Tablespace (TTS) option.

But first let's go back to the basics. A common way of working with mysqldump is to get a list of databases and then loop through them, dumping the data and schema of each to a SQL file. But both backups and restores take a lot of time as the database grows, and it's a lukewarm backup at best instead of a hot backup. That's why we have MySQL Enterprise Backup.
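
Such a mysqldump loop typically looks something like this (a sketch; the backup path is hypothetical and the credentials are assumed to come from ~/.my.cnf):

# dump each database to its own SQL file, skipping the system schemas
for db in $(mysql -N -B -e 'SHOW DATABASES' \
            | grep -Ev '^(information_schema|performance_schema|mysql)$'); do
  mysqldump --single-transaction "$db" > "/backups/$db.sql"
done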

MySQL Enterprise Backup allows you to make a hot backup of InnoDB tables by copying the datafiles while watching the InnoDB redo log files.

On disk the data of the InnoDB storage engine consists of a system tablespace (one or more ibdataX files), the redo log files (ib_logfileX) and zero or more table-specific tablespace files (*.ibd).

The data dictionary is located in the system tablespace; this is where the tablespace ID is stored for each .ibd file. The redo log files are where changes are written before they are written to the data files.

This all works fine if you use MEB to backup and restore a whole instance.

Even if you stop MySQL you can't just copy an .ibd file to another server and expect it to work: there might still be changes for that file in the redo log, and the tablespace ID in the system tablespace might not match.

But you can copy an .ibd file to another server if you follow the right procedure. This (partly) works with 5.5, and is greatly enhanced in 5.6.

This is how it works: a FLUSH TABLES ... FOR EXPORT statement is issued for certain tables, which makes those tables read-only. The changes in the redo log are written to the .ibd file, and the information from the system tablespace is written to a .cfg file. Then the .ibd and .cfg files can be copied to some other location, after which the tables can be unlocked.
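
In SQL the procedure looks roughly like this (a sketch with a hypothetical table mydb.t1; on the destination the table definition must already exist):

-- on the source server: make the table read-only and write the .cfg file
FLUSH TABLES mydb.t1 FOR EXPORT;
-- copy t1.ibd and t1.cfg out of the source datadir, then release the lock
UNLOCK TABLES;

-- on the destination server: swap in the copied tablespace
ALTER TABLE mydb.t1 DISCARD TABLESPACE;
-- place t1.ibd and t1.cfg in the destination datadir, then:
ALTER TABLE mydb.t1 IMPORT TABLESPACE;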

You could create a per-database backup with MEB by using the --include-tables option. It will then copy the specified tables' .ibd files, plus the system tablespace and the redo logs.

This works, but the system tablespace can get big for many reasons. This is where TTS comes into play: it allows you to back up only the tables, without copying the system tablespace.

With the --use-tts and --include-tables options I can back up all tables of one database and then restore one or more of those tables on another instance, without stopping the destination instance.
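
A sketch of what that could look like for a hypothetical database mydb (--include-tables takes a regular expression matched against database.table names):

# hot backup of all mydb tables using transportable tablespaces
mysqlbackup --backup-dir=/backups/mydb --with-timestamp \
  --use-tts --include-tables='^mydb\..*' backup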

Some per-database defaults like the default character set are stored in the db.opt file, but this file is not copied. So be aware!

But what if we want to do a point-in-time recovery for just one database? That should work: first restore the database you're concerned about, then use mysqlbinlog with the --database option to replay only the binlog entries for that specific database.
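
A sketch, with hypothetical binlog names and start position:

# replay only mydb's events, starting from the position where the backup ended
mysqlbinlog --database=mydb --start-position=107 \
  binlog.000042 binlog.000043 | mysql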

I haven't used per-database (point-in-time) restores with MEB in production, but it all seems to work fine in a test setup.

I've always used physical backups on a per-instance basis and mysqldump for single-table and single-database backups. But this might be faster, and the data only needs to be backed up once. Also, the mysqldumps I took were not prepared for point-in-time restores, as the locking required for that would have blocked the server for too long.

Please let me know your experiences in the comments.