Tuesday, December 31, 2013

Looking back on 2013

The year is almost at its end. Looking back at the past year I think it was a good year for MySQL. The 5.6 release was released in February and has been proven to be a very good release. There were many folks who reported bugs, which is good as this means they care about the product. But MySQL is not only a product, its a large ecosystem. One of the big players in the ecosystem is Oracle and this year they really participated in the MySQL community by having their engineers attend conferences which were not organized by Oracle (Like Percona Live and FOSDEM).
 
This year I couldn't attend the MySQL Connect and Percona Live conferences, but I hope to be able to attend in 2014 again. I did attend FOSDEM, which is a really nice (and different) conference.

For MariaDB it also was an interesting year as a number of  Linux distributions and customers switched from MySQL to MariaDB (and sometimes back again). I wonder what 2014 will bring for MariaDB.

The TokuDB storage engine was released as opensource this year.

The Galera replication system did get quite some attention this year and saw a 3.0 release.

I think 2014 is going to be even better for MySQL as the 5.7 release already looks great and Fabric looks promising.

Have a safe an fun new years eve! I'm going to eat an oliebol!

Sunday, November 24, 2013

The importance of multi source replication

One of the latest labs releases of Oracle MySQL brings multi source replication. This lifts the limitation found in earlier releases that a MySQL slave can only have one master.

To be fair, there were other ways of doing this already:
There are many good uses of multi source replication. You could use it to combine data from multiple shards or applications.

If MySQL is used with a loadbalancer the most easy to build setup is a 2-way multi master. This makes it possible to use the InnoDB storage engine. Using MySQL Cluster is another alternative, but MySQL Cluster uses the NDB storage engine, and might not be a supported option for your application. A MySQL Cluster setup also needs at least 4 machines to be fully redundant and MySQL Multi Master only needs two machines.

There is little intelligence required in the loadbalancer. It should write to one server and read from both servers. If the first server is unavailable then it should write to the second one. The requirement to only write to one server has to do with the fact that replication is not fully synchronous (MySQL Cluster is synchronous, and there it is supported to write to all nodes). While the might seem like a disadvantage, it can actually be helpfull to do online schema changes.

One of the drawbacks of multi master is that multi master with more than 2 nodes can be a nightmare to maintain and will probably not help you to get more performance or reliability.

Another drawback is that it is not easy to add a disaster recovery setup to a multi master setup. If you have 2 locations with 2 servers in each location and create a multi-master setup from each pair than it's not possible to get one pair to slave from another pair as each server already has a master. You could create a multi-master on one site and then have a slave-with-a-slave in the other location, but then you'll have to change that setup during or after the site failover to get to the same setup as you had in the primary location.

With multi source replication you can now create a multimaster setup which is slave of another multi-master setup.

I did do some basic tests with the labs release for multi source replication and it looks great.

The "FOR CHANNEL='chan1'" syntax works quite well, although I would have gone for "FOR CHANNEL 'chan1'" (without the equal sign). I would have been nice if MariaDB and MySQL would have used the same syntax, but unfortunately this isn't the case. (MariaDB uses "CHANGE MASTER 'master1' TO...")

For multi source replication to work you have to set both master_info_repository and relay_log_info_repository to TABLE. I only had one of these set, and the error message was not really clear about which setting was wrong.
2013-11-23T14:37:52.972108Z 1 [ERROR] Slave: Cannot create new master info structure when  repositories are of type FILE. Convert slave  repositories  to TABLE to replicate from Multiple sources.

I build a tree-node multi-master with multi source replication. Each server was a slave of the other two servers. The advantages of this setup to a regular 3 node circular setup is that you don't need to enable log-slave-updates, which can save quite some I/O on the binlog files. Also if one node breaks then the remaining two nodes will still receive updates from each other instead of only one node receiving all updates.

If you use sharding you can use multi source replication to combine two (or  more) shards into one. This is similar to how I like to do major version upgrades: first make the new setup a slave of the old setup. This gives you a window for testing the new setup.

So multi source replication is very usefull in many setups.

Saturday, November 9, 2013

MariaDB's RETURNING feature.

There is a new feature in the MariaDB 10 Beta which caught my eye: support for returning a result set on delete.

With a 'regular' DELETE operation you only get to know the number of affected rows. To get more info or actions you have to use a trigger or a foreign key. Anoter posibility is doing a SELECT and then a DELETE and with the correct transaction isolation a transactional support this will work.

With the support for the RETURNING keyword this has become easier to do and it will probably bennefit performance and save you a few roundtrips and a few lines of code.

There is already support for RETURNING in PostgreSQL. And PostgreSQL has an other nifty feature for which RETURNING really helps: CTE or common table expressions or the WITH keyword. I really hope to see CTE support in MySQL or MariaDB some day.

An example from RETURNING and CTE in PostgreSQL:
demo=# select * from t1;
 id | name  
----+-------
  1 | test1
  2 | test2
  3 | test3
  4 | test1
  5 | test2
  6 | test3
(6 rows)

demo=# WITH del_res AS (DELETE FROM t1 RETURNING id) 
demo-# SELECT CONCAT('Removed ID ',id) info FROM del_res;
     info     
--------------
 Removed ID 1
 Removed ID 2
 Removed ID 3
 Removed ID 4
 Removed ID 5
 Removed ID 6
(6 rows)

demo=# 

So my conclusion: Returning a resultset for DELETE is helpfull, and is one step in the direction of CTE support.

The next step step is to get the RETURNING keyword to work for UPDATE.

Sunday, October 27, 2013

Using the PAM authentication plugin

The procedure for using the PAM authentication plugin as documented doesn't work flawlessly on Ubuntu.

So here is how it works on Ubuntu (and probably also on other Debian based systems).

Please note that the PAM authentication plugin is an enterprise feature.

1. Make sure the plugin is loaded

This can be done by adding the following to the mysqld section of my.cnf (Don't forget to restart). You could also use INSTALL PLUGIN to load it without restart.
plugin-load=authentication_pam.so
2.  Add a user which will use the plugin
mysql> CREATE USER 'dveeden'@'localhost' IDENTIFIED WITH authentication_pam;
Query OK, 0 rows affected (0.00 sec)
3. Add a pam config file for 'mysql':
Create /etc/pam.d/mysql with the following contents:
@include common-auth
@include common-account
@include common-session-noninteractive
4. Login with the user
mysql -p --enable-cleartext-plugin
5. Verify if you're really connected as the correct user.
mysql> select user(),current_user(),@@proxy_user;
+-------------------+-------------------+--------------+
| user()            | current_user()    | @@proxy_user |
+-------------------+-------------------+--------------+
| dveeden@localhost | dveeden@localhost | NULL         |
+-------------------+-------------------+--------------+
1 row in set (0.00 sec)
If some step doesn't work then the /var/log/auth.log file can be very helpfull.
 
The plugin has many more options. It allows you to proxy users and use different PAM configurations for different users. The plugin is used on a per user basis so you could use native authentication for your application users and PAM authentication with LDAP for administrators and/or developers.
 
Please note that SHOW GRANTS does not indicate a authentication plugin and blindly copying the grant statements to another server to copy the user might result in users without password.
 
MySQL Utilities:
$ mysqluserclone  --source=usr:pwd@srv -d 'dveeden'@'localhost'
# Source on 127.0.0.1: ... connected.
# Dumping grants for user dveeden@localhost
GRANT USAGE ON *.* TO 'dveeden'@'localhost'
Percona Toolkit:
$ pt-show-grants | grep dveeden
-- Grants for 'dveeden'@'localhost'
GRANT USAGE ON *.* TO 'dveeden'@'localhost';

time for standards 2

I was a bit wrong in my previous post. MySQL 5.6 does allow you to supply a fsp with CURRENT_TIMESTAMP (thanks Roy).

mysql> SELECT CURRENT_TIMESTAMP,CURRENT_TIMESTAMP(6);
+---------------------+----------------------------+
| CURRENT_TIMESTAMP   | CURRENT_TIMESTAMP(6)       |
+---------------------+----------------------------+
| 2013-10-27 10:38:59 | 2013-10-27 10:38:59.182530 |
+---------------------+----------------------------+
1 row in set (0.00 sec)

It however feels a bit weird to me as the CURRENT_TIMESTAMP is often used without () and doesn't look like a function. So when I tried to use a CURRENT_TIMESTAMP with a fsp of 6 it was not behaving how I expected it to be:
mysql> CREATE TABLE t1 (ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP(6));
ERROR 1067 (42000): Invalid default value for 'ts'
mysql> CREATE TABLE t1 (ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP(0));
Query OK, 0 rows affected (0.01 sec)

mysql> INSERT INTO t1 VALUES(CURRENT_TIMESTAMP(6));
Query OK, 1 row affected (0.00 sec)

mysql> SELECT * FROM t1;
+---------------------+
| ts                  |
+---------------------+
| 2013-10-27 10:42:30 |
+---------------------+
1 row in set (0.00 sec)

So it didn't allow me to use a default of CURRENT_TIMESTAMP(6). It however accepted a CURRENT_TIMESTAMP with a fsp of 6, and then threw away the microseconds without any warning.
After some more investigating it turned out that there is a correct way of doing this:
mysql> CREATE TABLE t1 (ts TIMESTAMP(6) DEFAULT CURRENT_TIMESTAMP);
ERROR 1067 (42000): Invalid default value for 'ts'
mysql> CREATE TABLE t1 (ts TIMESTAMP(6) DEFAULT CURRENT_TIMESTAMP(6));
Query OK, 0 rows affected (0.02 sec)

mysql> INSERT INTO t1 VALUES(NULL);
Query OK, 1 row affected (0.00 sec)

mysql> SELECT * FROM t1;
+----------------------------+
| ts                         |
+----------------------------+
| 2013-10-27 10:47:01.604891 |
+----------------------------+
1 row in set (0.00 sec)

So you must specify a fsp for the column time AND the default value, then it works.

Saturday, October 26, 2013

time for standards

MySQL 5.6 includes support for microsecode timestamp resolution, which is a great new feature.

To get the current timestamp in MySQL 5.5 you could use NOW(), SYSDATE() or CURRENT_TIMESTAMP.

mysql_5.5> SELECT NOW(),SYSDATE(),CURRENT_TIMESTAMP;
+---------------------+---------------------+---------------------+
| NOW()               | SYSDATE()           | CURRENT_TIMESTAMP   |
+---------------------+---------------------+---------------------+
| 2013-10-26 15:46:24 | 2013-10-26 15:46:24 | 2013-10-26 15:46:24 |
+---------------------+---------------------+---------------------+
1 row in set (0.01 sec)

If we run the same statement in MySQL 5.6 the output is the same. This is great for compatibility, but what if we want those microsecond timestamps?

mysql_5.6> SELECT NOW(),SYSDATE(),CURRENT_TIMESTAMP;
+---------------------+---------------------+---------------------+
| NOW()               | SYSDATE()           | CURRENT_TIMESTAMP   |
+---------------------+---------------------+---------------------+
| 2013-10-26 15:47:21 | 2013-10-26 15:47:21 | 2013-10-26 15:47:21 |
+---------------------+---------------------+---------------------+
1 row in set (0.00 sec)

For the microsecond timestamps we have to specify the fsp or fractional seconds precision, which is an integer between 0 and 6.

mysql_5.6> SELECT NOW(6),SYSDATE(6),CURRENT_TIMESTAMP;
+----------------------------+----------------------------+---------------------+
| NOW(6)                     | SYSDATE(6)                 | CURRENT_TIMESTAMP   |
+----------------------------+----------------------------+---------------------+
| 2013-10-26 15:50:12.378787 | 2013-10-26 15:50:12.378892 | 2013-10-26 15:50:12 |
+----------------------------+----------------------------+---------------------+
1 row in set (0.00 sec)

Please note that you can't specify a fsp for CURRENT_TIMESTAMP.
So how do other databases behave?

PostgreSQL:
dveeden=# SELECT NOW(),CURRENT_TIMESTAMP;
              now              |              now              
-------------------------------+-------------------------------
 2013-10-26 15:55:11.548362+02 | 2013-10-26 15:55:11.548362+02
(1 row)

There is no SYSDATE() function in PostgreSQL (tested with 9.1). And you may not specify a fsp. And you get microseconds by default.

SQLite:
sqlite> select current_timestamp;
2013-10-26 13:57:57
sqlite> select strftime("%Y-%m-%d %H:%M:%f", "now"); 
2013-10-26 13:59:42.408

Version 3.7 doesn't have sysdate() or now(), only current_timestamp and no microseconds by default.

So it seems to be hard to write version and implementation tolerant SQL code. I couldn't easily find any information about what the SQL standards dictate.

There is one trick which could help in some situation:
mysql_5.5> SELECT NOW(/*!50604 6*/);
+---------------------+
| NOW()               |
+---------------------+
| 2013-10-26 16:04:04 |
+---------------------+
1 row in set (0.00 sec)
 
mysql_5.6> SELECT NOW(/*!50604 6*/);
+----------------------------+
| NOW( 6 )                   |
+----------------------------+
| 2013-10-26 16:03:37.136133 |
+----------------------------+
1 row in set (0.01 sec)

Another thrick you might think of is changing the date_time_format and time_format.

mysql_5.6> show global variables like '%time_format';
+-----------------+-------------------+
| Variable_name   | Value             |
+-----------------+-------------------+
| datetime_format | %Y-%m-%d %H:%i:%s |
| time_format     | %H:%i:%s          |
+-----------------+-------------------+
2 rows in set (0.00 sec)

But that won't work as the documentation points out:
"This variable is unused. It is deprecated as of MySQL 5.6.7 and will be removed in a future MySQL release."

Persistent statistics and partitions

Today when I was studying for the MySQL 5.6 exams.

I was studying for these two items:
  • Create and utilize table partitioning
  • Obtain MySQL metadata from INFORMATION_SCHEMA tables

The first step is to create a table, partition it with a hash.
mysql> CREATE TABLE pfoo (id INT AUTO_INCREMENT PRIMARY KEY, name VARCHAR(255))
    -> PARTITION BY HASH(id) PARTITIONS 4;
Query OK, 0 rows affected (0.04 sec)

mysql> INSERT INTO pfoo(name) VALUES('test01'),('test02'),('test03'),('test04'),
    -> ('test05'),('test06'),('test07'),('test08'),('test09'),('test10'),('test11');
Query OK, 11 rows affected (0.00 sec)
Records: 11  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM pfoo;
+----+--------+
| id | name   |
+----+--------+
|  4 | test04 |
|  8 | test08 |
|  1 | test01 |
|  5 | test05 |
|  9 | test09 |
|  2 | test02 |
|  6 | test06 |
| 10 | test10 |
|  3 | test03 |
|  7 | test07 |
| 11 | test11 |
+----+--------+
11 rows in set (0.00 sec)

mysql> SELECT PARTITION_NAME, TABLE_ROWS FROM information_schema.partitions 
    -> WHERE TABLE_NAME='pfoo';
+----------------+------------+
| PARTITION_NAME | TABLE_ROWS |
+----------------+------------+
| p0             |          2 |
| p1             |          3 |
| p2             |          3 |
| p3             |          3 |
+----------------+------------+
4 rows in set (0.01 sec)

The sequence in the id column looks random, but it isn't. It's in partition order.
mysql> SELECT id,name,MOD(id,4) FROM pfoo;
+----+--------+-----------+
| id | name   | MOD(id,4) |
+----+--------+-----------+
|  4 | test04 |         0 |
|  8 | test08 |         0 |
|  1 | test01 |         1 |
|  5 | test05 |         1 |
|  9 | test09 |         1 |
|  2 | test02 |         2 |
|  6 | test06 |         2 |
| 10 | test10 |         2 |
|  3 | test03 |         3 |
|  7 | test07 |         3 |
| 11 | test11 |         3 |
+----+--------+-----------+
11 rows in set (0.00 sec)
So nothing new or unexpected here.
So now we're going to change the number of partitions.
mysql> ALTER TABLE pfoo PARTITION BY HASH(id) PARTITIONS 6;
Query OK, 11 rows affected (0.32 sec)
Records: 11  Duplicates: 0  Warnings: 0

mysql> SELECT PARTITION_NAME, TABLE_ROWS FROM information_schema.partitions 
    -> WHERE TABLE_NAME='pfoo';
+----------------+------------+
| PARTITION_NAME | TABLE_ROWS |
+----------------+------------+
| p0             |          0 |
| p1             |          0 |
| p2             |          2 |
| p3             |          0 |
| p4             |          0 |
| p5             |          0 |
+----------------+------------+
6 rows in set (0.01 sec)

mysql> SELECT COUNT(*) FROM pfoo;
+----------+
| COUNT(*) |
+----------+
|       11 |
+----------+
1 row in set (0.00 sec)

So we've changed the number of partitions from 4 to 6. There are still 11 rows, but the information in information_schema.partitions seems to be wrong.
This is because innodb_stats_persistent is enabled.
mysql> SELECT * FROM mysql.innodb_table_stats WHERE table_name LIKE 'pfoo#%';
+---------------+------------+---------------------+--------+----------------------+--------------------------+
| database_name | table_name | last_update         | n_rows | clustered_index_size | sum_of_other_index_sizes |
+---------------+------------+---------------------+--------+----------------------+--------------------------+
| cert56        | pfoo#P#p0  | 2013-10-26 12:12:28 |      0 |                    1 |                        0 |
| cert56        | pfoo#P#p1  | 2013-10-26 12:12:28 |      0 |                    1 |                        0 |
| cert56        | pfoo#P#p2  | 2013-10-26 12:12:28 |      2 |                    1 |                        0 |
| cert56        | pfoo#P#p3  | 2013-10-26 12:12:28 |      0 |                    1 |                        0 |
| cert56        | pfoo#P#p4  | 2013-10-26 12:12:28 |      0 |                    1 |                        0 |
| cert56        | pfoo#P#p5  | 2013-10-26 12:12:28 |      0 |                    1 |                        0 |
+---------------+------------+---------------------+--------+----------------------+--------------------------+
6 rows in set (0.00 sec)

mysql> SELECT * FROM information_schema.global_variables 
    -> WHERE variable_name='innodb_stats_persistent';
+-------------------------+----------------+
| VARIABLE_NAME           | VARIABLE_VALUE |
+-------------------------+----------------+
| INNODB_STATS_PERSISTENT | ON             |
+-------------------------+----------------+
1 row in set (0.01 sec)
The statistics can be updated by running 'ANALYZE TABLE pfoo'. Or by just waiting a few seconds as InnoDB will update the statistics automatically if innodb_stats_auto_recalc is enabled, but this is not instantaneous.

Saturday, October 19, 2013

MySQL Certification Study: Write a basic .NET application that uses MySQL

I've registered for the MySQL 5.6 beta exams, just like many more on Planet MySQL have done.

One of the topics for the MySQL 5.6 Developer exam is: "Write a basic .NET application that uses MySQL".

I have used Connector/Net a few times the last few years, but the last time I looked at some code involving Connector/Net was when I took the MySQL 5 Developer exam, which was long ago.

I'm using Linux on both my laptop and on servers for most projects. So I don't use or see many .NET code.

So I need a short refresh.

To use Connector/Net on Linux you will need to install Mono. The base for .NET is CIL, which exists in at least 4 versions: 1.0, 2.0 and 4.0. To compile C# code with mono you need the command which correspondends with the CIL version you are using.
  • CIL 1.0: mcs
  • CIL 2.0: gmcs
  • CIL 4.0: dmcs
For Ubuntu you'll need these packages:
  • libmono-system-data4.0-cil
  • libmysql-cil-dev
  • mono-runtime
  • mono-dmcs
  • pkg-config
So now we should be able to comple some C# code:
dmcs -r:System.Data.dll $(pkg-config --libs mysql-connector-net) example1.cs


And run it:
mono example1.exe


The Connector/Net Tutorial Intro on mysql.com has some example code.

I wrote some very very basic C# code based on the tutorial:
$ mono example1.exe
Connecting to MySQL...
5.6.12-log
Done.


The code:
using System;
using System.Data;

using MySql.Data;
using MySql.Data.MySqlClient;

public class Example1
{
 public static void Main(string[] args)
 {
  string connStr = "server=localhost;user=msandbox;database=test;port=5612;password=msandbox";
  MySqlConnection conn = new MySqlConnection(connStr);
  try
  {
   Console.WriteLine("Connecting to MySQL...");
   conn.Open();

   string sql = "SELECT VERSION()";
   MySqlCommand cmd = new MySqlCommand(sql, conn);
   MySqlDataReader rdr = cmd.ExecuteReader();

   while (rdr.Read())
   {
    Console.WriteLine(rdr[0]);
   }
  }
  catch (Exception ex)
  {
   Console.WriteLine(ex.ToString());
  }
  conn.Close();
  Console.WriteLine("Done.");
 }
}

Saturday, September 7, 2013

Using a loadbalancer with MySQL

A loadbalancer is used in many MySQL setups. Loadbalancing can be done with an hardware loadbalancer or with software. Software solutions tend to be a fair bit cheaper.

Loadbalancing is often implemented as an extra layer: The application connects to the loadbalancer via the network and the loadbalancer connects to the database via the network. The loadbalancing can be implemented on the application machines to get rid of the extra layer, this is better for latency (each layer adds latency) and as each application server has it's own loadbalancer it can also remove bottlenecks. The drawback are that you'll lose a single point of control and with multiple loadbalancers the number of health check queries will probably increase. The two-layer setup is often used with Connector/J and haproxy.

There are simple and more intelligent loadbalancers. There are loadbalancers which can split read and write traffic for example. Of course read/write splitting will only work for single statements as a transaction might start with reads and then start writing, though this can change if they detect read only transactions (START TRANSACTION READ ONLY). Connector/J can send read-only transactions to the slave if they are properly marked as read-only in the Java code.

The simple loadbalancers might only check if MySQL still listens on the configured TCP port by connecting to it and then disconnecting (which can increase the counters for aborted connects, etc). The more intelligent loadbalancers can logon to MySQL and query a table and check the result. The F5 BigIP loadbalancers are capable of doing this. But keep in mind that executing a query every x seconds can generate quite some load and might cause locking issues.

To mark a machine for maintenance (or as failed) there are some options:
  • Reject traffic from the loadbalancer using a firewall (iptables)
    • Rejecting all traffic might make it hard to backup/restore the machine or connect with a remote client like MySQL Workbench.
    • Dropping traffic will also work, but a reject might cause the loadbalancer to respond faster
  • Change the contents of the table to which the loadbalancer connects
    • But be carefull with replication
  • Connect to the loadbalancer via an API (e.g. F5 iControl) and change the status
    • Not every solutions does have an API
  • Manually changing the status in the loadbalancer
If you have a replication setup then the loadbalancer must be able to find the node to which it is allowed to write. This can be configured manually or the loadbalancer can check the contents of a table.

There are some great tools to manage failovers in a replication setup like mysqlfailover from the MySQL Utilities and MHA. Those utilities generally can run a pre-failover and post-failover script when a failover happens, this can be used to change the contents of a check table or make some TCP port available or unavailable.

I tend to favor the two-layer setup instead of the three-layer setup (separate loadbalancer), but I do use the three-layer setup more often. The integration between failover tools and loadbalancer could be a lot better.

And I don't like all the health checks. The database pool in the application is doing health checks, the loadbalancer is doing health checks, the failover utility is doing health checks, the monitoring is doing health checks. Some of those should be combined. And better health checks generate more load and can be ran less frequently and simple/fast health checks might not catch a failure.

Please add a comment about how you're using loadbalancers and MySQL and how you automate it.

Sunday, May 26, 2013

Playing hid-and-seek with databases

As far as I know there isn't a well accepted set of best practices for MySQL, but there are many best practices known and used by most MySQL DBA's. One of those best practices is that the datadir must not be equal to a mountpoint; it has to be a subdirectory of a mountpoint. I learned this the hard way a few years ago when I used a NetApp Filer via NFS as a data directory. The NetApp filer exposed the snapshots via a .snapshot directory. A database in MySQL is a directory, so MySQL thought that the .snapshot directory was a database. This resulted in some issues with our monitoring scripts, so we had to create a mysql_data directory and move all the databases to that directory.

For other setups directories like lost+found, .zfs, etc. gave similar issues.

In MySQL 5.6 a new feature was introduced to make it possible to make some databases hidden. To do this a ignore-db-dir option needs to be specified for each directory. On a running server the list of ignored database directories is shown in the ignore_db_dirs variable.

mysql> SHOW DATABASES LIKE 'testdir%';
+---------------------+
| Database (testdir%) |
+---------------------+
| testdir3            |
+---------------------+
1 row in set (0.00 sec)

mysql> \! ls -ld data/testdir*
drwxrwxr-x 2 dveeden dveeden 4096 May 26 10:06 data/testdir1
drwxrwxr-x 2 dveeden dveeden 4096 May 26 10:06 data/testdir2
drwxrwxr-x 2 dveeden dveeden 4096 May 26 10:06 data/testdir3
mysql> SELECT @@ignore_db_dirs;
+-------------------+
| @@ignore_db_dirs  |
+-------------------+
| testdir1,testdir2 |
+-------------------+
1 row in set (0.00 sec)

mysql> \! grep ignore-db-dir my.sandbox.cnf
ignore-db-dir='testdir1'
ignore-db-dir='testdir2'

This helps a lot if there are foreign directories in the datadir, but it's always better to not have those directories there as it might be loaded into the database with a LOAD DATA statement or it might get deleted with a DROP DATABASE statement. So the best practices is still to have a data directory which is a subdirectory of a mountpoint, not the mountpoint itself.

A hidden directory is still available and can be used, it's just not shown in the output of various commands.

Wednesday, May 15, 2013

Calculating the InnoDB free space - part 2

This is part 2, you can find part 1 here.

So in part 1 we learned how to calculate the free space within InnoDB. But unfortunately that won't always work perfectly.

The first issue: the DATA_FREE column in the INFORMATION_SCHEMA.TABLES table will not show a sum of the free space of each partition. This means that if you have innodb_file_per_table disabled and are using partitioning then you must divide DATA_FREE by the number of partitions.
This is Bug #36312.

Example:
mysql> SELECT CONCAT(T.TABLE_SCHEMA,'.',T.TABLE_NAME) AS TABLE_NAME,
    -> P.PARTITION_NAME AS PART,IBT.SPACE,IBD.PATH,T.DATA_FREE AS T_DATA_FREE,
    -> P.DATA_FREE AS P_DATA_FREE FROM INFORMATION_SCHEMA.TABLES T 
    -> LEFT JOIN INFORMATION_SCHEMA.PARTITIONS P ON P.TABLE_SCHEMA=T.TABLE_SCHEMA 
    -> AND P.TABLE_NAME=T.TABLE_NAME 
    -> LEFT JOIN INFORMATION_SCHEMA.INNODB_SYS_TABLES IBT 
    -> ON IBT.NAME=CONCAT(T.TABLE_SCHEMA,'/',T.TABLE_NAME) 
    -> OR IBT.NAME=CONCAT(T.TABLE_SCHEMA,'/',T.TABLE_NAME,'#P#',P.PARTITION_NAME) 
    -> LEFT JOIN INFORMATION_SCHEMA.INNODB_SYS_DATAFILES IBD 
    -> ON IBD.SPACE=IBT.SPACE WHERE ENGINE='InnoDB' ORDER BY T.TABLE_SCHEMA,T.TABLE_NAME;
+----------------------------+------+-------+-----------------------------------+-------------+-------------+
| TABLE_NAME                 | PART | SPACE | PATH                              | T_DATA_FREE | P_DATA_FREE |
+----------------------------+------+-------+-----------------------------------+-------------+-------------+
| innodbfreespacetest.t1     | NULL |     6 | ./innodbfreespacetest/t1.ibd      |     4194304 |     4194304 |
| innodbfreespacetest.t11    | p2   |     0 | NULL                              |   403701760 |    80740352 |
| innodbfreespacetest.t11    | p1   |     0 | NULL                              |   403701760 |    80740352 |
| innodbfreespacetest.t11    | p0   |     0 | NULL                              |   403701760 |    80740352 |
| innodbfreespacetest.t11    | p4   |     0 | NULL                              |   403701760 |    80740352 |
| innodbfreespacetest.t11    | p3   |     0 | NULL                              |   403701760 |    80740352 |
| innodbfreespacetest.t13    | NULL |    46 | ./innodbfreespacetest/t13.ibd     |     4194304 |     4194304 |
| innodbfreespacetest.t2     | NULL |     0 | NULL                              |    80740352 |    80740352 |
| innodbfreespacetest.t5     | p1   |     0 | NULL                              |   403701760 |    80740352 |
| innodbfreespacetest.t5     | p0   |     0 | NULL                              |   403701760 |    80740352 |
| innodbfreespacetest.t5     | p4   |     0 | NULL                              |   403701760 |    80740352 |
| innodbfreespacetest.t5     | p3   |     0 | NULL                              |   403701760 |    80740352 |
| innodbfreespacetest.t5     | p2   |     0 | NULL                              |   403701760 |    80740352 |
| innodbfreespacetest.t6     | p4   |    15 | ./innodbfreespacetest/t6#P#p4.ibd |           0 |           0 |
| innodbfreespacetest.t6     | p3   |    14 | ./innodbfreespacetest/t6#P#p3.ibd |           0 |           0 |
| innodbfreespacetest.t6     | p2   |    13 | ./innodbfreespacetest/t6#P#p2.ibd |           0 |           0 |
| innodbfreespacetest.t6     | p1   |    12 | ./innodbfreespacetest/t6#P#p1.ibd |           0 |           0 |
| innodbfreespacetest.t6     | p0   |    11 | ./innodbfreespacetest/t6#P#p0.ibd |           0 |           0 |
| mysql.innodb_index_stats   | NULL |     2 | ./mysql/innodb_index_stats.ibd    |           0 |           0 |
| mysql.innodb_table_stats   | NULL |     1 | ./mysql/innodb_table_stats.ibd    |           0 |           0 |
| mysql.slave_master_info    | NULL |     4 | ./mysql/slave_master_info.ibd     |           0 |           0 |
| mysql.slave_relay_log_info | NULL |     3 | ./mysql/slave_relay_log_info.ibd  |           0 |           0 |
| mysql.slave_worker_info    | NULL |     5 | ./mysql/slave_worker_info.ibd     |           0 |           0 |
+----------------------------+------+-------+-----------------------------------+-------------+-------------+
23 rows in set (0.05 sec)

This example is on MySQL 5.6.10.
Tables t1 and t13 have their own tablespace.
Table t2 is in the system tablespace.
Tables t5 and t11 are partitioned and in the system tablespace, these tables show the real DATA_FREE multiplied by the number of partitions. The DATA_FREE for individual partitions is correct.
 
For some old 5.1 versions the DATA_FREE might be show in kilobytes instead of bytes. (and there is no column in which the measurement unit size is stored)


Calculating the InnoDB free space

Recently someone asked my if it's possible to find the total free space within InnoDB. I thought this would be very easy as the INFORMATION_SCHEMA.TABLES table has a DATA_FREE column. So we could just use SELECT SUM(DATA_FREE) FROM INFORMATION_SCHEMA.TABLES couldn't we?
 
So what does the DATA_FREE column tell us? It tells us the free data within InnoDB for that particular table. A table can share a tablespace with multiple other tables.
 
The tablespace which is used by a table depends on whether the innodb_file_per_table was enabled during table creation and/or at the last time the table was rebuild (e.g. by OPTIMIZE TABLE).
 
If innodb_file_per_table was always disabled then this query probably reports the correct free space:
SELECT DATA_FREE FROM INFORMATION_SCHEMA.TABLES WHERE ENGINE='InnoDB' LIMIT 1;

This is because all tables will share 1 tablespace.
 
If innodb_file_per_table was always enabled (new default for 5.6!) then this would report the free space:
SELECT SUM(DATA_FREE) FROM INFORMATION_SCHEMA.TABLES WHERE ENGINE='InnoDB';

This is because eache table will have it's own tablespace.
 
But how could we combine these two?
 
To give this a try I create a MySQL 5.6.10 sandbox with 4 tables of which only 2 have their own tablespace.
 
My first try is to use my udf_fileexists UDF:
 
mysql> SELECT TABLE_NAME,DATA_FREE,
    -> udf_fileexists(CONCAT(TABLE_SCHEMA,'/',TABLE_NAME,'.ibd')) AS ibd_file 
    -> FROM information_schema.tables WHERE ENGINE='InnoDB';
+----------------------+-----------+----------+
| TABLE_NAME           | DATA_FREE | ibd_file |
+----------------------+-----------+----------+
| t1                   |   4194304 |        1 |
| t2                   |   4194304 |        1 |
| t3                   |  66060288 |        0 |
| t4                   |  66060288 |        0 |
| innodb_index_stats   |         0 |        1 |
| innodb_table_stats   |         0 |        1 |
| slave_master_info    |         0 |        1 |
| slave_relay_log_info |         0 |        1 |
| slave_worker_info    |         0 |        1 |
+----------------------+-----------+----------+
9 rows in set (0.02 sec)

mysql> SELECT (SELECT DATA_FREE FROM INFORMATION_SCHEMA.TABLES WHERE 
    -> NOT udf_fileexists(CONCAT(TABLE_SCHEMA,'/',TABLE_NAME,'.ibd')) 
    -> AND ENGINE='InnoDB' LIMIT 1) + (SELECT SUM(DATA_FREE) 
    -> FROM INFORMATION_SCHEMA.TABLES WHERE 
    -> udf_fileexists(CONCAT(TABLE_SCHEMA,'/',TABLE_NAME,'.ibd')) 
    -> AND ENGINE='InnoDB') AS TOTAL_DATA_FREE;
+-----------------+
| TOTAL_DATA_FREE |
+-----------------+
|        74448896 |
+-----------------+
1 row in set (0.03 sec)

 
So that works, but it requires loading a UDF. Luckily it's also possible to only use INFORMATION_SCHEMA by using the INNODB_SYS_TABLES table. This works only for 5.6 and newer.
 
mysql> SELECT T.TABLE_SCHEMA,T.TABLE_NAME,T.DATA_FREE,ST.SPACE,SD.PATH 
    -> FROM INFORMATION_SCHEMA.TABLES T LEFT JOIN INFORMATION_SCHEMA.INNODB_SYS_TABLES ST 
    -> ON ST.NAME=CONCAT(T.TABLE_SCHEMA,'/',T.TABLE_NAME) 
    -> LEFT JOIN INFORMATION_SCHEMA.INNODB_SYS_DATAFILES SD ON SD.SPACE=ST.SPACE WHERE T.ENGINE='InnoDB';
+---------------------+----------------------+-----------+-------+----------------------------------+
| TABLE_SCHEMA        | TABLE_NAME           | DATA_FREE | SPACE | PATH                             |
+---------------------+----------------------+-----------+-------+----------------------------------+
| mysql               | innodb_table_stats   |         0 |     1 | ./mysql/innodb_table_stats.ibd   |
| mysql               | innodb_index_stats   |         0 |     2 | ./mysql/innodb_index_stats.ibd   |
| mysql               | slave_relay_log_info |         0 |     3 | ./mysql/slave_relay_log_info.ibd |
| mysql               | slave_master_info    |         0 |     4 | ./mysql/slave_master_info.ibd    |
| mysql               | slave_worker_info    |         0 |     5 | ./mysql/slave_worker_info.ibd    |
| innodbfreespacetest | t1                   |   4194304 |     6 | ./innodbfreespacetest/t1.ibd     |
| innodbfreespacetest | t2                   |   4194304 |     7 | ./innodbfreespacetest/t2.ibd     |
| innodbfreespacetest | t3                   |  66060288 |     0 | NULL                             |
| innodbfreespacetest | t4                   |  66060288 |     0 | NULL                             |
+---------------------+----------------------+-----------+-------+----------------------------------+
9 rows in set (0.02 sec)

mysql> SELECT (SELECT SUM(DATA_FREE) FROM INFORMATION_SCHEMA.TABLES T 
    -> LEFT JOIN INFORMATION_SCHEMA.INNODB_SYS_TABLES ST ON 
    -> ST.NAME=CONCAT(T.TABLE_SCHEMA,'/',T.TABLE_NAME) WHERE SPACE<>0) + 
    -> (SELECT DATA_FREE FROM INFORMATION_SCHEMA.TABLES T 
    -> LEFT JOIN INFORMATION_SCHEMA.INNODB_SYS_TABLES ST 
    -> ON ST.NAME=CONCAT(T.TABLE_SCHEMA,'/',T.TABLE_NAME) 
    -> WHERE SPACE=0 LIMIT 1) AS INNODB_FREE_GLOBAL;
+--------------------+
| INNODB_FREE_GLOBAL |
+--------------------+
|           74448896 |
+--------------------+
1 row in set (0.04 sec)
Calculating the free space can be useful if you don't have autoextend enabled on your datafiles (requires innodb_file_per_table to be disabled). It can also be useful to calculate how much space could be freed by running OPTIMIZE TABLE (requires innodb_file_per_table to be enabled).
 
Conclusion:
If you're on 5.6 you can use INFORMATION_SCHEMA and for 5.5 and earlier you have to use a UDF or have (a script) to peek in the MySQL datadir.

Tuesday, May 14, 2013

MySQL User Group NL Meetup May 31 at Snow

The third meeting for the MySQL User Group NL will be hosted by Snow B.V. in the Snow office in Geldermalsen.
 
The Agenda:
  • Choosing the Best Sharding Policy - Doran Levari (ScaleBase, using a video link)
  • Performance Monitoring with Statsd and Graphite - Art van Scheppingen (Spil Games)
  • Basic MySQL performance tuning for sysadmins - Daniël van Eeden (Snow)
Please RSVP on the meetup.com page.

The user group now has more than 100 members!

Monday, April 15, 2013

MySQL: Every detail matters

Some bugs can have a high impact if it causes data corruption, security issues or simply causes MySQL to crash. But what about small bugs and small usability issues?

This entry from the MySQL 5.6.8 release notes is an interesting one:
InnoDB: On startup, MySQL would not start if there was a mismatch between the value of the innodb_log_file_size configuration option and the actual size of the ib_logfile* files that make up the redo log. This behavior required manually removing the redo log files after changing the value of innodb_log_file_size. The fix causes MySQL to write all dirty pages to disk and re-create the redo log files during startup if it detects a size mismatch. (Bug #14596550)

So previously you had to shutdown MySQL without fast shutdown enabled, then move the ib_logfile's away and start MySQL again. This is not too hard or complicated. But I have seen some serious issues related to this procedure.

Someone changed the log file size in my.cnf anticipating to execute the procedure in a day or two, but then someone else restarted MySQL. Then the MySQL storage engine was not registered, so the data was not available. Then someone thought that MySQL was at fault and restored all tables from the latest dump. As the NO_ENGINE_SUBSTITUTION sql_mode was not set the tables were restored as MyISAM tables.

This fix not only prevents issues like this but it also helps to easily automate the resize of the logfiles on multiple machines. And it makes it possible to change the logfile size with puppet.

I would like to see more fixes of small bugs and usability issues, just like the Every Detail Matters project for GNOME. It helps to keep MySQL a user friendly database.

There are more fixes like this in MySQL 5.6, for example the updated defaults.

One of the things I would like to see to be fixed is the combination of features. For example it is not possible to use:
  • partitioning and foreign keys
  • InnoDB compression with combined table spaces (or raw disk tablespaces)
  • spatial indexes and InnoDB 
  • Full text parsers plugins and InnoDB
MySQL 5.6 already fixed a number of similar issues. It's now possible to use full text indexes and InnoDB and to use read-only InnoDB.

Thursday, April 11, 2013

Instant InnoDB

I've reviewed the Instant InnoDB book. It's a good starting point if you're learning about InnoDB, which is used for most MySQL installations.
As a consultant I've seen many situations where some basic InnoDB setting (buffer pool size, log file size, file per table) can have a huge effect on the performance of the database and make it easier to manage.

The website of the book can be found here.

Wednesday, March 27, 2013

Running Percona XtraDB Cluster in a Sandbox on Ubuntu

I wanted to do some experimentation with Percona XtraDB Cluster (Galera) and I didn't want to use virtual machines. I'm already using MySQL Sandbox for many other projects so that's the natural choice.

I've downloaded the tarball for Percona XtraDB Cluster 5.5.29 and I've extracted it to ~/opt/mysql/5.5.29-pxc.

Then I've installed 3 nodes:
make_sandbox 5.5.29-pxc -- --sandbox_port 4551 \
--sandbox_directory msb_5_5_29-pxc01 

make_sandbox 5.5.29-pxc -- --sandbox_port 4552 \
--sandbox_directory msb_5_5_29-pxc02 

make_sandbox 5.5.29-pxc -- --sandbox_port 4553 \
--sandbox_directory msb_5_5_29-pxc03 

But when I try to start a node this error happens:
130327 14:21:03 [Note] WSREP: wsrep_load(): loading provider library '/home/dvee
den/mysql/5.5.29-pxc/lib/libgalera_smm.so'
130327 14:21:03 [ERROR] WSREP: wsrep_load(): dlopen(): libssl.so.10: cannot open shared object file: No such file or directory
130327 14:21:03 [ERROR] WSREP: wsrep_load(/home/dveeden/mysql/5.5.29-pxc/lib/libgalera_smm.so) failed: Invalid argument (22). Reverting to no provider.


There is no libssl.so.10 on my Ubuntu 12.10 system, but there is a libssl.so.1.0.0.

The fix for this issue is:
cd ~/opt/mysql/5.5.29-pxc/lib/
ln -s /lib/x86_64-linux-gnu/libssl.so.1.0.0 libssl.so.10
ln -s /lib/x86_64-linux-gnu/libcrypto.so.1.0.0 libcrypto.so.10

The symlinks are created inside the library directory for Percona XtraDB Cluster, this way I don't need to tinker with the global library directories.

Monday, March 18, 2013

MyISAM in a MySQL 5.6 InnoDB Only instance

With MySQL 5.5 the default storage engine was changed to InnoDB. This was a good step as most users expected MySQL to support transactions, row level locking and all the other InnoDB features, but with 5.1 they sometimes forgot to choose the right storage engine. As most databases don't have multiple storage engines and many MySQL users changed the default storage engine to InnoDB this made the switch to MySQL easier and the default behaviour more in line with what people expect from a relational database.

Changing the storage engine can of course be done on a per table or per instance basis (default-storage-engine in my.cnf). The temporary tables created with 'CREATE TEMPORARY TABLE ...' should not be forgotten. The performance of InnoDB or MyISAM for temporary tables can have quite some impact, especially with slow storage, a buffer pool which is too small to hold the complete dataset or very small temporary tables. In MySQL 5.6 there is a new variable introduced to set the default storage engine for temporary tables: default_tmp_storage_engine. This makes it possible to use MyISAM or MEMORY as default temporary storage engine, which could benefit performance in some cases.

A great new feature of MySQL 5.6 is the full text indexing support for InnoDB. Now you can switch your fulltext tables to InnoDB! (after careful testing of course as results may differ)
So do we still need MyISAM? The answer is Yes.

The system tables (mysql.*) are mostly MyISAM. The general log and slow query log tables are in CSV format. In 5.6 there are also some InnoDB tables: innodb_index_stats, innodb_table_stats and optionally slave_master_info and slave_relay_log_info. It's not supported to change these MyISAM tables to InnoDB.
mysql> SELECT engine, COUNT(*) FROM information_schema.tables
    -> WHERE table_schema='mysql' GROUP BY engine;
+--------+----------+
| engine | COUNT(*) |
+--------+----------+
| CSV    |        2 |
| InnoDB |        5 |
| MyISAM |       21 |
+--------+----------+
3 rows in set (0.00 sec)

The implicit temporary tables as created by MySQL (e.g. not with CREATE TABLE) are still in MyISAM format. This can be shown by running some query which uses 'Using temporary; Using filesort' and while the query runs looking in the tmpdir location:

Session1:
select a.user,a.host,b.user from mysql.user a, mysql.user b, mysql.user c,
mysql.user d ,mysql.user e,mysql.user f, mysql.user g, mysql.user h order by a.user,b.user

Session2:
mysql> select @@tmpdir;
+----------------------------------------+
| @@tmpdir                               |
+----------------------------------------+
| /home/dveeden/sandboxes/msb_5_6_10/tmp |
+----------------------------------------+
1 row in set (0.00 sec)

mysql> \! ls -l /home/dveeden/sandboxes/msb_5_6_10/tmp
total 4044
-rw-rw---- 1 dveeden dveeden 4120272 Mar 18 13:30 #sql_69cd_0.MYD
-rw-rw---- 1 dveeden dveeden    1024 Mar 18 13:30 #sql_69cd_0.MYI

And the third one: mysqldump uses MyISAM.
A view is first restored as a MyISAM table and after all views are restored then they are converted to 'real' views. This is needed as views might be dependent on each other.
With performance_schema it's possible to 'see' the usage of MyISAM. With 5.6 not everything is enabled by default, so first check the setup_* tables to see what's enabled and what's not. In setup_objects the system tables are disabled, which can easily give wrong results if you're interested in MyISAM usage.
mysql> SELECT * FROM performance_schema.events_waits_summary_global_by_event_name 
    -> WHERE event_name LIKE 'wait/io/file/myisam/%'\G
*************************** 1. row ***************************
    EVENT_NAME: wait/io/file/myisam/data_tmp
    COUNT_STAR: 0
SUM_TIMER_WAIT: 0
MIN_TIMER_WAIT: 0
AVG_TIMER_WAIT: 0
MAX_TIMER_WAIT: 0
*************************** 2. row ***************************
    EVENT_NAME: wait/io/file/myisam/dfile
    COUNT_STAR: 145525
SUM_TIMER_WAIT: 4793712275400
MIN_TIMER_WAIT: 0
AVG_TIMER_WAIT: 32940608
MAX_TIMER_WAIT: 2244115016
*************************** 3. row ***************************
    EVENT_NAME: wait/io/file/myisam/kfile
    COUNT_STAR: 329
SUM_TIMER_WAIT: 17011553368
MIN_TIMER_WAIT: 0
AVG_TIMER_WAIT: 51706768
MAX_TIMER_WAIT: 236143040
*************************** 4. row ***************************
    EVENT_NAME: wait/io/file/myisam/log
    COUNT_STAR: 0
SUM_TIMER_WAIT: 0
MIN_TIMER_WAIT: 0
AVG_TIMER_WAIT: 0
MAX_TIMER_WAIT: 0
4 rows in set (0.00 sec)

mysql> SELECT event_name,object_name,COUNT(*),SUM(TIMER_WAIT) FROM performance_schema.events_waits_history_long 
    -> WHERE event_name LIKE 'wait/io/file/myisam/%' GROUP BY event_name,object_name\G
*************************** 1. row ***************************
     event_name: wait/io/file/myisam/dfile
    object_name: /home/dveeden/sandboxes/msb_5_6_10/tmp/#sql_69cd_0.MYD
       COUNT(*): 84
SUM(TIMER_WAIT): 1120085576
*************************** 2. row ***************************
     event_name: wait/io/file/myisam/kfile
    object_name: /home/dveeden/sandboxes/msb_5_6_10/tmp/#sql_69cd_0.MYI
       COUNT(*): 6
SUM(TIMER_WAIT): 501999104
2 rows in set (0.02 sec)

Some possible sources of high(er) MyISAM usage:
  • The mysql.proc table might be read often if you're often calling stored procedures. (seen with 5.5). This might also be true for events.
  • Monitoring which checks things (too) often.
  • Usage of server side help
And InnoDB doesn't support spatial indexes, so if you need them you still need MyISAM. Storing spatial information can be done with InnoDB, so if you can do without index you don't need MyISAM:
mysql> CREATE TABLE geo_m (i INT NOT NULL,
    -> g GEOMETRY NOT NULL, PRIMARY KEY (i),
    -> SPATIAL KEY (g)) ENGINE=MyISAM;
Query OK, 0 rows affected (0.01 sec)

mysql> CREATE TABLE geo_i (i INT NOT NULL,
    -> g GEOMETRY NOT NULL, PRIMARY KEY (i),
    -> SPATIAL KEY (g)) ENGINE=InnoDB;
ERROR 1464 (HY000): The used table type doesn't support SPATIAL indexes
mysql> CREATE TABLE geo_i (i INT NOT NULL,
    -> g GEOMETRY NOT NULL, PRIMARY KEY (i)
    -> ) ENGINE=InnoDB;
Query OK, 0 rows affected (0.01 sec)
If you're interested in performance_schema, make sure to attend the MySQL Virtual Developer Day.

So MyISAM is still a critical part of MySQL and you should not just forget about it, even if you're only using InnoDB.

Wednesday, March 13, 2013

How to lose your (foreign) key?

One of the many reasons to switch from MyISAM to InnoDB (or NDB) is the support of foreign keys which helps to guarantee referential integrity.

This is how it's supposed to work:
mysql> CREATE TABLE parent (id INT AUTO_INCREMENT NOT NULL, 
    -> pname VARCHAR(100) NOT NULL, PRIMARY KEY(`id`));
Query OK, 0 rows affected (0.01 sec)

mysql> CREATE TABLE child (id INT AUTO_INCREMENT NOT NULL, cname VARCHAR(100) NOT NULL, 
    -> pid INT NOT NULL, PRIMARY KEY(`id`), FOREIGN KEY `fk_parent_child` (`id`) REFERENCES parent (`id`));
Query OK, 0 rows affected (0.01 sec)

mysql> SHOW CREATE TABLE child\G
*************************** 1. row ***************************
       Table: child
Create Table: CREATE TABLE `child` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `cname` varchar(100) NOT NULL,
  `pid` int(11) NOT NULL,
  PRIMARY KEY (`id`),
  CONSTRAINT `child_ibfk_1` FOREIGN KEY (`id`) REFERENCES `parent` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.01 sec)

But unfortunately it sometimes doesn't work as we expect:
mysql> CREATE TABLE parent (id INT AUTO_INCREMENT NOT NULL, 
    -> pname VARCHAR(100) NOT NULL, PRIMARY KEY(`id`));
Query OK, 0 rows affected (0.01 sec)

mysql> CREATE TABLE child (id INT AUTO_INCREMENT NOT NULL, cname VARCHAR(100) NOT NULL, 
    -> pid INT NOT NULL REFERENCES parent(`id`), PRIMARY KEY(`id`));
Query OK, 0 rows affected (0.01 sec)

mysql> SHOW CREATE TABLE child\G
*************************** 1. row ***************************
       Table: child
Create Table: CREATE TABLE `child` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `cname` varchar(100) NOT NULL,
  `pid` int(11) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)
This is a known bug: Bug #17943.

But there is yet another way of losing a foreign key:
mysql> CREATE TABLE parent (id INT AUTO_INCREMENT NOT NULL, 
    -> pname VARCHAR(100) NOT NULL, PRIMARY KEY(`id`));
Query OK, 0 rows affected (0.01 sec)

mysql> CREATE TABLE child (id INT AUTO_INCREMENT NOT NULL, cname VARCHAR(100) NOT NULL COMMENT 'Parent\'s name', 
    -> pid INT NOT NULL, PRIMARY KEY(`id`), FOREIGN KEY `fk_parent_child` (`id`) REFERENCES parent (`id`));
Query OK, 0 rows affected (0.00 sec)

mysql> SHOW CREATE TABLE child\G 
*************************** 1. row ***************************
       Table: child
Create Table: CREATE TABLE `child` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `cname` varchar(100) NOT NULL COMMENT 'Parent''s name',
  `pid` int(11) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)
This is also a known bug: Bug #68367

And the third way to lose a foreign key:
mysql> CREATE TABLE parent (id INT AUTO_INCREMENT NOT NULL, 
    -> pname VARCHAR(100) NOT NULL, PRIMARY KEY(`id`));                                                    
Query OK, 0 rows affected (0.00 sec)

mysql> CREATE TABLE child (id INT AUTO_INCREMENT NOT NULL, cname VARCHAR(100) NOT NULL, 
    -> pid INT NOT NULL, PRIMARY KEY(`id`), FOREIGN KEY `fk_parent_child` (`id`) REFERENCES parent (`id`));
Query OK, 0 rows affected (0.01 sec)

mysql> SHOW CREATE TABLE child\G 
*************************** 1. row ***************************
       Table: child
Create Table: CREATE TABLE `child` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `cname` varchar(100) NOT NULL,
  `pid` int(11) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

mysql> ALTER TABLE child ENGINE=InnoDB;
Query OK, 0 rows affected (0.01 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> SHOW CREATE TABLE child\G
*************************** 1. row ***************************
       Table: child
Create Table: CREATE TABLE `child` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `cname` varchar(100) NOT NULL,
  `pid` int(11) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)
This is not a bug, but it sometimes happens (especially with 5.1 and earlier). The developer creates a table and the realized he/she didn't specify the correct storage engine. This is well documented and well known.

So be careful and check if the key is where you expect it to be. With information_schema the information about foreign keys is not only available with SHOW statements but also as a table.
mysql> SELECT * FROM information_schema.referential_constraints WHERE CONSTRAINT_SCHEMA='fktest'\G
*************************** 1. row ***************************
       CONSTRAINT_CATALOG: def
        CONSTRAINT_SCHEMA: fktest
          CONSTRAINT_NAME: child_ibfk_1
UNIQUE_CONSTRAINT_CATALOG: def
 UNIQUE_CONSTRAINT_SCHEMA: fktest
   UNIQUE_CONSTRAINT_NAME: PRIMARY
             MATCH_OPTION: NONE
              UPDATE_RULE: RESTRICT
              DELETE_RULE: RESTRICT
               TABLE_NAME: child
    REFERENCED_TABLE_NAME: parent
1 row in set (0.00 sec)

MySQL 5.5's new features

The recently released MySQL 5.6 gets a lot of attention, but for those who are still on 5.5 there is also good news: There are two new features in 5.5.

The first feature is that there are more INFORMATION_SCHEMA tables for InnoDB. This means that it's possible to 'see' what's in the buffer pool. It also makes it possible to get more information about the LRU list.

From the 5.5.28 changelog:
InnoDB: Certain information_schema tables originally introduced in MySQL 5.6 are now also available in MySQL 5.5 and MySQL 5.1: INNODB_BUFFER_PAGE, INNODB_BUFFER_PAGE_LRU, and INNODB_BUFFER_POOL_STATS. (Bug #13113026)

This is in the "Bugs Fixed" section instead of the "Functionality Added or Changed" section, which is a bit weird in my opinion.

The second feature is a variable which makes it possible to get more information about the deadlocks.

From the 5.5.30 changelog:
InnoDB: The innodb_print_all_deadlocks configuration option from MySQL 5.6 was backported to MySQL 5.5. This option records each deadlock condition in the MySQL error log, allowing easier troubleshooting if frequent deadlocks point to application coding issues. (Bug #14515889)

So updating to the latest 5.5 (or even  5.1) is not only good for security, stability and performance but it also brings some nice new features.

Monday, March 11, 2013

How MySQL 5.6 handles passwords more securely

There are many thing changed in MySQL 5.6 which are related to passwords:
  • There is a new password hash algorithm (SHA-256)
  • There is obfuscation for passwords with the .mylogin.cnf file.
  • The option to store slave passwords in a database table.
  • It's now possible to supply a password to START SLAVE.
But that's not what this blog post is about.

This blog post is a great new feature: Hiding passwords from your log files, automatically.

MySQL 5.6 will by default hide passwords from the general log. This is not just obfuscation as only the one-way hash will be put in the log files. By setting log-raw=OFF you can disable password hiding for the general log. The log-raw setting will only influence the general log, so the passwords in the slow query log and the binary logs will still be hidden.

With MySQL 5.5 this could be done manually by first storing the hash in a variable, but like many other work-arounds this is not needed anymore.

But it doesn't stop there. According to the manual the mysql client now won't log statements which match "*IDENTIFIED*:*PASSWORD*".

But don't forget that MySQL Server logfiles and MySQL Client logfiles should still be protected with the correct file permissions and the same is true for all other files like master.info. And if you store slave credentials inside the database these should be protected by using the MySQL privileges system, but that's even true if master.info is used as someone with too many database privileges might just load the contents with LOAD DATA INFILE.

Of course there there is still room for improvement:
It would be nice if we could easily enable connection security. MySQL and many other applications use TLS, but that has some performance overhead and isn't really easy to setup. And MySQL is still not secure-by-default, but easy-by-default...

So MySQL 5.6 improves security without needing any special configuration.

Thursday, February 28, 2013

MySQL Events

The last MySQL User Group NL meeting was last Friday. It's always nice to learn about MySQL and meet other MySQL users. There were two presentations: one about MySQL User Defined Functions (UDF's) and one about MySQL TCO. The slides are available from the meetup page.

There are already a number of MySQL events announced for the next few months. I'll only list events in the Netherlands and Virtual events.

MySQL Virtual Developer Days
This is a virtual event which will take place on March 19 (EMEA region, NA event is on March 12). There are many interesting topics: Performance Schema, New 5.6 Features, Replication, MySQL Enterprise Monitor
The eVite

SkySQL and MariaDB roadshow
21 March 2013 in Amsterdam
This event  have presentations by Monty Widenius, Seppo Jaakola and
Anders Karlsson. The main topics are MariaDB and Galera.
The event page


MySQL User Group NL, Q2 Meeting
31 May 2013 in Geldermalsen
The event page


Monday, January 28, 2013

Avoid clear text passwords in MySQL logging.

What happens when you use the PASSWORD() function to insert a password hash into a table?
  • The hash will be written to the table
  • The password might be written in clear text to the binlog
  • The password might be written in clear text to the general log
  • The password might be written in clear text to the slow query log
The query
mysql [localhost] {msandbox} (test) > INSERT INTO testpwd(pwd) VALUES(PASSWORD('secret_password'));
Query OK, 1 row affected (0.00 sec)

The General log
130128 16:04:41     1 Query     INSERT INTO testpwd(pwd) VALUES(PASSWORD('secret_password'))

The Slow query log
# Time: 130128 16:04:41
# User@Host: msandbox[msandbox] @ localhost []
# Query_time: 0.004887  Lock_time: 0.001043 Rows_sent: 0  Rows_examined: 0
SET timestamp=1359385481;
INSERT INTO testpwd(pwd) VALUES(PASSWORD('secret_password'));

The binlog:
# at 219
#130128 16:04:41 server id 1  end_log_pos 287  Query thread_id=1 exec_time=0 error_code=0
SET TIMESTAMP=1359385481/*!*/;
BEGIN
/*!*/;
# at 287
#130128 16:04:41 server id 1  end_log_pos 315  Intvar
SET INSERT_ID=1/*!*/;
# at 315
#130128 16:04:41 server id 1  end_log_pos 438  Query thread_id=1 exec_time=0 error_code=0
SET TIMESTAMP=1359385481/*!*/;
INSERT INTO testpwd(pwd) VALUES(PASSWORD('secret_password'))
/*!*/;
# at 438
#130128 16:04:41 server id 1  end_log_pos 465  Xid = 8
COMMIT/*!*/;

It's possible to do it in a more secure way by using variables.

The query
mysql [localhost] {msandbox} (test) > SET @pwd := PASSWORD('another_secret_password');
Query OK, 0 rows affected (0.00 sec)

mysql [localhost] {msandbox} (test) > INSERT INTO testpwd(pwd) VALUES(@pwd);
Query OK, 1 row affected (0.01 sec)

The General log
130128 16:05:18     1 Query     SET @pwd := PASSWORD('another_secret_password')
130128 16:05:30     1 Query     INSERT INTO testpwd(pwd) VALUES(@pwd)

The Slow query log
# Time: 130128 16:05:18
# User@Host: msandbox[msandbox] @ localhost []
# Query_time: 0.000251  Lock_time: 0.000000 Rows_sent: 0  Rows_examined: 0
SET timestamp=1359385518;
SET @pwd := PASSWORD('another_secret_password');
# Time: 130128 16:05:30
# User@Host: msandbox[msandbox] @ localhost []
# Query_time: 0.003031  Lock_time: 0.000288 Rows_sent: 0  Rows_examined: 0
SET timestamp=1359385530;
INSERT INTO testpwd(pwd) VALUES(@pwd);

The binlog
# at 465
#130128 16:05:30 server id 1  end_log_pos 533  Query thread_id=1 exec_time=0 error_code=0
SET TIMESTAMP=1359385530/*!*/;
BEGIN
/*!*/;
# at 533
#130128 16:05:30 server id 1  end_log_pos 561  Intvar
SET INSERT_ID=2/*!*/;
# at 561
#130128 16:05:30 server id 1  end_log_pos 638  User_var
SET @`pwd`:=_utf8 0x2A41424645453834453346463233423442454338323635383832433244383141414536363744363235 COLLATE `utf8_general_ci`/*!*/;
# at 638
#130128 16:05:30 server id 1  end_log_pos 738  Query thread_id=1 exec_time=0 error_code=0
SET TIMESTAMP=1359385530/*!*/;
INSERT INTO testpwd(pwd) VALUES(@pwd)
/*!*/;
# at 738
#130128 16:05:30 server id 1  end_log_pos 765  Xid = 10
COMMIT/*!*/;


An other trick is to set binlog_format to ROW:
# at 555
#130128 16:30:47 server id 1  end_log_pos 623   Query   thread_id=2     exec_time=0     error_code=0
SET TIMESTAMP=1359387047/*!*/;
BEGIN
/*!*/;
# at 623
# at 672
#130128 16:30:47 server id 1  end_log_pos 672   Table_map: `test`.`testpwd` mapped to number 33
#130128 16:30:47 server id 1  end_log_pos 752   Write_rows: table id 33 flags: STMT_END_F

BINLOG '
p5kGURMBAAAAMQAAAKACAAAAACEAAAAAAAEABHRlc3QAB3Rlc3Rwd2QAAggPAv8AAg==
p5kGURcBAAAAUAAAAPACAAAAACEAAAAAAAEAAv/8BQAAAAAAAAApKkNFNDdGODRBOThDOTI0ODdC
RjI5MUM1QzIyNTY3ODg0RjAxMjdGM0U=
'/*!*/;
# at 752
#130128 16:30:47 server id 1  end_log_pos 779   Xid = 19
COMMIT/*!*/;


And even with the "-v" option for mysqlbinlog the passwords won't show with row based binlogs:
### INSERT INTO `test`.`testpwd`
### SET
###   @1=5
###   @2='*CE47F84A98C92487BF291C5C22567884F0127F3E'


With 5.6 hiding passwords should get done automatically and for all the log types. Unfortunately I didn't get it to work yet. (Bug #68200)

To prevent logging of passwords you need to
  • Disable the slow query log
  • Disable the general log
  • Use variables or disable the binlog
  • Or wait on the 5.6 GA version


Friday, January 18, 2013

Installing Multicorn on RHEL6

The Multicorn project makes it possible to write Foreign Data Wrappers for PostgreSQL in Python.

To install Multicorn on RHEL6 the following is needed:
  • PostgreSQL 9.2
  • Python 2.7
  • make, GCC, etc.
Installing PostgreSQL 9.2 is easy as it's available in the PostgreSQL Yum repository.

Unfortunately Python 2.7 is not included in RHEL6. And replacing the 'system' python is a bad idea.

The solution is to do an 'altinstall' of Python. The "--shared" and ucs4 options are required. The altinstall will install a python binary with the name python2.7 instead of just python. This allows you to have multiple python versions on 1 system.

wget http://www.python.org/ftp/python/2.7.3/Python-2.7.3.tgz
tar zxf Python-2.7.3.tgz
cd Python-2.7.3
./configure --shared --enable-unicode=ucs4
make
make altinstall

This will result in a /usr/local/bin/python2.7 which doesn't work. This is due to the fact that the libraries are installed /usr/local/lib, which is not in the library path. This can be solved by modifying the library path.
echo "/usr/local/lib/" > /etc/ld.so.conf.d/usr-local-lib.conf
ldconfig

Installing PostgreSQL 9.2 is easy:
yum install postgresql92-server postgresql92-devel

Installing with PGXN will not work as this commit is not yet included. So we'll have to clone the git repo.
git clone git://github.com/Kozea/Multicorn.git
cd Multicorn

sed -i 's/^PYEXEC = python$/PYEXEC = python2.7/' Makefile
export PATH="$PATH:/usr/pgsql-9.2/bin"
make
sudo make install

The step to set the PATH is not in the Multicorn documentation, but it's needed as pg_config is not in our path. 

The sed line is needed to force the use of python2.7 instead of python (which is 2.6).

And then we can load the extension in the database.
postgres=# CREATE EXTENSION multicorn;
CREATE EXTENSION


And then we can use the examples from the documentation:
postgres=# CREATE SERVER csv_srv foreign data wrapper multicorn options (
postgres(#     wrapper 'multicorn.csvfdw.CsvFdw'
postgres(# );
CREATE SERVER
postgres=# create foreign table csvtest (
postgres(#        year numeric,
postgres(#        make character varying,
postgres(#        model character varying,
postgres(#        length numeric
postgres(# ) server csv_srv options (
postgres(#        filename '/tmp/test.csv',
postgres(#        skip_header '1',
postgres(#        delimiter ',');
CREATE FOREIGN TABLE
postgres=# select * from csvtest;
ERROR:  Error in python: """

DETAIL:  [Errno 2] No such file or directory: '/tmp/test.csv'
postgres=# \! echo -en 'Year,Make,Model,Length\n2013,Volkswagen,Golf,3\n2013,Volkswagen,Passat,3' > /tmp/test.csv
postgres=# select * from csvtest;
 year |    make    | model  | length
------+------------+--------+--------
 2013 | Volkswagen | Golf   |      3
 2013 | Volkswagen | Passat |      3
(2 rows)

Tuesday, January 15, 2013

How to install PGXN on RHEL6

Installing PGXN on RHEL 6.3 is not as easy as it might sound.

First you need to install the PostgreSQL yum repo:
rpm -ivh http://yum.postgresql.org/9.2/redhat/rhel-6.3-x86_64/pgdg-redhat92-9.2-7.noarch.rpm

Then you need to install pgxnclient:
yum install pgxnclient

The pgxn client has 2 dependencies which are not listed in the package:
  • setuptools
  • simplejson 2.1
To satisfy the first dependency we need to install python-setuptools
yum install python-setuptools

The second one is not that easy as the simplejson version in RHEL6.3 is 2.0, which is too old. We can use PIP to install a newer version:
yum remove python-simplejson
yum install python-pip python-devel
python-pip install simplejson

And now the pgxn command will work.

MySQL version history (updated)

I've created a graph about the MySQL version history.

It's mysql-graph-history on github.


Please let me know if this is correct or if I'm forgetting some versions.

Wednesday, January 2, 2013

Untrusted downloads and MySQL

When the MySQL version from your distribution isn't good enough you need to download the latest Oracle MySQL. There are more possibilities like Percona Server and MariaDB, but that's not what this post is about.

The dowload site for MySQL is https://www.mysql.com/downloads/mysql/ and contains to a mirror.php script which ask you if you like to login with a Oracle Web Account and then chooses a mirror for you. You don't have to login. Then you will be redirected to the chosen mirror. In my case this is https://cdn.mysql.com

Firefox will give you a "This Connection is Untrusted" dialog. If you click on "Technical details" it will show the following error:

cdn.mysql.com uses an invalid security certificate.

The certificate is only valid for the following names:
  a248.e.akamai.net , *.akamaihd.net , *.akamaihd-staging.net 

(Error code: ssl_error_bad_cert_domain)

The Qualys SSL Labs confirm the mismatch between the site name and the certificate name. The results can be found here.

There is a good reason this didn't show up when Oracle did testing: I'm using the HTTPS-Everywhere add-on from EFF. So I can't really blame them.

What about just disabling the HTTPS-Everywhere add-on for this site or just adding an exception? That's also not a real solution.

The download page is available both with and without SSL. There are MD5 sum's on the website to verify the integrity of the downloaded files.

To verify if there isn't tampered with the downloaded files you need to verify the MD5 sums. This must be done with the MD5 sum's on the HTTPS site as that guarantees that there isn't tampered with the MD5 sums.

MD5 sums are no longer considered attack proof. It's possible to get collisions (2 different files with the same md5sum).

Verifying the MD5 sums for all files is quite a bit of work.

Some things Oracle could improve in the security of the download service:
1. Make sure that cdn.mysql.com has a valid SSL certificate
2. Always redirect users to the HTTPS download site.
3. Make the md5sums available in 1 file (which can be checked with md5sum and/or fciv.exe, both tools need different file formats)
4. Also make sha1sums available (which can be used by sha1sum and fciv.exe)
5. Sign the checksum files and/or the files themselves with PGP/GPG
6. Fix RFE Bug #60833 and provide YUM and APT repositories (which already supply everything for automatic verification using GPG) Repositories also allow easier updates to new version (e.g. in case of security).

Also I do think that most people are not careful enough with downloading files securely. This is almost like just clicking on every e-mail attachment you get.

There is also good news: www.mysql.com gets a really really high score in the Qualys SSL Labs: The results are here.

Tuesday, January 1, 2013

How to install MySQL succesfully on a Raspberry Pi

For those starting to learn MySQL and want to install it on a Raspberry Pi: there is a little gotcha you should be aware of.

Disclaimer: Only run command if you know what it does. Always make sure you have backups of your important data.

First you need to put the official Raspbian image on your SD card and then boot the Raspberry Pi.

Then if you try to install mysql with "sudo apt-get install mysql-server-5.5" this will fail. The reason for this is that the filesystem on the SD card will become full. This can be seen by running the "df -h" command.

The SD card is probably 4GB. The filesystem will be around 1.9GB. This was done to make it fit on 2GB cards. To stretch the filesystem to complete 4GB you need to run "sudo raspi-config" and choose the "expand_rootfs" option. Then you need to reboot.

Now "df -h" should tell you that the filesystem has much more free space.

But your MySQL installation failed. So you run "sudo apt-get purge mysql-server-5.5" and remove the mysql-server-5.5 package and all related files.

If you would try to reinstall MySQL with "sudo apt-get install mysql-server-5.5" it would fail again. This is due to the data in /var/lib/mysql which belongs to the failed installation. You should remove that data "rm -rf /var/lib/mysql" and then install mysql again "sudo apt-get install mysql-server-5.5".

Now you should have a basic MySQL installation. You could use the "mysql_secure_installation" command to secure your installation.