Daniël's Database Blog

Tuesday, December 31, 2013

Looking back on 2013

The year is almost at its end. Looking back at the past year I think it was a good year for MySQL. The 5.6 release was released in February and has been proven to be a very good release. There were many folks who reported bugs, which is good as this means they care about the product. But MySQL is not only a product, its a large ecosystem. One of the big players in the ecosystem is Oracle and this year they really participated in the MySQL community by having their engineers attend conferences which were not organized by Oracle (Like Percona Live and FOSDEM).

This year I couldn't attend the MySQL Connect and Percona Live conferences, but I hope to be able to attend in 2014 again. I did attend FOSDEM, which is a really nice (and different) conference.

For MariaDB it also was an interesting year as a number of Linux distributions and customers switched from MySQL to MariaDB (and sometimes back again). I wonder what 2014 will bring for MariaDB.

The TokuDB storage engine was released as opensource this year.

The Galera replication system did get quite some attention this year and saw a 3.0 release.

I think 2014 is going to be even better for MySQL as the 5.7 release already looks great and Fabric looks promising.

Have a safe an fun new years eve! I'm going to eat an oliebol!

Sunday, November 24, 2013

The importance of multi source replication

One of the latest labs releases of Oracle MySQL brings multi source replication. This lifts the limitation found in earlier releases that a MySQL slave can only have one master.

To be fair, there were other ways of doing this already:

Using a time based switch as described in MySQL High Availability
Using the multi source feature in the yet-to-be released MariaDB 10
Using Tungsten Replicator

There are many good uses of multi source replication. You could use it to combine data from multiple shards or applications.

If MySQL is used with a loadbalancer the most easy to build setup is a 2-way multi master. This makes it possible to use the InnoDB storage engine. Using MySQL Cluster is another alternative, but MySQL Cluster uses the NDB storage engine, and might not be a supported option for your application. A MySQL Cluster setup also needs at least 4 machines to be fully redundant and MySQL Multi Master only needs two machines.

There is little intelligence required in the loadbalancer. It should write to one server and read from both servers. If the first server is unavailable then it should write to the second one. The requirement to only write to one server has to do with the fact that replication is not fully synchronous (MySQL Cluster is synchronous, and there it is supported to write to all nodes). While the might seem like a disadvantage, it can actually be helpfull to do online schema changes.

One of the drawbacks of multi master is that multi master with more than 2 nodes can be a nightmare to maintain and will probably not help you to get more performance or reliability.

Another drawback is that it is not easy to add a disaster recovery setup to a multi master setup. If you have 2 locations with 2 servers in each location and create a multi-master setup from each pair than it's not possible to get one pair to slave from another pair as each server already has a master. You could create a multi-master on one site and then have a slave-with-a-slave in the other location, but then you'll have to change that setup during or after the site failover to get to the same setup as you had in the primary location.

With multi source replication you can now create a multimaster setup which is slave of another multi-master setup.

I did do some basic tests with the labs release for multi source replication and it looks great.

The "FOR CHANNEL='chan1'" syntax works quite well, although I would have gone for "FOR CHANNEL 'chan1'" (without the equal sign). I would have been nice if MariaDB and MySQL would have used the same syntax, but unfortunately this isn't the case. (MariaDB uses "CHANGE MASTER 'master1' TO...")

For multi source replication to work you have to set both master_info_repository and relay_log_info_repository to TABLE. I only had one of these set, and the error message was not really clear about which setting was wrong.
2013-11-23T14:37:52.972108Z 1 [ERROR] Slave: Cannot create new master info structure when repositories are of type FILE. Convert slave repositories to TABLE to replicate from Multiple sources.

I build a tree-node multi-master with multi source replication. Each server was a slave of the other two servers. The advantages of this setup to a regular 3 node circular setup is that you don't need to enable log-slave-updates, which can save quite some I/O on the binlog files. Also if one node breaks then the remaining two nodes will still receive updates from each other instead of only one node receiving all updates.

If you use sharding you can use multi source replication to combine two (or more) shards into one. This is similar to how I like to do major version upgrades: first make the new setup a slave of the old setup. This gives you a window for testing the new setup.

So multi source replication is very usefull in many setups.

Saturday, November 9, 2013

MariaDB's RETURNING feature.

There is a new feature in the MariaDB 10 Beta which caught my eye: support for returning a result set on delete.

With a 'regular' DELETE operation you only get to know the number of affected rows. To get more info or actions you have to use a trigger or a foreign key. Anoter posibility is doing a SELECT and then a DELETE and with the correct transaction isolation a transactional support this will work.

With the support for the RETURNING keyword this has become easier to do and it will probably bennefit performance and save you a few roundtrips and a few lines of code.

There is already support for RETURNING in PostgreSQL. And PostgreSQL has an other nifty feature for which RETURNING really helps: CTE or common table expressions or the WITH keyword. I really hope to see CTE support in MySQL or MariaDB some day.

An example from RETURNING and CTE in PostgreSQL:

demo=# select * from t1;
 id | name  
----+-------
  1 | test1
  2 | test2
  3 | test3
  4 | test1
  5 | test2
  6 | test3
(6 rows)

demo=# WITH del_res AS (DELETE FROM t1 RETURNING id) 
demo-# SELECT CONCAT('Removed ID ',id) info FROM del_res;
     info     
--------------
 Removed ID 1
 Removed ID 2
 Removed ID 3
 Removed ID 4
 Removed ID 5
 Removed ID 6
(6 rows)

demo=#

So my conclusion: Returning a resultset for DELETE is helpfull, and is one step in the direction of CTE support.

The next step step is to get the RETURNING keyword to work for UPDATE.

Sunday, October 27, 2013

Using the PAM authentication plugin

The procedure for using the PAM authentication plugin as documented doesn't work flawlessly on Ubuntu.

So here is how it works on Ubuntu (and probably also on other Debian based systems).

Please note that the PAM authentication plugin is an enterprise feature.

1. Make sure the plugin is loaded

This can be done by adding the following to the mysqld section of my.cnf (Don't forget to restart). You could also use INSTALL PLUGIN to load it without restart.

plugin-load=authentication_pam.so

2. Add a user which will use the plugin

mysql> CREATE USER 'dveeden'@'localhost' IDENTIFIED WITH authentication_pam;
Query OK, 0 rows affected (0.00 sec)

3. Add a pam config file for 'mysql':
Create /etc/pam.d/mysql with the following contents:

@include common-auth
@include common-account
@include common-session-noninteractive

4. Login with the user

mysql -p --enable-cleartext-plugin

5. Verify if you're really connected as the correct user.

mysql> select user(),current_user(),@@proxy_user;
+-------------------+-------------------+--------------+
| user()            | current_user()    | @@proxy_user |
+-------------------+-------------------+--------------+
| dveeden@localhost | dveeden@localhost | NULL         |
+-------------------+-------------------+--------------+
1 row in set (0.00 sec)

If some step doesn't work then the /var/log/auth.log file can be very helpfull.

The plugin has many more options. It allows you to proxy users and use different PAM configurations for different users. The plugin is used on a per user basis so you could use native authentication for your application users and PAM authentication with LDAP for administrators and/or developers.

Please note that SHOW GRANTS does not indicate a authentication plugin and blindly copying the grant statements to another server to copy the user might result in users without password.

MySQL Utilities:

$ mysqluserclone  --source=usr:pwd@srv -d 'dveeden'@'localhost'
# Source on 127.0.0.1: ... connected.
# Dumping grants for user dveeden@localhost
GRANT USAGE ON *.* TO 'dveeden'@'localhost'

Percona Toolkit:

$ pt-show-grants | grep dveeden
-- Grants for 'dveeden'@'localhost'
GRANT USAGE ON *.* TO 'dveeden'@'localhost';

time for standards 2

I was a bit wrong in my previous post. MySQL 5.6 does allow you to supply a fsp with CURRENT_TIMESTAMP (thanks Roy).

mysql> SELECT CURRENT_TIMESTAMP,CURRENT_TIMESTAMP(6);
+---------------------+----------------------------+
| CURRENT_TIMESTAMP   | CURRENT_TIMESTAMP(6)       |
+---------------------+----------------------------+
| 2013-10-27 10:38:59 | 2013-10-27 10:38:59.182530 |
+---------------------+----------------------------+
1 row in set (0.00 sec)

It however feels a bit weird to me as the CURRENT_TIMESTAMP is often used without () and doesn't look like a function. So when I tried to use a CURRENT_TIMESTAMP with a fsp of 6 it was not behaving how I expected it to be:

mysql> CREATE TABLE t1 (ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP(6));
ERROR 1067 (42000): Invalid default value for 'ts'
mysql> CREATE TABLE t1 (ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP(0));
Query OK, 0 rows affected (0.01 sec)

mysql> INSERT INTO t1 VALUES(CURRENT_TIMESTAMP(6));
Query OK, 1 row affected (0.00 sec)

mysql> SELECT * FROM t1;
+---------------------+
| ts                  |
+---------------------+
| 2013-10-27 10:42:30 |
+---------------------+
1 row in set (0.00 sec)

So it didn't allow me to use a default of CURRENT_TIMESTAMP(6). It however accepted a CURRENT_TIMESTAMP with a fsp of 6, and then threw away the microseconds without any warning.
After some more investigating it turned out that there is a correct way of doing this:

mysql> CREATE TABLE t1 (ts TIMESTAMP(6) DEFAULT CURRENT_TIMESTAMP);
ERROR 1067 (42000): Invalid default value for 'ts'
mysql> CREATE TABLE t1 (ts TIMESTAMP(6) DEFAULT CURRENT_TIMESTAMP(6));
Query OK, 0 rows affected (0.02 sec)

mysql> INSERT INTO t1 VALUES(NULL);
Query OK, 1 row affected (0.00 sec)

mysql> SELECT * FROM t1;
+----------------------------+
| ts                         |
+----------------------------+
| 2013-10-27 10:47:01.604891 |
+----------------------------+
1 row in set (0.00 sec)

So you must specify a fsp for the column time AND the default value, then it works.

Saturday, October 26, 2013

time for standards

MySQL 5.6 includes support for microsecode timestamp resolution, which is a great new feature.

To get the current timestamp in MySQL 5.5 you could use NOW(), SYSDATE() or CURRENT_TIMESTAMP.

mysql_5.5> SELECT NOW(),SYSDATE(),CURRENT_TIMESTAMP;
+---------------------+---------------------+---------------------+
| NOW()               | SYSDATE()           | CURRENT_TIMESTAMP   |
+---------------------+---------------------+---------------------+
| 2013-10-26 15:46:24 | 2013-10-26 15:46:24 | 2013-10-26 15:46:24 |
+---------------------+---------------------+---------------------+
1 row in set (0.01 sec)

If we run the same statement in MySQL 5.6 the output is the same. This is great for compatibility, but what if we want those microsecond timestamps?

mysql_5.6> SELECT NOW(),SYSDATE(),CURRENT_TIMESTAMP;
+---------------------+---------------------+---------------------+
| NOW()               | SYSDATE()           | CURRENT_TIMESTAMP   |
+---------------------+---------------------+---------------------+
| 2013-10-26 15:47:21 | 2013-10-26 15:47:21 | 2013-10-26 15:47:21 |
+---------------------+---------------------+---------------------+
1 row in set (0.00 sec)

For the microsecond timestamps we have to specify the fsp or fractional seconds precision, which is an integer between 0 and 6.

mysql_5.6> SELECT NOW(6),SYSDATE(6),CURRENT_TIMESTAMP;
+----------------------------+----------------------------+---------------------+
| NOW(6)                     | SYSDATE(6)                 | CURRENT_TIMESTAMP   |
+----------------------------+----------------------------+---------------------+
| 2013-10-26 15:50:12.378787 | 2013-10-26 15:50:12.378892 | 2013-10-26 15:50:12 |
+----------------------------+----------------------------+---------------------+
1 row in set (0.00 sec)

Please note that you can't specify a fsp for CURRENT_TIMESTAMP.
So how do other databases behave?

PostgreSQL:

dveeden=# SELECT NOW(),CURRENT_TIMESTAMP;
              now              |              now              
-------------------------------+-------------------------------
 2013-10-26 15:55:11.548362+02 | 2013-10-26 15:55:11.548362+02
(1 row)

There is no SYSDATE() function in PostgreSQL (tested with 9.1). And you may not specify a fsp. And you get microseconds by default.

SQLite:

sqlite> select current_timestamp;
2013-10-26 13:57:57
sqlite> select strftime("%Y-%m-%d %H:%M:%f", "now"); 
2013-10-26 13:59:42.408

Version 3.7 doesn't have sysdate() or now(), only current_timestamp and no microseconds by default.

So it seems to be hard to write version and implementation tolerant SQL code. I couldn't easily find any information about what the SQL standards dictate.

There is one trick which could help in some situation:

mysql_5.5> SELECT NOW(/*!50604 6*/);
+---------------------+
| NOW()               |
+---------------------+
| 2013-10-26 16:04:04 |
+---------------------+
1 row in set (0.00 sec)

mysql_5.6> SELECT NOW(/*!50604 6*/);
+----------------------------+
| NOW( 6 )                   |
+----------------------------+
| 2013-10-26 16:03:37.136133 |
+----------------------------+
1 row in set (0.01 sec)

Another thrick you might think of is changing the date_time_format and time_format.

mysql_5.6> show global variables like '%time_format';
+-----------------+-------------------+
| Variable_name   | Value             |
+-----------------+-------------------+
| datetime_format | %Y-%m-%d %H:%i:%s |
| time_format     | %H:%i:%s          |
+-----------------+-------------------+
2 rows in set (0.00 sec)

But that won't work as the documentation points out:
"This variable is unused. It is deprecated as of MySQL 5.6.7 and will be removed in a future MySQL release."

Persistent statistics and partitions

Today when I was studying for the MySQL 5.6 exams.

I was studying for these two items:

Create and utilize table partitioning

Obtain MySQL metadata from INFORMATION_SCHEMA tables

The first step is to create a table, partition it with a hash.

mysql> CREATE TABLE pfoo (id INT AUTO_INCREMENT PRIMARY KEY, name VARCHAR(255))
    -> PARTITION BY HASH(id) PARTITIONS 4;
Query OK, 0 rows affected (0.04 sec)

mysql> INSERT INTO pfoo(name) VALUES('test01'),('test02'),('test03'),('test04'),
    -> ('test05'),('test06'),('test07'),('test08'),('test09'),('test10'),('test11');
Query OK, 11 rows affected (0.00 sec)
Records: 11  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM pfoo;
+----+--------+
| id | name   |
+----+--------+
|  4 | test04 |
|  8 | test08 |
|  1 | test01 |
|  5 | test05 |
|  9 | test09 |
|  2 | test02 |
|  6 | test06 |
| 10 | test10 |
|  3 | test03 |
|  7 | test07 |
| 11 | test11 |
+----+--------+
11 rows in set (0.00 sec)

mysql> SELECT PARTITION_NAME, TABLE_ROWS FROM information_schema.partitions 
    -> WHERE TABLE_NAME='pfoo';
+----------------+------------+
| PARTITION_NAME | TABLE_ROWS |
+----------------+------------+
| p0             |          2 |
| p1             |          3 |
| p2             |          3 |
| p3             |          3 |
+----------------+------------+
4 rows in set (0.01 sec)

The sequence in the id column looks random, but it isn't. It's in partition order.

mysql> SELECT id,name,MOD(id,4) FROM pfoo;
+----+--------+-----------+
| id | name   | MOD(id,4) |
+----+--------+-----------+
|  4 | test04 |         0 |
|  8 | test08 |         0 |
|  1 | test01 |         1 |
|  5 | test05 |         1 |
|  9 | test09 |         1 |
|  2 | test02 |         2 |
|  6 | test06 |         2 |
| 10 | test10 |         2 |
|  3 | test03 |         3 |
|  7 | test07 |         3 |
| 11 | test11 |         3 |
+----+--------+-----------+
11 rows in set (0.00 sec)

So nothing new or unexpected here.
So now we're going to change the number of partitions.

mysql> ALTER TABLE pfoo PARTITION BY HASH(id) PARTITIONS 6;
Query OK, 11 rows affected (0.32 sec)
Records: 11  Duplicates: 0  Warnings: 0

mysql> SELECT PARTITION_NAME, TABLE_ROWS FROM information_schema.partitions 
    -> WHERE TABLE_NAME='pfoo';
+----------------+------------+
| PARTITION_NAME | TABLE_ROWS |
+----------------+------------+
| p0             |          0 |
| p1             |          0 |
| p2             |          2 |
| p3             |          0 |
| p4             |          0 |
| p5             |          0 |
+----------------+------------+
6 rows in set (0.01 sec)

mysql> SELECT COUNT(*) FROM pfoo;
+----------+
| COUNT(*) |
+----------+
|       11 |
+----------+
1 row in set (0.00 sec)

So we've changed the number of partitions from 4 to 6. There are still 11 rows, but the information in information_schema.partitions seems to be wrong.
This is because innodb_stats_persistent is enabled.

mysql> SELECT * FROM mysql.innodb_table_stats WHERE table_name LIKE 'pfoo#%';
+---------------+------------+---------------------+--------+----------------------+--------------------------+
| database_name | table_name | last_update         | n_rows | clustered_index_size | sum_of_other_index_sizes |
+---------------+------------+---------------------+--------+----------------------+--------------------------+
| cert56        | pfoo#P#p0  | 2013-10-26 12:12:28 |      0 |                    1 |                        0 |
| cert56        | pfoo#P#p1  | 2013-10-26 12:12:28 |      0 |                    1 |                        0 |
| cert56        | pfoo#P#p2  | 2013-10-26 12:12:28 |      2 |                    1 |                        0 |
| cert56        | pfoo#P#p3  | 2013-10-26 12:12:28 |      0 |                    1 |                        0 |
| cert56        | pfoo#P#p4  | 2013-10-26 12:12:28 |      0 |                    1 |                        0 |
| cert56        | pfoo#P#p5  | 2013-10-26 12:12:28 |      0 |                    1 |                        0 |
+---------------+------------+---------------------+--------+----------------------+--------------------------+
6 rows in set (0.00 sec)

mysql> SELECT * FROM information_schema.global_variables 
    -> WHERE variable_name='innodb_stats_persistent';
+-------------------------+----------------+
| VARIABLE_NAME           | VARIABLE_VALUE |
+-------------------------+----------------+
| INNODB_STATS_PERSISTENT | ON             |
+-------------------------+----------------+
1 row in set (0.01 sec)

The statistics can be updated by running 'ANALYZE TABLE pfoo'. Or by just waiting a few seconds as InnoDB will update the statistics automatically if innodb_stats_auto_recalc is enabled, but this is not instantaneous.