Sunday, January 1, 2017

The mysql client, and some improvements

The mysql client is a tool which I use every day as a DBA. I think it's a great tool. When I used a client of several other SQL and NoSQL databases I was quickly reminded of all the features of the mysql client. Note that psql (PostgreSQL client) is also very nice.

Some other interesting things about the mysql client: It is build from the same mysql-server repository as MySQL Server. The source is in client/mysql.cc. In addition to the server version it also reports 14.14 as its version. The previous version (14.13) was around the time of MySQL 5.1, so this version is mostly meaningless.
If you start it it identifies itself as "MySQL monitor", not to be confused with MySQL Enterprise Monitor.
The version of the client is not tightly coupled with the server, in most situations a 5.6 client works fine with a 5.7 server and vice versa. Note that there might be some minor annoyances if you use an older client with a newer server. For example: the 5.6 client doesn't know about the new hint syntax, and considers the hint to be just a comment. And comments are stripped by default, which results in the situation that the hint is not sent to the server.

But there are some situations where the MySQL client has some limitations.

The first one is that the 'pager' option doesn't work on Windows. The pager command is very useful (e.g. less, grep, etc). And cmd.exe isn't the best terminal emulator ever.. using a third party terminal emulator or PowerShell fixes that somewhat. And with PowerShell there are some other issues you might run into: MySQL uses UTF-8, and PowerShell uses UTF-16. While both can do charset conversions, this often makes things more difficult (for example: Bug #74817).

And if you're working with spatial data, images or stored procedures then the mysql client is often not very helpful. The graphical client, MySQL Workbench, is often much better suited in these cases. It has syntax highlighting, a spatial viewer and an image viewer. It allows you to edit a SQL script and then execute it and edit it again and run it again. I you try to do this with the history of the mysql client then the formatting gets lost. For working with SQL procedures, triggers, events, etc the solution is to edit it with your favourite editor and then source it. But for images and spatial data you often really have to use Workbench or something like QGIS.

Besides the CLI vs GUI difference there are some more differences in how most people use both tools. For Workbench it is installed on a the client workstation and then uses a remote connection to the server. Workbench supports both the native SSL/TLS protocol and can tunnel through SSH.
The mysql client supports SSL/TLS, but doesn't support SSH tunnelling. Which is ok, because you can just run it on the server.
This also has implications on configuration: The mysql client only needs to know how to connect to the local server. Workbench needs configuration for every server. This makes the mysql client more useful if you are managing a large set of machines.

One of the more annoying situations with the mysql client is that you quickly want to select a row from a table or run that select query which was reported as being slow. So you ssh to the server and run the query... and then you suddenly get a lot of 'weird' characters on you screen. This happens if you have binary columns (BLOB, varbinary, geometry) to store IP addresses, locations, binary UUID's, photos, etc.
I made a patch to fix that. With the patch binary data is printed with hex literals (e.g. 0x08080404 for the binary version of 8.8.4.4). So this doesn't break your terminal anymore and also allows you to copy the value to the subsequent query.

mysql> select * from t1;
+----+------------------------------------+
| id | ip                                 |
+----+------------------------------------+
|  1 | 0x00000000000000000000000000000001 |
|  2 | 0x7F000001                         |
|  3 | 0x08080808                         |
|  4 | 0x08080404                         |
+----+------------------------------------+
4 rows in set (0.00 sec)

mysql> show create table t1\G
*************************** 1. row ***************************
       Table: t1
Create Table: CREATE TABLE `t1` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `ip` varbinary(16) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

This might raise the question: why not display them as an IP address instead? I did make a patch to do that. The patch triggers this display if the column is varbinary with a length which matches an IPv4 or IPv6 address. But we might store IP addresses in columns with other names and we might store values which are not an IP, but have the same length. This would require a lot of configuration and configuration options. And this would need more work for geometry types, binary UUID's etc. So for now I decided not to take that route.
It would be nice if the server would allow you to define an 'ip6' datatype which is just an alias for varbinary(16), but would be sent to the client. This could also be done with something like "SELECT c1::ip6" in the query. Or the server really has to define UUID, and IP types. Or user defined types. Or both.

mysql> select id,hex(ip),ip from t1\G
*************************** 1. row ***************************
     id: 1
hex(ip): 00000000000000000000000000000001
     ip: INET6_ATON('::1')
*************************** 2. row ***************************
     id: 2
hex(ip): 7F000001
     ip: INET6_ATON('127.0.0.1')
2 rows in set (0.00 sec)

Also somewhat belonging in this list: I made a patch in 2015 which replaces the drawing characters (+ for corners, - for horizontal lines, | for vertical lines) with unicode drawing characters.

mysql> DESC mysql.func;
╭───────┬──────────────────────────────┬──────┬─────┬─────────┬───────╮
│ Field │ Type                         │ Null │ Key │ Default │ Extra │
├───────┼──────────────────────────────┼──────┼─────┼─────────┼───────┤
│ name  │ char(64)                     │ NO   │ PRI │         │       │
│ ret   │ tinyint(1)                   │ NO   │     │ 0       │       │
│ dl    │ char(128)                    │ NO   │     │         │       │
│ type  │ enum('function','aggregate') │ NO   │     │ NULL    │       │
╰───────┴──────────────────────────────┴──────┴─────┴─────────┴───────╯
4 rows in set (0.00 sec)

I also made a patch to report the runtime with more detail (e.g 0.004 instead of 0.00).

mysql> select sleep(0.123);
+--------------+
| sleep(0.123) |
+--------------+
|            0 |
+--------------+
1 row in set (0.123 sec)

I also once made a patch to set the terminal title.

And what about the future? I don't know, the mysql client might be replaced with MySQL Shell (mysqlsh), but for that to happen mysqlsh needs many improvements. MySQL Workbench could replace some of it if it gets the capability to easily connect to many similar servers without much configuration. But should it? iTerm2 (macOS) now allows you to display images in the terminal, so if more terminal emulators would get this feature then it might make sense to get a image and geometry viewer in the client..

Please leave a comment with your experience with the mysql client and which features you would like to see.

Sunday, September 25, 2016

Common Table Expressions in MySQL

In a recent labs release a new feature was introduced by Oracle, or actually two very related new features were introduced. The first new feature is Common Table Expressions (CTEs), which is also known as WITH. The second feature is recursive CTEs, also known as WITH RECURSIVE.

An example of WITH:

WITH non_root_users AS (SELECT User, Host FROM mysql.user WHERE User<>'root')
SELECT Host FROM non_root_users WHERE User = ?

The non-CTE equivalent is this:

SELECT Host FROM 
    (SELECT User, Host FROM mysql.user WHERE User<>'root') non_root_users
WHERE User = ?

This makes it easier to understand the query, especially if there are many subqueries.

Besides using regular subqueries or CTEs you could also put the subquery in a view, but this requires more privileges. It is also difficult to change the views later on as other quieries might have started to use them.

But views are still very useful. You can make it easier for others to query data or you can use views to restrict access to certain rows.

So CTEs are basically views which are bound to a query. This makes it easier to write complex queries in a way that they are easy to understand. So don't expect CTEs to replace views.

In the PostgreSQL world CTEs existed since version 8.4 (2009) and it is used a lot.

There are some cool things PostgreSQL allows you to do with CTEs and MySQL doesn't:

test=# create table t1 (id serial, name varchar(100));
CREATE TABLE
test=# insert into t1(name) values ('foo'),('bar');
INSERT 0 2
test=# with deleted_names as (delete from t1 where id = 2 returning name)
test-# select name from deleted_names;
 name 
------
 bar
(1 row)

The blog post has more details and examples about recursive CTEs, the second new feature.

One of the examples is generating a range of numbers.

If you're familiar with PostgreSQL that will remind you of the generate_series function. This function can be used to generate a series of intergers or timestamps. So I tried to make a stored procedure which together with the recursive CTE support would emulate generate_series in MySQL, but no such luck as you can't return a table from a stored fuction yet.

In the PostgreSQL world CTEs are also used to trick the optimizer but note that this depends on the specific CTE implementation, so don't assume this trick will work in MySQL.

MariaDB has some support for the RETURNING keyword and in MariaDB 10.2 (not yet released) there is CTE support. Support for recursive CTEs is not yet present, see MDEV-9864 for the progress.

If you want to see the progress of MySQL and MariaDB on other modern SQL features check out this page.

Thursday, September 15, 2016

About Oracle MySQL and CVE-2016-6662

The issue

On 12 September 2016 (three days ago) a MySQL security vulnerability was announced. The CVE id is CVE-2016-6662.

There are 3 claims:
  1. By setting malloc-lib in the configuration file access to an OS root shell can be gained.
  2. By using the general log a configuration file can be written in any place which is writable for the OS mysql user.
  3. By using SELECT...INTO DUMPFILE... it is possible to elevate privileges from a database user with the FILE privilege to any database account including root.

How it is supposed to be used

  1. Find an SQL Injection in a website or otherwise gain access to a MySQL account.
  2. Now create a trigger file (requires FILE privilege)
  3. Now in the trigger or otherwise use SET GLOBAL general_log_file etc to create a my.cnf in the datadir with the correct privileges. Directly using SELECT...INTO DUMPFILE...won't work as that would result in the wrong permissions, which would cause mysqld/mysqld_safe to ignore that file.
  4. Now wait someone/something to restart MySQL (upgrade, daily cold backup, etc) and a shell will be available on a port number and IP address chosen by the attacker.

How it is fixed

The document claims "Official patches for the vulnerability are not available at this time for Oracle MySQL server. ", but that isn't true.

From the 5.7.15 release notes:
  • mysqld_safe attempted to read my.cnf in the data directory, although that is no longer a standard option file location. (Bug #24482156)
  • For mysqld_safe, the argument to --malloc-lib now must be one of the directories /usr/lib, /usr/lib64, /usr/lib/i386-linux-gnu, or /usr/lib/x86_64-linux-gnu. In addition, the --mysqld and --mysqld-version options can be used only on the command line and not in an option file. (Bug #24464380)
  • It was possible to write log files ending with .ini or .cnf that later could be parsed as option files. The general query log and slow query log can no longer be written to a file ending with .ini or .cnf. (Bug #24388753)
The last two items are also in the 5.6.33 release notes.

So 2 out of the 3 vulnerabilities are patched. So the obvious advice is to upgrade.

Further steps to take

But there are more things you can do to further secure your setup.

Check if your my.cnf file(s) are writable for the mysql user

/etc/my.cnf, /etc/mysql/my.cnf /etc/mysql/my.cnf.d/* should NOT be writable for the mysql user. Make sure these are owned by root and mode 644.

Put an empty my.cnf in your datadir and make sure it has the above mentioned privileges. The vulnerability document also mentions a .my.cnf in the datadir, so also make that an empty file.

Review accounts with the FILE privilege:

Run this query and drop accounts or revoke the file privilege from them if they don't really need it.
SELECT GRANTEE FROM INFORMATION_SCHEMA.USER_PRIVILEGES WHERE PRIVILEGE_TYPE='FILE';

Isolate services

Don't run all services on one machine. Isolate services from each other. So put the webserver and database server on separate (virtual) machines or containers.

Use a firewall. If the database suddenly starts to listen on a weird port the attacker should not be able to connect to it. This can be a host based firewall like iptables and a network device. Yes an IDS might be able to detect the network shell, but running an IDS/IPS needs serious amount of time and doesn't give any guarantees.

Prepare for the next vulnerability

This is not only for MySQL, but also for other parts of your stack (OS, webserver, etc).

Make sure the configuration is secured properly for each service. A helpful resource here are the benchmark documents from the Center for Internet Security.

Wednesday, September 14, 2016

Visualizing the MySQL Bug Tide

On the MySQL Bugs website there are some tide stats available. These show rate of bug creation.

I've put them in a graph:
I made these with this IPython Notebook. There are more detailed graphs per version in the notebook.

Update: The version in the notebook now uses the same range for the Y axis and has a marker for the GA dates of each release.

Saturday, March 12, 2016

Re: JSON document fast lookup with MySQL 5.7

This is a response to the JSON document fast lookup with MySQL 5.7 article by Frederic Descamp.

It is very easy to also use MySQL Workbench and the new GeoJSON support to actually show the features.

My query:
SELECT ST_GeomFromGeoJSON(feature->"$.geometry",2) AS feature 
FROM test_features WHERE street='BEACH' ;

The result:
 

Tuesday, January 26, 2016

When simple SQL can be complex

I think SQL is a very simple language, but ofcourse I'm biased.

But even a simple statement might have more complexity to it than you might think.

Do you know what the result is of this statement?
SELECT FALSE = FALSE = TRUE;
scroll down for the answer.



























The answer is: it depends.

You might expect it to return false because the 3 items in the comparison are not equal. But that's not the case.

In PostgreSQL this is the result:
postgres=# SELECT FALSE = FALSE = TRUE;
 ?column? 
----------
 t
(1 row)
So it compares FALSE against FALSE, which results in TRUE and then That is compared against TRUE, which results in TRUE. PostgreSQL has proper boolean literals.

Next up is MySQL:
mysql> SELECT FALSE = FALSE = TRUE;
+----------------------+
| FALSE = FALSE = TRUE |
+----------------------+
|                    1 |
+----------------------+
1 row in set (0.00 sec)
This is similar but it's slightly different. The result is 1 because in MySQL TRUE and FALSE evalueate to 0 and 1. If you use BOOLEAN in your DDL this will be changed to tinyint(1). But note that the (1) is only the display width and doesn't change the storage space (tinyint is 1 byte).

And SQLite has yet another result:
sqlite> SELECT FALSE = FALSE = TRUE;
Error: no such column: FALSE
This is because SQLite doesn't have a boolean type and you're expected to use 0 and 1.
If we use the suggested solution we get the same result as with MySQL.
sqlite> SELECT 0 = 0 = 1;
1

What about the SQL standard?

There is a boolean literal in the SQL:1999 standard according to this Wikipedia article. Note that 1999 is 17 years ago. It is an optional feature so it isn't required. Note that a boolean can have 3 values according to the standard. It can be TRUE, FALSE or UNKNOWN. It suggests that the UNKNOWN literal may evaluate to NULL. Neither MySQL, PostgreSQL or SQLite implements the UNKNOWN literal.

What about commercial databases?

DB2, Oracle and SQL Server don't have a boolean type according to this webpage. For DB2 this has changed, according to this page from IBM BOOLEAN support was added in DB2 9.7.0. It supports TRUE, FALSE and NULL, but not UNKNOWN for what I can see.
Ingres 10.0 has full standards complient support for BOOLEAN according to their wiki.

Interestingly enough there are multiple suggestions about what to use when there is no boolean type: BIT, CHAR(1), NUMBER(1). This blogpost from Peter Zaitsev also lists another option: CHAR(0).

So even something simple as a boolean might be less portable than you might have thought it was.

But what about doing a real three-way compare in SQL?

One solution would be to use the & operator:
postgres=# SELECT FALSE::int & FALSE::int & TRUE::int;
 ?column? 
----------
        0
(1 row)
 
mysql [(none)] > SELECT FALSE & FALSE & TRUE;
+----------------------+
| FALSE & FALSE & TRUE |
+----------------------+
|                    0 |
+----------------------+
1 row in set (0.00 sec)
 
sqlite> SELECT 0 & 0 & 1;
0

Saturday, January 9, 2016

Using Connector/J with Python

With Python you would normally use MySQL Connector/Python or the older MySQLdb to connect from Python to MySQL, but there are more options.

There are also multiple Python implementations: CPython (the main implementation), PyPy, Jython and IronPython. PyPy tries to be faster than CPython by using a Just-in-Time compiler. Jython runs on the JVM and IronPython runs on the .NET CLR.

Connector/Python by default (Without the C Extension) is a pure Python implementation and can work with most if not all implementations. And for MySQLdb there is a drop-in replacement called PyMySQL, which is a pure python implementation.

So there are many options already. But for at least Jython it is also possible to use a Java (JDBC) driver.

But why would you use a different Python implementation? There are multiple reasons for that:

  • Speed. PyPy can be faster and Jython has no Global Interpreter Lock (GIL) , which can allow for more concurrent execution.
  • To access 'native' code. e.g. call Java code from Jython or C# from IronPython.
  • Use existing infrastructure. You can deploy a Jython application on Tomcat.
  • Create testcases, healthchecks etc. which uses the same settings and infrastucture as your Java application with the benefits of a scripting language.

I wanted to test how Connector/J behaves with regards to TLS (the successor of SSL).

Setup

The first step is to get Jython, and Connector/J on your system. On Fedora 23 this is easily done with a dnf install jython mysql-connector-java.

Then I used MySQL Sandbox to setup a MySQL 5.7.10 sandbox. To enable TLS I did a ./my sql_ssl_rsa_setup, which is the Sandbox version of mysql_ssl_rsa_setup. If you have a pre-5.7 version then you can use mysslgen instead.

To convert the CA certificate from the PEM format to the Java Key Store (JKS) format I used keytool.

$ keytool -importcert -trustcacerts -file ca.pem -keystore /tmp/mysql57keystore.jks
Enter keystore password:  
Re-enter new password: 
Owner: CN=MySQL_Server_5.7.10_Auto_Generated_CA_Certificate
Issuer: CN=MySQL_Server_5.7.10_Auto_Generated_CA_Certificate
Serial number: 1
Valid from: Fri Jan 08 16:23:16 CET 2016 until: Mon Jan 05 16:23:16 CET 2026
Certificate fingerprints:
     MD5:  B5:B5:2B:53:5C:91:A2:6A:64:B5:C9:12:85:A0:CE:CC
     SHA1: 85:F1:AB:14:15:33:65:A8:71:4D:00:A6:C6:FC:8F:7F:BE:95:BA:B0
     SHA256: CB:B9:D5:BC:26:76:37:3A:66:67:99:95:5B:3B:8E:95:84:6C:A4:5F:52:39:EF:2A:23:36:6E:AB:B0:3E:81:E0
     Signature algorithm name: SHA256withRSA
     Version: 1
Trust this certificate? [no]:  yes
Certificate was added to keystore

Then I had to set my CLASSPATH in order for Jython to find Connector/J.

$ export CLASSPATH=/usr/share/java/mysql-connector-java.jar

Running the test

I used this to test the database connetion:

#!/usr/bin/jython
from __future__ import with_statement
from com.ziclix.python.sql import zxJDBC
from java.lang import System

System.setProperty("javax.net.ssl.trustStore","/tmp/mysql57keystore.jks");
System.setProperty("javax.net.ssl.trustStorePassword","msandbox");

jdbc_url = 'jdbc:mysql://127.0.0.1:18785/test?useSSL=true'
with zxJDBC.connect(jdbc_url, 'msandbox', 'msandbox', 'com.mysql.jdbc.Driver') as c:
    with c.cursor() as cur:
        cur.execute('SHOW SESSION STATUS LIKE \'Ssl_%\'')
        for result in cur:
            print('%-40s: %s' % result)
    raw_input('Press any key to continue...')

This resulted in a working connection. From the database side it looks like this:

mysql> SELECT ATTR_NAME, ATTR_VALUE FROM
    -> performance_schema.session_connect_attrs WHERE PROCESSLIST_ID=40;
+------------------+----------------------+
| ATTR_NAME        | ATTR_VALUE           |
+------------------+----------------------+
| _runtime_version | 1.8.0_65             |
| _client_version  | 5.1.36-SNAPSHOT      |
| _client_name     | MySQL Connector Java |
| _client_license  | GPL                  |
| _runtime_vendor  | Oracle Corporation   |
+------------------+----------------------+
5 rows in set (0.00 sec)

mysql> SELECT * FROM performance_schema.status_by_thread WHERE 
    -> THREAD_ID=(SELECT THREAD_ID FROM performance_schema.threads
    -> WHERE PROCESSLIST_ID=40) and VARIABLE_NAME LIKE 'Ssl_version';
+-----------+---------------+----------------+
| THREAD_ID | VARIABLE_NAME | VARIABLE_VALUE |
+-----------+---------------+----------------+
|        65 | Ssl_version   | TLSv1          |
+-----------+---------------+----------------+
1 row in set (0.00 sec)

I wanted to see if upgrading Connector/J would change anythong. So I downloaded the latest release and change my CLASSPATH to only include that.

mysql> SELECT ATTR_NAME, ATTR_VALUE FROM
    -> performance_schema.session_connect_attrs WHERE PROCESSLIST_ID=45;
+------------------+----------------------+
| ATTR_NAME        | ATTR_VALUE           |
+------------------+----------------------+
| _runtime_version | 1.8.0_65             |
| _client_version  | 5.1.38               |
| _client_name     | MySQL Connector Java |
| _client_license  | GPL                  |
| _runtime_vendor  | Oracle Corporation   |
+------------------+----------------------+
5 rows in set (0.00 sec)

mysql> SELECT * FROM performance_schema.status_by_thread WHERE
    -> THREAD_ID=(SELECT THREAD_ID FROM performance_schema.threads
    -> WHERE PROCESSLIST_ID=45) and VARIABLE_NAME LIKE 'Ssl_version';
+-----------+---------------+----------------+
| THREAD_ID | VARIABLE_NAME | VARIABLE_VALUE |
+-----------+---------------+----------------+
|        70 | Ssl_version   | TLSv1.1        |
+-----------+---------------+----------------+
1 row in set (0.00 sec)

And it did. Connector/J 5.1.38 uses TLSv1.1 instead of TLSv1.0