Sunday, September 25, 2016

Common Table Expressions in MySQL

In a recent labs release a new feature was introduced by Oracle, or actually two very related new features were introduced. The first new feature is Common Table Expressions (CTEs), which is also known as WITH. The second feature is recursive CTEs, also known as WITH RECURSIVE.

An example of WITH:

WITH non_root_users AS (SELECT User, Host FROM mysql.user WHERE User<>'root')
SELECT Host FROM non_root_users WHERE User = ?

The non-CTE equivalent is this:

SELECT Host FROM 
    (SELECT User, Host FROM mysql.user WHERE User<>'root') non_root_users
WHERE User = ?

This makes it easier to understand the query, especially if there are many subqueries.

Besides using regular subqueries or CTEs you could also put the subquery in a view, but this requires more privileges. It is also difficult to change the views later on as other quieries might have started to use them.

But views are still very useful. You can make it easier for others to query data or you can use views to restrict access to certain rows.

So CTEs are basically views which are bound to a query. This makes it easier to write complex queries in a way that they are easy to understand. So don't expect CTEs to replace views.

In the PostgreSQL world CTEs existed since version 8.4 (2009) and it is used a lot.

There are some cool things PostgreSQL allows you to do with CTEs and MySQL doesn't:

test=# create table t1 (id serial, name varchar(100));
CREATE TABLE
test=# insert into t1(name) values ('foo'),('bar');
INSERT 0 2
test=# with deleted_names as (delete from t1 where id = 2 returning name)
test-# select name from deleted_names;
 name 
------
 bar
(1 row)

The blog post has more details and examples about recursive CTEs, the second new feature.

One of the examples is generating a range of numbers.

If you're familiar with PostgreSQL that will remind you of the generate_series function. This function can be used to generate a series of intergers or timestamps. So I tried to make a stored procedure which together with the recursive CTE support would emulate generate_series in MySQL, but no such luck as you can't return a table from a stored fuction yet.

In the PostgreSQL world CTEs are also used to trick the optimizer but note that this depends on the specific CTE implementation, so don't assume this trick will work in MySQL.

MariaDB has some support for the RETURNING keyword and in MariaDB 10.2 (not yet released) there is CTE support. Support for recursive CTEs is not yet present, see MDEV-9864 for the progress.

If you want to see the progress of MySQL and MariaDB on other modern SQL features check out this page.

Thursday, September 15, 2016

About Oracle MySQL and CVE-2016-6662

The issue

On 12 September 2016 (three days ago) a MySQL security vulnerability was announced. The CVE id is CVE-2016-6662.

There are 3 claims:

By setting malloc-lib in the configuration file access to an OS root shell can be gained.
By using the general log a configuration file can be written in any place which is writable for the OS mysql user.
By using SELECT...INTO DUMPFILE... it is possible to elevate privileges from a database user with the FILE privilege to any database account including root.

How it is supposed to be used

Find an SQL Injection in a website or otherwise gain access to a MySQL account.
Now create a trigger file (requires FILE privilege)
Now in the trigger or otherwise use SET GLOBAL general_log_file etc to create a my.cnf in the datadir with the correct privileges. Directly using SELECT...INTO DUMPFILE...won't work as that would result in the wrong permissions, which would cause mysqld/mysqld_safe to ignore that file.
Now wait someone/something to restart MySQL (upgrade, daily cold backup, etc) and a shell will be available on a port number and IP address chosen by the attacker.

How it is fixed

The document claims "Official patches for the vulnerability are not available at this time for Oracle MySQL server. ", but that isn't true.

From the 5.7.15 release notes:

mysqld_safe attempted to read my.cnf in the data directory, although that is no longer a standard option file location. (Bug #24482156)
For mysqld_safe, the argument to --malloc-lib now must be one of the directories /usr/lib, /usr/lib64, /usr/lib/i386-linux-gnu, or /usr/lib/x86_64-linux-gnu. In addition, the --mysqld and --mysqld-version options can be used only on the command line and not in an option file. (Bug #24464380)
It was possible to write log files ending with .ini or .cnf that later could be parsed as option files. The general query log and slow query log can no longer be written to a file ending with .ini or .cnf. (Bug #24388753)

The last two items are also in the 5.6.33 release notes.

So 2 out of the 3 vulnerabilities are patched. So the obvious advice is to upgrade.

Further steps to take

But there are more things you can do to further secure your setup.

Check if your my.cnf file(s) are writable for the mysql user

/etc/my.cnf, /etc/mysql/my.cnf /etc/mysql/my.cnf.d/* should NOT be writable for the mysql user. Make sure these are owned by root and mode 644.

Put an empty my.cnf in your datadir and make sure it has the above mentioned privileges. The vulnerability document also mentions a .my.cnf in the datadir, so also make that an empty file.

Review accounts with the FILE privilege:

Run this query and drop accounts or revoke the file privilege from them if they don't really need it.
SELECT GRANTEE FROM INFORMATION_SCHEMA.USER_PRIVILEGES WHERE PRIVILEGE_TYPE='FILE';

Isolate services

Don't run all services on one machine. Isolate services from each other. So put the webserver and database server on separate (virtual) machines or containers.

Use a firewall. If the database suddenly starts to listen on a weird port the attacker should not be able to connect to it. This can be a host based firewall like iptables and a network device. Yes an IDS might be able to detect the network shell, but running an IDS/IPS needs serious amount of time and doesn't give any guarantees.

Prepare for the next vulnerability

This is not only for MySQL, but also for other parts of your stack (OS, webserver, etc).

Make sure the configuration is secured properly for each service. A helpful resource here are the benchmark documents from the Center for Internet Security.

Wednesday, September 14, 2016

Visualizing the MySQL Bug Tide

On the MySQL Bugs website there are some tide stats available. These show rate of bug creation.

I've put them in a graph:

I made these with this IPython Notebook. There are more detailed graphs per version in the notebook.

Update: The version in the notebook now uses the same range for the Y axis and has a marker for the GA dates of each release.

Saturday, March 12, 2016

Re: JSON document fast lookup with MySQL 5.7

This is a response to the JSON document fast lookup with MySQL 5.7 article by Frederic Descamp.

It is very easy to also use MySQL Workbench and the new GeoJSON support to actually show the features.

My query:
SELECT ST_GeomFromGeoJSON(feature->"$.geometry",2) AS feature
FROM test_features WHERE street='BEACH' ;

The result:

Tuesday, January 26, 2016

When simple SQL can be complex

I think SQL is a very simple language, but ofcourse I'm biased.

But even a simple statement might have more complexity to it than you might think.

Do you know what the result is of this statement?

SELECT FALSE = FALSE = TRUE;

scroll down for the answer.

The answer is: it depends.

You might expect it to return false because the 3 items in the comparison are not equal. But that's not the case.

In PostgreSQL this is the result:

postgres=# SELECT FALSE = FALSE = TRUE;
 ?column? 
----------
 t
(1 row)

So it compares FALSE against FALSE, which results in TRUE and then That is compared against TRUE, which results in TRUE. PostgreSQL has proper boolean literals.

Next up is MySQL:

mysql> SELECT FALSE = FALSE = TRUE;
+----------------------+
| FALSE = FALSE = TRUE |
+----------------------+
|                    1 |
+----------------------+
1 row in set (0.00 sec)

This is similar but it's slightly different. The result is 1 because in MySQL TRUE and FALSE evalueate to 0 and 1. If you use BOOLEAN in your DDL this will be changed to tinyint(1). But note that the (1) is only the display width and doesn't change the storage space (tinyint is 1 byte).

And SQLite has yet another result:

sqlite> SELECT FALSE = FALSE = TRUE;
Error: no such column: FALSE

This is because SQLite doesn't have a boolean type and you're expected to use 0 and 1.
If we use the suggested solution we get the same result as with MySQL.

sqlite> SELECT 0 = 0 = 1;
1

What about the SQL standard?

There is a boolean literal in the SQL:1999 standard according to this Wikipedia article. Note that 1999 is 17 years ago. It is an optional feature so it isn't required. Note that a boolean can have 3 values according to the standard. It can be TRUE, FALSE or UNKNOWN. It suggests that the UNKNOWN literal may evaluate to NULL. Neither MySQL, PostgreSQL or SQLite implements the UNKNOWN literal.

What about commercial databases?

DB2, Oracle and SQL Server don't have a boolean type according to this webpage. For DB2 this has changed, according to this page from IBM BOOLEAN support was added in DB2 9.7.0. It supports TRUE, FALSE and NULL, but not UNKNOWN for what I can see.
Ingres 10.0 has full standards complient support for BOOLEAN according to their wiki.

Interestingly enough there are multiple suggestions about what to use when there is no boolean type: BIT, CHAR(1), NUMBER(1). This blogpost from Peter Zaitsev also lists another option: CHAR(0).

So even something simple as a boolean might be less portable than you might have thought it was.

But what about doing a real three-way compare in SQL?

One solution would be to use the & operator:

postgres=# SELECT FALSE::int & FALSE::int & TRUE::int;
 ?column? 
----------
        0
(1 row)

mysql [(none)] > SELECT FALSE & FALSE & TRUE;
+----------------------+
| FALSE & FALSE & TRUE |
+----------------------+
|                    0 |
+----------------------+
1 row in set (0.00 sec)

sqlite> SELECT 0 & 0 & 1;
0

Saturday, January 9, 2016

Using Connector/J with Python

With Python you would normally use MySQL Connector/Python or the older MySQLdb to connect from Python to MySQL, but there are more options.

There are also multiple Python implementations: CPython (the main implementation), PyPy, Jython and IronPython. PyPy tries to be faster than CPython by using a Just-in-Time compiler. Jython runs on the JVM and IronPython runs on the .NET CLR.

Connector/Python by default (Without the C Extension) is a pure Python implementation and can work with most if not all implementations. And for MySQLdb there is a drop-in replacement called PyMySQL, which is a pure python implementation.

So there are many options already. But for at least Jython it is also possible to use a Java (JDBC) driver.

But why would you use a different Python implementation? There are multiple reasons for that:

Speed. PyPy can be faster and Jython has no Global Interpreter Lock (GIL) , which can allow for more concurrent execution.
To access 'native' code. e.g. call Java code from Jython or C# from IronPython.
Use existing infrastructure. You can deploy a Jython application on Tomcat.
Create testcases, healthchecks etc. which uses the same settings and infrastucture as your Java application with the benefits of a scripting language.

I wanted to test how Connector/J behaves with regards to TLS (the successor of SSL).

Setup

The first step is to get Jython, and Connector/J on your system. On Fedora 23 this is easily done with a dnf install jython mysql-connector-java.

Then I used MySQL Sandbox to setup a MySQL 5.7.10 sandbox. To enable TLS I did a ./my sql_ssl_rsa_setup, which is the Sandbox version of mysql_ssl_rsa_setup. If you have a pre-5.7 version then you can use mysslgen instead.

To convert the CA certificate from the PEM format to the Java Key Store (JKS) format I used keytool.

$ keytool -importcert -trustcacerts -file ca.pem -keystore /tmp/mysql57keystore.jks
Enter keystore password:  
Re-enter new password: 
Owner: CN=MySQL_Server_5.7.10_Auto_Generated_CA_Certificate
Issuer: CN=MySQL_Server_5.7.10_Auto_Generated_CA_Certificate
Serial number: 1
Valid from: Fri Jan 08 16:23:16 CET 2016 until: Mon Jan 05 16:23:16 CET 2026
Certificate fingerprints:
     MD5:  B5:B5:2B:53:5C:91:A2:6A:64:B5:C9:12:85:A0:CE:CC
     SHA1: 85:F1:AB:14:15:33:65:A8:71:4D:00:A6:C6:FC:8F:7F:BE:95:BA:B0
     SHA256: CB:B9:D5:BC:26:76:37:3A:66:67:99:95:5B:3B:8E:95:84:6C:A4:5F:52:39:EF:2A:23:36:6E:AB:B0:3E:81:E0
     Signature algorithm name: SHA256withRSA
     Version: 1
Trust this certificate? [no]:  yes
Certificate was added to keystore

Then I had to set my CLASSPATH in order for Jython to find Connector/J.

$ export CLASSPATH=/usr/share/java/mysql-connector-java.jar

Running the test

I used this to test the database connetion:

#!/usr/bin/jython
from __future__ import with_statement
from com.ziclix.python.sql import zxJDBC
from java.lang import System

System.setProperty("javax.net.ssl.trustStore","/tmp/mysql57keystore.jks");
System.setProperty("javax.net.ssl.trustStorePassword","msandbox");

jdbc_url = 'jdbc:mysql://127.0.0.1:18785/test?useSSL=true'
with zxJDBC.connect(jdbc_url, 'msandbox', 'msandbox', 'com.mysql.jdbc.Driver') as c:
    with c.cursor() as cur:
        cur.execute('SHOW SESSION STATUS LIKE \'Ssl_%\'')
        for result in cur:
            print('%-40s: %s' % result)
    raw_input('Press any key to continue...')

This resulted in a working connection. From the database side it looks like this:

mysql> SELECT ATTR_NAME, ATTR_VALUE FROM
    -> performance_schema.session_connect_attrs WHERE PROCESSLIST_ID=40;
+------------------+----------------------+
| ATTR_NAME        | ATTR_VALUE           |
+------------------+----------------------+
| _runtime_version | 1.8.0_65             |
| _client_version  | 5.1.36-SNAPSHOT      |
| _client_name     | MySQL Connector Java |
| _client_license  | GPL                  |
| _runtime_vendor  | Oracle Corporation   |
+------------------+----------------------+
5 rows in set (0.00 sec)

mysql> SELECT * FROM performance_schema.status_by_thread WHERE 
    -> THREAD_ID=(SELECT THREAD_ID FROM performance_schema.threads
    -> WHERE PROCESSLIST_ID=40) and VARIABLE_NAME LIKE 'Ssl_version';
+-----------+---------------+----------------+
| THREAD_ID | VARIABLE_NAME | VARIABLE_VALUE |
+-----------+---------------+----------------+
|        65 | Ssl_version   | TLSv1          |
+-----------+---------------+----------------+
1 row in set (0.00 sec)

I wanted to see if upgrading Connector/J would change anythong. So I downloaded the latest release and change my CLASSPATH to only include that.

mysql> SELECT ATTR_NAME, ATTR_VALUE FROM
    -> performance_schema.session_connect_attrs WHERE PROCESSLIST_ID=45;
+------------------+----------------------+
| ATTR_NAME        | ATTR_VALUE           |
+------------------+----------------------+
| _runtime_version | 1.8.0_65             |
| _client_version  | 5.1.38               |
| _client_name     | MySQL Connector Java |
| _client_license  | GPL                  |
| _runtime_vendor  | Oracle Corporation   |
+------------------+----------------------+
5 rows in set (0.00 sec)

mysql> SELECT * FROM performance_schema.status_by_thread WHERE
    -> THREAD_ID=(SELECT THREAD_ID FROM performance_schema.threads
    -> WHERE PROCESSLIST_ID=45) and VARIABLE_NAME LIKE 'Ssl_version';
+-----------+---------------+----------------+
| THREAD_ID | VARIABLE_NAME | VARIABLE_VALUE |
+-----------+---------------+----------------+
|        70 | Ssl_version   | TLSv1.1        |
+-----------+---------------+----------------+
1 row in set (0.00 sec)

And it did. Connector/J 5.1.38 uses TLSv1.1 instead of TLSv1.0

Sunday, January 3, 2016

The performance of TLS with MySQL Connector/Python

I've ran a simple test to see the performance impact of TLS on MySQL connections with MySQL Connector/Python

The test results are in this Jupyter notebook.

TL;DR:

Try to reuse connections if you use TLS
Establishing TLS connections is expensive (server & client)
Improved performance might be possible in the future by using TLS Tickets

Not tested:

Difference between YaSSL and OpenSSL
Difference between Ciphersuites
Performance of larger resultsets and queries