Monday, August 14, 2023

Tutorial: Add a QRCODE() function to TiDB

The code for this tutorial is available here.

Objectives

This tutorial demonstrates how easy it is to add a new function to TiDB that can be used in a SQL-statement. We want to add a function for creating QRCodes.

TiDB is aiming to be compatible with MySQL. However as the functionality we’re adding doesn’t exist in MySQL this isn’t a real concern.

For reference here is the architecture of a TiDB cluster:


We’re only going to add the function to TiDB (in red in the above image), which means this can’t be pushed down to TiKV or TiFlash, however for this function that’s fine.

The TiDB Development Guide has a page that describes some of this and more.

When adding functions to TiDB you should aim for things that can be merged into upstream TiDB instead of running and maintaining your own fork. Might be good to discuss your plans in a GitHub issue before actually starting to do any work.

Step 1: Adding the function to the parser

For this open parser/ast/functions.go and add this:

// List scalar function names.
const (
...
    Qrcode          = "qrcode"
...
)

Here you need to use the lowercase function name.

Step 2: Register the function class

// funcs holds all registered builtin functions. When new function is added,
// check expression/function_traits.go to see if it should be appended to
// any set there.
var funcs = map[string]functionClass{
...
    ast.Qrcode:          &qrcodeFunctionClass{baseFunctionClass{ast.Qrcode, 1, 1}},
...
}

Here we use the Qrcode that we defined in parser/ast/functions.go. The qrcodeFunctionClass is something that we’ll create in the next step.

Step 3: Create the function class

We could put this in any of the existing expression/builtin_*.go files, but for this demonstration we create a new file: expression/builtin_qrcode.go.

In this file we’ll define the function class qrcodeFunctionClass and the builtinQrcodeSig function implementation.

The function class looks like this:

type qrcodeFunctionClass struct {
        baseFunctionClass
}

func (c *qrcodeFunctionClass) getFunction(ctx sessionctx.Context, args []Expression) (builtinFunc, error) {
        if err := c.verifyArgs(args); err != nil {
                return nil, err
        }

        bf, err := newBaseBuiltinFuncWithTp(ctx, c.funcName, args, types.ETString, types.ETString)
        if err != nil {
                return nil, err
        }

        types.SetBinChsClnFlag(bf.tp)
        sig := &builtinQrcodeSig{bf}
        return sig, nil
}

The getFunction() method verifies the arguments and then helps to find the right function implementation (only one option here) and takes care of things like setting the character set and collation flag to binary via types.SetBinChsClnFlag(). If our function would have multiple argument types (integer, string, etc) or would have multiple optional arguments this function would have been more complex.

Note that the arguments for newBaseBuiltinFuncWithTp(): - Sets the context to ctx. - Sets the function name to c.funcName. - Sets the arguments to args. - Sets the return type to types.ETString. - Sets the argument types to a single argument of types.ETString.

And the function implementation looks like this:

type builtinQrcodeSig struct {
        baseBuiltinFunc
}

func (b *builtinQrcodeSig) Clone() builtinFunc {
        newSig := &builtinQrcodeSig{}
        newSig.cloneFrom(&b.baseBuiltinFunc)
        return newSig
}

func (b *builtinQrcodeSig) evalString(row chunk.Row) (string, bool, error) {
        d, isNull, err := b.args[0].EvalString(b.ctx, row)
        if isNull || err != nil {
                return d, isNull, err
        }
        qrc, err := qrcode.New(d)
        if err != nil {
                fmt.Printf("could not generate QRCode: %v", err)
                return "", isNull, err
        }

        var bb bytes.Buffer
        qrbw := &qrBufWriter{&bb}
        qrWriter := standard.NewWithWriter(qrbw)

        if err = qrc.Save(qrWriter); err != nil {
                fmt.Printf("could not save image: %v", err)
                return "", isNull, err
        }

        return qrbw.String(), isNull, nil
}

We also add another type that is used in the function implementation:

type qrBufWriter struct {
        *bytes.Buffer
}

func (b *qrBufWriter) Close() error {
        return nil
}

This is because we need a writer that implements the io.WriteCloser interface.

The evalString() method is where we actually create the QRCode. We get the first argument and store this in d. Then we use create a QRCode based on d which we then write to a buffer and eventually return via qrbw.String().

The full code is available in the repository that is mentioned at the start of this article.

Note that many of the other functions serve as examples of how to implement these.

Step 4: Adding dependencies and buildign

We need to add the library for creating QRCodes to our repository.

go get github.com/yeqown/go-qrcode/v2
go get github.com/yeqown/go-qrcode/writer/standard

This modifies the go.mod and go.sum files.

Now we can run make server to create the bin/tidb-server binary.

Just running ./bin/tidb-server after building the binary should give you a TiDB server that listens on port 4000 and uses the unistore storage. You can also use tiup playground and specify the path to the binary that you just build.

Step 5: The result

If you use a tool like DBeaver or MySQL Workbench and call the new function you get a working QRCode image.

SELECT QRCODE('https://github.com/dveeden/tidb/tree/qrcode')


 


 

Conclusion

Adding a new function isn’t too difficult. While a QRCODE() function in SQL is nice this might not be very useful as it is probably better to move this functionality to the application layer.

This function is lacking unittests, which would be required before actually merging into the master branch of pingcap/tidb on GitHub.

Saturday, November 17, 2018

Tuesday, February 6, 2018

How caching_sha2_password leaks passwords

Oracle recently announced a new authentication plugin: caching_sha2_password. This was added in 8.0.4, the second release candidate for MySQL 8.0. The new plugin is also made the default (can be configured by changing default_authentication_plugin.

Why?

Phasing out SHA1

As Oracle said in the blog post to annouce this change they want to move to a more secure hashing algorithm (SHA256). Which I think is a good reason to do this.

Adding salt

Adding a salt makes hashes for identical passwords, but different users different. Again a good reason to do this.

Performance

Their earlier attempt at this resulted in sha256_password. But this resulted in slower authentication. Without using persistent connections this is a serious limitation. So again a good reason.

What's wrong?

If you don't use SSL/TLS it gives your password away.

To protect against sending the password in cleartext over an insecure connection it encrypts the password before sending it. It does this by using public key cryptography. It encrypts the password with the public key of the server. Then the server can decrypt it with its private key. So far so good.

But the problem is how MySQL gets the public from the server. There is --get-server-public-key which requests the key from the server. But it does so over an insecure connection, so this isn't safe.

An attacker could do a Man-in-the-Middle attack and give you their public key... and then the attacker can decrypt your password and proxy the connection.

The second option is to use --server-public-key-path=file_name. But then you somehow need to collect all public keys from all your servers and securely distribute them to your clients. And you might want to renew these keys every year... this seems like an operational nightmare to me.

Also depending on what connector you use these options may not be available.

If you use SSL/TLS things are not much better.

With default settings mysqld generates self-signed X509 certificates and enables SSL/TLS. And the default ssl-mode is PREFERRED. This is better than the previous defaults as it guards against passive attacks. However this is NOT protecting against active attacks (MitM attacks) as MySQL won't verify if the certificate is signed by a known CA. It by default also doesn't verify if the hostname matches the certificate.

So if someone hijacks your connection and knowns how to do a SSL handshake: then the caching_sha2_password plugin will handover the password in clear text.

Can we use it in a secure way?

Use SSL/TLS and set ssl-mode to VERIFY_IDENTITY (or at least VERIFY_CA). Note that this requires you to configure MySQL with certificates which are signed by your CA and matches the hostnames of your servers.

In case you only need localhost connections: configure MySQL to only listen on local-loopback and you're done.

Staying with mysql_native_password seems also to be an acceptable option for now. Note that sha256_password has many of the same issues and should also be avoided without strict SSL/TLS settings.

I initially reported this to Oracle in Bug #79944 on 13 January 2016 for the sha256_password plugin in 5.7.10.

Sunday, May 7, 2017

MySQL and SSL/TLS Performance

In conversations about SSL/TLS people often say that they either don't need TLS because they trust their network or they say it is too slow to be used in production.

With TLS the client and server has to do additional work, so some overhead is expected. But the price of this overhead also gives you something in return: more secure communication and more authentication options (client certificates).

SSL and TLS have existed for quite a long time. First they were only used for online banking and during authentication on web sites. But slowly many websites went to full-on SSL/TLS. And with the introduction of Let's encrypt many small websites are now using SSL/TLS. And many non-HTTP protocols either add encryption or move to a HTTP based protocol.

So TLS performance is very important for day-to-day usage. Many people and companies have put a lot of effort into improving TLS performance. This includes browser vendors, hardware vendors and much more.

But instead of just hoping for good performance: Let's try to measure it with a simple benchmark.

There are multiple pieces of a database connection we have to benchmark:
  1. New connections
  2. Reconnecting
  3. Bulk transfer
 And for all of these there are multiple things we can measure:
  1. Connect and/or transfer time (performance)
  2. CPU usage (efficiency)
  3. Concurrency 
The benchmark code can be found here: https://github.com/dveeden/mysql_go_tls

Let's look at connection performance. In this test I connect a number of times to MySQL  and do a "DO 1". This is on a localhost TCP connection, so it should be fast.


This is the connection time in ms for a single connection.
With 5.6.33 Community Edition, which is YaSSL based we see a very noticable overhead. And with 5.7.17 Community Edition this overhead is much smaller, but still very noticable.

Then MySQL 5.7 with OpenSSL (compiled on Fedora 25) shows another very noticable improvement over YaSSL. This can be explained because in this case the AVX2 and AES-NI CPU features can be used.

Also OpenSSL supports TLS tickets and YaSSL doesn't. This is why the yellow bar is much shorter that the orange bar. This is not yet supported in libmysqlclient, see Bug #76921 for details.

So SSL/TLS can be slow, but doesn't have to be slow.

Note that TLS needs multiple roundtrips. When testing this with netem on Linux I see this with MySQL 5.7.18 (YaSSL) and a 5ms delay:
No TLS goes from 0.5ms to 52ms
TLS goes from 8ms to 85ms

The second thing to measure is bulk performance. This is for large result sets including mysqldump.

With mysqldump from MySQL 5.7 it is easy to do:

$ time mysqldump --ssl-mode=disabled -A > /dev/null

real 0m0.145s
user 0m0.021s
sys 0m0.005s
$ time mysqldump --ssl-mode=required --ssl-cipher=AES128-SHA -A > /dev/null

real 0m0.120s
user 0m0.039s
sys 0m0.007s 
 
If you do this with multiple ciphers and put some data in the database you'll see something like this:
No TLS
4.5s
TLS Default
10.4s
RC4-MD5
7.1s
DES-CBC3-SHA
23.2s
 This is with MySQL 5.6.33 with YaSSL. Note that this is without using modern CPU features etc.

To conclude, there are some steps you can take to improve SSL/TLS performance:
  1. Upgrade to 5.7
  2. Compile MySQL with OpenSSL
  3. Use TLS tickets
  4. Use persistent connections
  5. Try different cipher suits for mysqldump and other places where you transfer larger amounts of data.

Wednesday, April 12, 2017

Network attacks on MySQL, Part 6: Loose ends

Backup traffic

After securing application-to-database and replication traffic, you should also do the same for backup traffic.

If you use Percona XtraBackup with streaming than you should use SSH to send your backup to a secure location. The same is true for MySQL Enterprise Backup. Also both have options to encrypt the backup itself. If you send your backup to a cloud service this is something you should really do, especially if it is not sent via SSH or HTTPS.

And mysqldump and mysqlbinlog both support SSL. And you could use GnuPG, OpenSSL, WinZIP or any other tool to encrypt it.

Sending credentials

You could try to force the client to send credentials elsewhere. This can be done if you can control the parameters to the mysql client. It reads the config from /etc/my.cnf, ~/.my.cnf and ~/.mylogin.conf but if you for example specify a login-path and a hostname.. it connects to that host, but with the password and username from the loginpath from the encrypted ~/.mylogin.cnf file.

You could use --enable-cleartext-plugin to make it even easier to get to the stored password. Note that if you have direct access to the ~/.mylogin.cnf file that there are options to decrypt it.

See Bug #74545: mysql allows to override login-path for details.

MySQL Cluster (NDB)

Make sure your machines use a private network (VLAN) which can only be accessed from cluster nodes. Your API nodes should be in this network and have a public interface where mysqld listens. Another option might be to use a firewall device or host based firewalls. Just make sure you are aware or the risks.

As usual thers is extensive documentation about this: MySQL Cluster Security and Networking Issues from the MySQL Reference Manual.

Network storage

And use proper security for iSCSI, NFS, FCP or any other kind of network storage you might be using. I've seen setups where iSCSI and/or NFS were publicly available and even with data-at-rest encryption this is not really safe, especially if read-write access is available.

Future

In both MySQL 5.6 and MySQL 5.7 Oracle improved the SSL/TLS support a lot. There are more improvements needed as a lot has changed in how SSL over the past 10 years. Assumptions made years ago are no longer true.

And also the creators of YaSSL have been busy: wolfSSL/mysql-patch on github

Wednesday, April 5, 2017

Network attacks on MySQL, Part 5: Attack on SHA256 based passwords

The mysql_sha256_password doesn't use the nonce system which is used for mysql_new_password, but instead forces the use of RSA or SSL.

This is how that works:

  1. The client connects
  2. The server changes authentication to sha256 password (or default?)
  3. The server sends the RSA public key.
  4. The client encrypts the password with the RSA public key and sends it to the server.
  5. The server decrypts the password with the private key and validates it.

The problem is that the client trusts public key of the server. It is possible to use --server-public-key-path=file_name. But then you need to take care of secure public key distribution yourself.

So if we put a proxy between the client and the server and then have the proxy sent its own public key... then we can decrypt it and reencode it with the real public key and send it to the server. Also the decrypted password is the password, not a hash. So we then know the real password.

And if SSL is used it doesn't do the RSA encryption... but this can be a connection with an invalid certificate. Just anything as long as the connection is SSL.

Wednesday, March 29, 2017

Network attacks on MySQL, Part 4: SSL hostnames

In my previous blogs I told you to enable SSL/TLS and configure it to check the CA. So I followed my advice and did all that. Great!

So the --ssl-mode setting was used a few times as a solution. And it has a setting we didn't use yet: VERIFY_IDENTITY. In older MySQL versions you can use --ssl-verify-server-cert. Both turn on hostname verification.

The attack

Get any certificate which is trusted by the configured CA, this can for example be a certificate from a development machine. And use that with a man-in-the-middle proxy.

Then the client:

  1. Checks if SSL is uses (--ssl-mode=REQUIRED)
  2. Verify if the certificate is signed by a trusted CA (--ssl-mode=VERIFY_CA)

Both checks succeed. But the certificate might be for testhost01.example.com and the database server might be prod-websitedb-123.example.com.

Browsers by default verify hostnames, MySQL does not.

Turning on hostname validation

So use --ssl-mode=VERIFY_IDENTITY and everything should be fine?

Well that might work for simple setups, but would probably fail for more complex setups.

This is because you might have a master-slave setup with loadbalancer in front of it. So your webapp connect to mydb-prod-lb.example.com which might be served by mydb1.example.com (master) or mydb2.example.com (slave). There might or might not be any automatic read/write splitting.

So then just configure the loadbalancer be the endpoint of the SSL connection? Well no, because most loadbalancers don't know how to speak the mysql protocol, which is needed to setup the SSL connection.

Ok, then just configure both servers with the certificate for mydb-prod-lb.example.com and everything should work. And it does!

But then you want to change the replication connection to also use SSL, but now the certificates and hostnames don't match anymore as they connect directly.

The same might be true for mysqldump or mysqlbinlog instances running on a separate backup server.

But there is a X.509 extension available which can be used: 'SubjectAlternativeName' a.k.a. SAN. (Not to be confused with Storage Area Networking). This allows you to have a certificate with multiple hostnames.

So for both hosts put their own hostname and the loadbalancer hostname in there.

But unfortunately that doesn't work yet. MySQL doesn't support this.

See Bug #68052: SSL Certificate Subject ALT Names with IPs not respected with --ssl-verify-serve for more details.

So yes, do enable hostname verification, but probably not everywhere yet.