Monday, November 11, 2024

Release and version management is hard

In this post I will talk about release management with MySQL as example. Release management has to do with what to put in a release, how often to do a release, how to long to support a release and how to number or name the releases.

The early years

Today software is almost exclusively delivered over the internet, but this didn’t use to be the case. Software used to be delivered in a box that you had to buy at a physical store. These were often shrink wrapped and forced you to agree with the EULA by opening the box. The box then contained floppies, CD’s, DVD’s or any other suitable medium.

There were multiple reasons why physical media were used, even when the internet arrived. One of the main limitations was bandwidth, this is also why FreeBSD, Linux, etc were sold on physical media. And one of the other reasons was that online payment wasn’t as ubiquitous as it is today. Then there were also ways of copy protection that relied on physical media and/or dongles. For operating systems and such there was also a chicken-and-the-egg situation that you needed an OS to be able to download anything. And a nice box with a printed manual probably felt more valuable than a downloaded file.

This all put some constraints on release management. The time between releases was quite large as it would need new physical media to be produced, with a newly designed box, manual, etc. And as many software versions were sold per release it needed to have enough new features to convince people to pay for an upgrade. There were sometimes discounts for upgrades, but then the software needed some way to only work as upgrade or have a way to validate that the user was in possession of the previous version. Bugfixes and patches might be delivered online or not at all.

As people were either using dial-up and only connected to the internet for a short amount of time or not at all, this made security a lot less important than it is today. It also meant that people used manuals a lot more as web search might not have been available.

Release management today

Today with new releases being made available over the internet this has changed the constraints it puts on release management a lot.

We now have package managers that might download and upgrade with minimal or no user intervention. And we now have Puppet, Kubernetes, etc to manage software on a set of machines. As package managers take care of dependencies this makes it easier for software packages to have more and more complex dependencies. And with most systems being connected to the internet all the time, security and security updates become more important.

How this applies to MySQL

MySQL has had a interesting history when it comes to releases and versioning.

mysql-history-graph 
Source:  https://github.com/dveeden/mysql-history-graph

Early versions

MySQL 1.0 was never released to a wide audience. Then version 2 was skipped, probably because calling something version 3 was better for marketing. The first widely used version was probably MySQL 3.23.x. Note that MySQL isn’t following semantic versioning where the version has a major, a minor and a patch version. For MySQL the first two numbers (e.g. 5.6) make the major version. After 3.23 followed 4.0 and 4.1

The 5.x versions.

Version 5.0 of MySQL was filled with a large number of both features and bugs. I think this was driven by the partnership with SAP. This is a good example of what you get when you prioritize features over stability.

Version 5.1 brought partitioning.

And then MySQL 5,2, 5.3 and 5.4 never made it to production. Only 5.4 was released as development preview release.

With MySQL 5.5 the build system switched to a unified cmake based build system. Besides a big change for developers this also impacted Linux distros as they had to change how they build MySQL packages. This is the first version that had InnoDB as the default storage engine. Eerlier versions also supported InnoDB, but not as default, probably in part because at that time the InnoDB storage engine was in the hands of InnoBase and/or Oracle.

The 5.6 version brought GTID, which was a major step forward for replication.

Then 5.7 brought native JSON support and generated columns, among many other features.

What happend to 6.x and 7.x

MySQL 6.0 work started shortly after 5.1 was released, but never made it into a production release. It had many interesting features, most of which have been backported to later 5.x versions of MySQL. For what I understand this was due to the 6.0 codebase being unreliable due to a mix of badly tested and/or incomplete features.

Then the 7.x versions had been used for MySQL Cluster, which is basically a modified MySQL version that included the NDB storage engine. In 8.0 the changes for the server were merged with the regular MySQL code allowing a unified version.

The 8.x timeframe

MySQL 8.0 brought many new features but also had a big change in the foundation of MySQL. MySQL 8.0 used InnoDB to store the data dictionary instead of FRM files. This allows for better crash safety and atomic DDL and opens up future possibilities like transactional DDL.

As MySQL made its way to Oracle via Sun it now has to adhere to the release policies of Oracle. This means that each release has to be supported for a long time. So sticking with 8.0, and adding new features in updates of 8.0 was what they did. This worked well, but it had one big drawback: MySQL could deprecate features, API’s etc, but wasn’t able to remove them until a new major version.

Then MySQL 8.1, 8.2 and 8.3 were released as Innovation releases which ten lead to a 8.4 long term support release. In these releases MySQL cleaned up many of the previously deprecated features. I think cleaning up these things is good as it makes it easier to develop new features that don’t have to take these deprecated things into account and it makes for cleaner code. But it also means that third party software might have to test and update their code to avoid these now removed features. And a new major version also means a new training and certification program and all the other things related to a new major version.

The 9.x timeframe

MySQL has released 9.0 and 9.1 as innovation releases. However MySQL is more than just the server and for some components like MySQL Connector/J (The database driver for Java) the 9.1 release isn’t an innovation release but a regular GA release. And the MySQL Workbench GUI tool has been stuck at 8.0.x. So where 8.0 was a coordinated release for all components with more or less the same version, this is no longer true.

Learnings

Don’t sacrifice stability for features

I think MySQL 5.0 and MySQL 6.0 have been examples of this. Adding features first and then later making the codebase stable again doesn’t seem to be a good strategy. I think taking features from 6.0 and then adding them in 5.x releases was a smart decision.

Don’t stay on the same major version for too long

It seems that many people only upgraded to 8.0 recently and found it quite challenging to do. I think this is mostly because the large set of changes between two major versions. More often releases with slightly less big and/or breaking changes makes it easier to upgrade.

Cleanup is needed

I think it is critical for a software project to be able to deprecate and cleanup features and APIs. I do like how Go keeps compatibility between 1.x version but I don’t think this would work for MySQL.

Innovation and LTS releases

I think the innovation releases that are released often, but don’t get the long term support are helpful and make it easier for people to test and try features before they land in a long term support release.

Then a small number of LTS releases make it easier for developers to find a stable base for their application. As many bugfixes have to be backported this reduces the load on the MySQL developers.

Continues release

Some software projects now use a continues release model where there are no real versions, but just an ever moving set of updates. This is used for Debian unstable/sid. CentOS is using something that is somewhat similar with CentOS Stream, which still has versions, but not the minor versions that CentOS Linux used to have.

I don’t think this would make sense for MySQL, however a nightly build like what TiDB offers whould be nice, but this doesn’t seem to align with the procedures from Oracle.

Disclaimers

I don’t work for Oracle/MySQL so this is purely an outsiders view on things.

My main point is that managing release version numbers and contents is difficult. I used MySQL as an example as it has a rich history, but I could have used something else instead.

I work for PingCAP, but the content of this blog is only mine and not that of my employer.