Wednesday, May 15, 2013

Calculating the InnoDB free space

Recently someone asked my if it's possible to find the total free space within InnoDB. I thought this would be very easy as the INFORMATION_SCHEMA.TABLES table has a DATA_FREE column. So we could just use SELECT SUM(DATA_FREE) FROM INFORMATION_SCHEMA.TABLES couldn't we?
 
So what does the DATA_FREE column tell us? It tells us the free data within InnoDB for that particular table. A table can share a tablespace with multiple other tables.
 
The tablespace which is used by a table depends on whether the innodb_file_per_table was enabled during table creation and/or at the last time the table was rebuild (e.g. by OPTIMIZE TABLE).
 
If innodb_file_per_table was always disabled then this query probably reports the correct free space:
SELECT DATA_FREE FROM INFORMATION_SCHEMA.TABLES WHERE ENGINE='InnoDB' LIMIT 1;

This is because all tables will share 1 tablespace.
 
If innodb_file_per_table was always enabled (new default for 5.6!) then this would report the free space:
SELECT SUM(DATA_FREE) FROM INFORMATION_SCHEMA.TABLES WHERE ENGINE='InnoDB';

This is because eache table will have it's own tablespace.
 
But how could we combine these two?
 
To give this a try I create a MySQL 5.6.10 sandbox with 4 tables of which only 2 have their own tablespace.
 
My first try is to use my udf_fileexists UDF:
 
mysql> SELECT TABLE_NAME,DATA_FREE,
    -> udf_fileexists(CONCAT(TABLE_SCHEMA,'/',TABLE_NAME,'.ibd')) AS ibd_file 
    -> FROM information_schema.tables WHERE ENGINE='InnoDB';
+----------------------+-----------+----------+
| TABLE_NAME           | DATA_FREE | ibd_file |
+----------------------+-----------+----------+
| t1                   |   4194304 |        1 |
| t2                   |   4194304 |        1 |
| t3                   |  66060288 |        0 |
| t4                   |  66060288 |        0 |
| innodb_index_stats   |         0 |        1 |
| innodb_table_stats   |         0 |        1 |
| slave_master_info    |         0 |        1 |
| slave_relay_log_info |         0 |        1 |
| slave_worker_info    |         0 |        1 |
+----------------------+-----------+----------+
9 rows in set (0.02 sec)

mysql> SELECT (SELECT DATA_FREE FROM INFORMATION_SCHEMA.TABLES WHERE 
    -> NOT udf_fileexists(CONCAT(TABLE_SCHEMA,'/',TABLE_NAME,'.ibd')) 
    -> AND ENGINE='InnoDB' LIMIT 1) + (SELECT SUM(DATA_FREE) 
    -> FROM INFORMATION_SCHEMA.TABLES WHERE 
    -> udf_fileexists(CONCAT(TABLE_SCHEMA,'/',TABLE_NAME,'.ibd')) 
    -> AND ENGINE='InnoDB') AS TOTAL_DATA_FREE;
+-----------------+
| TOTAL_DATA_FREE |
+-----------------+
|        74448896 |
+-----------------+
1 row in set (0.03 sec)

 
So that works, but it requires loading a UDF. Luckily it's also possible to only use INFORMATION_SCHEMA by using the INNODB_SYS_TABLES table. This works only for 5.6 and newer.
 
mysql> SELECT T.TABLE_SCHEMA,T.TABLE_NAME,T.DATA_FREE,ST.SPACE,SD.PATH 
    -> FROM INFORMATION_SCHEMA.TABLES T LEFT JOIN INFORMATION_SCHEMA.INNODB_SYS_TABLES ST 
    -> ON ST.NAME=CONCAT(T.TABLE_SCHEMA,'/',T.TABLE_NAME) 
    -> LEFT JOIN INFORMATION_SCHEMA.INNODB_SYS_DATAFILES SD ON SD.SPACE=ST.SPACE WHERE T.ENGINE='InnoDB';
+---------------------+----------------------+-----------+-------+----------------------------------+
| TABLE_SCHEMA        | TABLE_NAME           | DATA_FREE | SPACE | PATH                             |
+---------------------+----------------------+-----------+-------+----------------------------------+
| mysql               | innodb_table_stats   |         0 |     1 | ./mysql/innodb_table_stats.ibd   |
| mysql               | innodb_index_stats   |         0 |     2 | ./mysql/innodb_index_stats.ibd   |
| mysql               | slave_relay_log_info |         0 |     3 | ./mysql/slave_relay_log_info.ibd |
| mysql               | slave_master_info    |         0 |     4 | ./mysql/slave_master_info.ibd    |
| mysql               | slave_worker_info    |         0 |     5 | ./mysql/slave_worker_info.ibd    |
| innodbfreespacetest | t1                   |   4194304 |     6 | ./innodbfreespacetest/t1.ibd     |
| innodbfreespacetest | t2                   |   4194304 |     7 | ./innodbfreespacetest/t2.ibd     |
| innodbfreespacetest | t3                   |  66060288 |     0 | NULL                             |
| innodbfreespacetest | t4                   |  66060288 |     0 | NULL                             |
+---------------------+----------------------+-----------+-------+----------------------------------+
9 rows in set (0.02 sec)

mysql> SELECT (SELECT SUM(DATA_FREE) FROM INFORMATION_SCHEMA.TABLES T 
    -> LEFT JOIN INFORMATION_SCHEMA.INNODB_SYS_TABLES ST ON 
    -> ST.NAME=CONCAT(T.TABLE_SCHEMA,'/',T.TABLE_NAME) WHERE SPACE<>0) + 
    -> (SELECT DATA_FREE FROM INFORMATION_SCHEMA.TABLES T 
    -> LEFT JOIN INFORMATION_SCHEMA.INNODB_SYS_TABLES ST 
    -> ON ST.NAME=CONCAT(T.TABLE_SCHEMA,'/',T.TABLE_NAME) 
    -> WHERE SPACE=0 LIMIT 1) AS INNODB_FREE_GLOBAL;
+--------------------+
| INNODB_FREE_GLOBAL |
+--------------------+
|           74448896 |
+--------------------+
1 row in set (0.04 sec)
Calculating the free space can be useful if you don't have autoextend enabled on your datafiles (requires innodb_file_per_table to be disabled). It can also be useful to calculate how much space could be freed by running OPTIMIZE TABLE (requires innodb_file_per_table to be enabled).
 
Conclusion:
If you're on 5.6 you can use INFORMATION_SCHEMA and for 5.5 and earlier you have to use a UDF or have (a script) to peek in the MySQL datadir.

13 comments:

  1. Yes, it is WOW indeed.

    Hey Daniel, what's up with the line 7 in the table?

    Cheers,
    Dmitry
    http://www.scnsoft.com/

    ReplyDelete
  2. This comment has been removed by the author.

    ReplyDelete
  3. This comment has been removed by a blog administrator.

    ReplyDelete
  4. This comment has been removed by the author.

    ReplyDelete
  5. Great Article Daniel. Hope to read more soon !!

    cheers!

    https://www.startechup.com

    ReplyDelete
  6. Good way to calculate free space .... Nice Article ....Good way to explain the whole process..

    Tarika
    http://www.tarikatechnologies.com

    ReplyDelete
  7. Hello Daniela,

    Thanks for your informative clarification. At our Adoriasoft courses we are required to do the similar thing. Thanks to your post, now I know all the pecularities.

    ReplyDelete
  8. Good for us, cose we get a lot of useful things

    ReplyDelete
  9. Good blog on free space.Thanks for share.
    vertexplus

    ReplyDelete
  10. This comment has been removed by the author.

    ReplyDelete
  11. Nice write up about data. Using big data in the freemium stage allows game developers to accurately measure, predict, and track player behavior to optimize the experience, increasing the likelihood users converting to a paid model. Innovecs Games development share their experience

    ReplyDelete