Saturday, October 25, 2014

When your query is blocked, but there is no blocking query

When I queried information_schema.innodb_trx (introduced in 5.1 with the InnoDB Plugin) I noticed there were a few transactions in LOCK WAIT state.

Example:
mysql [information_schema] > select trx_id,trx_state 
    -> from information_schema.innodb_trx;
+--------+-----------+
| trx_id | trx_state |
+--------+-----------+
| 7974   | LOCK WAIT |
| 7962   | RUNNING   |
+--------+-----------+
2 rows in set (0.00 sec)

Then I made a query to join a few information_schema and performance_schema tables to find out which query was blocking my transactions. It turned out that the blocking transaction had a trx_query=NONE. So my query was block by a transaction doing nothing. That's not really helpful.

Let's try to recreate the situation and see exactly what happened. I'll use two sessions for the transactions and a third to monitor the server.

The first session:
mysql [test] > create table t1 (id int);
Query OK, 0 rows affected (0.01 sec)

mysql [test] > insert into t1 values(1),(2),(3);
Query OK, 3 rows affected (0.00 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql [test] > create temporary table t2 (id int);
Query OK, 0 rows affected (0.01 sec)

mysql [test] > start transaction;
Query OK, 0 rows affected (0.00 sec)

mysql [test] > insert into t2 select * from t1;
Query OK, 3 rows affected (0.00 sec)
Records: 3  Duplicates: 0  Warnings: 0

The second session:
mysql [test] > insert into t1 values(4);
Now the second session is waiting.

Let's start a third session to see what's happening. If the second session hits a wait timeout, you can just restart it.

mysql [information_schema] > select * from innodb_trx\G
*************************** 1. row ***************************
                    trx_id: 8022
                 trx_state: LOCK WAIT
               trx_started: 2014-10-25 11:38:25
     trx_requested_lock_id: 8022:18:3:1
          trx_wait_started: 2014-10-25 11:38:25
                trx_weight: 2
       trx_mysql_thread_id: 19
                 trx_query: insert into t1 values(4)
       trx_operation_state: inserting
         trx_tables_in_use: 1
         trx_tables_locked: 1
          trx_lock_structs: 2
     trx_lock_memory_bytes: 360
           trx_rows_locked: 1
         trx_rows_modified: 0
   trx_concurrency_tickets: 0
       trx_isolation_level: REPEATABLE READ
         trx_unique_checks: 1
    trx_foreign_key_checks: 1
trx_last_foreign_key_error: NULL
 trx_adaptive_hash_latched: 0
 trx_adaptive_hash_timeout: 10000
          trx_is_read_only: 0
trx_autocommit_non_locking: 0
*************************** 2. row ***************************
                    trx_id: 8020
                 trx_state: RUNNING
               trx_started: 2014-10-25 11:35:07
     trx_requested_lock_id: NULL
          trx_wait_started: NULL
                trx_weight: 6
       trx_mysql_thread_id: 1
                 trx_query: NULL
       trx_operation_state: NULL
         trx_tables_in_use: 0
         trx_tables_locked: 0
          trx_lock_structs: 3
     trx_lock_memory_bytes: 360
           trx_rows_locked: 4
         trx_rows_modified: 3
   trx_concurrency_tickets: 0
       trx_isolation_level: REPEATABLE READ
         trx_unique_checks: 1
    trx_foreign_key_checks: 1
trx_last_foreign_key_error: NULL
 trx_adaptive_hash_latched: 0
 trx_adaptive_hash_timeout: 10000
          trx_is_read_only: 0
trx_autocommit_non_locking: 0
2 rows in set (0.00 sec)

mysql [information_schema] > select * from innodb_lock_waits\G
*************************** 1. row ***************************
requesting_trx_id: 8022
requested_lock_id: 8022:18:3:1
  blocking_trx_id: 8020
 blocking_lock_id: 8020:18:3:1
1 row in set (0.00 sec)

mysql [information_schema] > select * from innodb_locks\G
*************************** 1. row ***************************
    lock_id: 8022:18:3:1
lock_trx_id: 8022
  lock_mode: X
  lock_type: RECORD
 lock_table: `test`.`t1`
 lock_index: GEN_CLUST_INDEX
 lock_space: 18
  lock_page: 3
   lock_rec: 1
  lock_data: supremum pseudo-record
*************************** 2. row ***************************
    lock_id: 8020:18:3:1
lock_trx_id: 8020
  lock_mode: S
  lock_type: RECORD
 lock_table: `test`.`t1`
 lock_index: GEN_CLUST_INDEX
 lock_space: 18
  lock_page: 3
   lock_rec: 1
  lock_data: supremum pseudo-record
2 rows in set (0.00 sec)

mysql [information_schema] > use sys
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql [sys] > select * from innodb_lock_waits\G
*************************** 1. row ***************************
     waiting_trx_id: 8022
     waiting_thread: 19
      waiting_query: insert into t1 values(4)
    waiting_lock_id: 8022:18:3:1
  waiting_lock_mode: X
  waiting_lock_type: RECORD
 waiting_lock_table: `test`.`t1`
 waiting_lock_index: GEN_CLUST_INDEX
    blocking_trx_id: 8020
    blocking_thread: 1
     blocking_query: NULL
   blocking_lock_id: 8020:18:3:1
 blocking_lock_mode: S
 blocking_lock_type: RECORD
blocking_lock_table: `test`.`t1`
blocking_lock_index: GEN_CLUST_INDEX
1 row in set (0.00 sec)

The last query is from the SYS schema. You'll need version 1.3.0 of SYS for this to work.

So we now know which transaction is blocking which transaction, but we don't yet know what that transaction is doing as blocking_query is NULL.

But we have blocking_thread=1. With performance_schema we can find the statements executed in that thread:
mysql [performance_schema] > SELECT SQL_TEXT FROM events_statements_history_long 
    -> WHERE thread_id=(SELECT thread_id FROM threads WHERE processlist_id=1) 
    -> order by TIMER_START;
+---------------------------------+
| SQL_TEXT                        |
+---------------------------------+
| rollback                        |
| start transaction               |
| insert into t2 select * from t1 |
+---------------------------------+
3 rows in set (0.00 sec)

You should be aware that there are 3 ID's with which we are dealing:
  • The InnoDB Transaction ID (trx_id)
  • The MySQL Processlist ID (information_schema.processlist.ID, used in blocking_thread)
  • The MySQL Thread ID (performance_schema.threads.thread_id)
This can help you to identify what the application in the idle thread was doing and what might happen to it if you would just kill the query.

It might also be possible to find OS pid if the application uses a connector which supports connection attributes. The host on which this pid lives can be found in the processlist.
mysql [performance_schema] > select * from session_connect_attrs where PROCESSLIST_ID=2;
+----------------+-----------------+----------------+------------------+
| PROCESSLIST_ID | ATTR_NAME       | ATTR_VALUE     | ORDINAL_POSITION |
+----------------+-----------------+----------------+------------------+
|              2 | _os             | linux-glibc2.5 |                0 |
|              2 | _client_name    | libmysql       |                1 |
|              2 | _pid            | 25415          |                2 |
|              2 | _client_version | 5.6.21         |                3 |
|              2 | _platform       | x86_64         |                4 |
|              2 | program_name    | mysql          |                5 |
+----------------+-----------------+----------------+------------------+
6 rows in set (0.00 sec)

For this procedure to succeed you'll need this:
  1. MySQL 5.6 or newer.
  2. performance_schema must be enabled
  3. In setup_consumers events_statements_history_long must be enabled.
  4. The performance_schema_events_statements_history_long_size variable must be set large enough. Changing this requires a restart of the server. With this variable you specify how many statements there are kept in the events_statements_history_long table. You probably need to raise this on a busy server to get the statements of the blocking transaction as the transaction might be waiting for quite some time.

2 comments:

  1. For viewing the statements a transaction has done previously, I generally recommend enabling and using the events_statements_history consumer/table, which holds the last 10 statements per active thread, rather than events_statements_history_long, which is the last 10000 globally (which could age out quickly). This will you will *always* get the last 10 statements per thread.

    ReplyDelete