Contents
- Background
- Problem description
- Root cause analysis
- Simulating the problem
- Summary
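Background
In day-to-day operation you will occasionally see a few connections, or a large number of them, piling up in MySQL. The usual response is to use the kill command to forcibly terminate these long-running connections, so that connection slots and CPU on the database server are freed as quickly as possible.
Problem description
In practice, a connection is sometimes not terminated right after kill is issued. It still appears in the processlist, but its Command column shows Killed instead of the usual Query or Execute. For example: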
mysql> show processlist;
+----+------+--------------------+--------+---------+------+--------------+---------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+----+------+--------------------+--------+---------+------+--------------+---------------------------------+
| 31 | root | 192.168.1.10:50410 | sbtest | Query | 0 | starting | show processlist |
| 32 | root | 192.168.1.10:50412 | sbtest | Query | 62 | User sleep | select sleep(3600) from sbtest1 |
| 35 | root | 192.168.1.10:51252 | sbtest | Killed | 47 | Sending data | select sleep(100) from sbtest1 |
| 36 | root | 192.168.1.10:51304 | sbtest | Query | 20 | Sending data | select sleep(3600) from sbtest1 |
+----+------+--------------------+--------+---------+------+--------------+---------------------------------+
Root cause analysis
When in doubt, check the official documentation first. Here is an excerpt:
When you use KILL, a thread-specific kill flag is set for the thread. In most cases, it might take some time for the thread to die because the kill flag is checked only at specific intervals:
- During SELECT operations, for ORDER BY and GROUP BY loops, the flag is checked after reading a block of rows. If the kill flag is set, the statement is aborted.
- ALTER TABLE operations that make a table copy check the kill flag periodically for each few copied rows read from the original table. If the kill flag was set, the statement is aborted and the temporary table is deleted.
- The KILL statement returns without waiting for confirmation, but the kill flag check aborts the operation within a reasonably small amount of time. Aborting the operation to perform any necessary cleanup also takes some time.
- During UPDATE or DELETE operations, the kill flag is checked after each block read and after each updated or deleted row. If the kill flag is set, the statement is aborted. If you are not using transactions, the changes are not rolled back.
- GET_LOCK() aborts and returns NULL.
- If the thread is in the table lock handler (state: Locked), the table lock is quickly aborted.
- If the thread is waiting for free disk space in a write call, the write is aborted with a “disk full” error message.
The first paragraph of the documentation states the mechanism clearly: kill sets a thread-level kill flag on the connection's thread, and the flag only takes effect at the next flag check. It follows that if the next flag check is slow to arrive, the behavior shown in the problem description can occur.
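The flag-check mechanism can be illustrated with a small Python model. This is only a sketch under stated assumptions: Session, kill, and run_statement are hypothetical names, and the block-based loop is a simplification of the behavior the documentation describes, not MySQL's actual code. It shows that kill returns at once, while the statement stops only at the first check after the block it was in.

```python
from dataclasses import dataclass


@dataclass
class Session:
    """Toy stand-in for a server thread (hypothetical, not a MySQL structure)."""
    command: str = "Query"
    killed: bool = False  # the thread-specific kill flag
    rows_done: int = 0


def kill(session: Session) -> None:
    """Model of KILL: set the flag and return without waiting."""
    session.killed = True
    session.command = "Killed"  # what the processlist would now display


def run_statement(session: Session, total_rows: int, block_size: int,
                  kill_at_row: int) -> int:
    """Process rows block by block; the flag is examined only between blocks.

    kill_at_row is the (0-based) row being processed when another session
    issues KILL. Returns how many rows were processed before stopping.
    """
    while session.rows_done < total_rows:
        block_end = min(session.rows_done + block_size, total_rows)
        for row in range(session.rows_done, block_end):
            if row == kill_at_row:
                kill(session)  # flag is set mid-block; nothing checks it yet
        session.rows_done = block_end
        if session.killed:  # the flag check after a block of rows
            break
    return session.rows_done


big_blocks = Session()
print(run_statement(big_blocks, total_rows=5000, block_size=1000, kill_at_row=250))
small_blocks = Session()
print(run_statement(small_blocks, total_rows=5000, block_size=100, kill_at_row=250))
```

With these inputs the first call stops after 1000 rows and the second after 300, although KILL arrived at row 250 in both cases. In both runs the session shows Killed from row 250 onward, which is the window in which the processlist output from the problem description would be observed.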
The documentation lists many scenarios. Based on its description, here are a few of the more common problem cases:
- When a select statement performs order by or group by and the server is short of CPU, reading a block of rows takes longer, which pushes back the next flag check.
- When DML runs against a large amount of data, killing the statement triggers a transaction rollback (InnoDB engine). The statement itself is killed, but the rollback can take a very long time.
- When an alter operation is killed while the server is under heavy load, processing each batch of rows takes longer, which pushes back the next flag check.
- To generalize from how kill works: anything that blocks or slows down the normal execution of a SQL statement delays the next flag check or prevents it from happening, and the kill does not take effect for as long as that condition lasts.
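The scenarios above come down to simple arithmetic, sketched below in Python. The function name ms_until_stopped and all of the numbers are made up for illustration; they are assumptions, not measurements, and real servers do not expose per-row costs this neatly. The point is only that a slower server stretches the wait for the next flag check, and a rollback adds its own cost on top.

```python
def ms_until_stopped(rows_per_block: int, ms_per_row: int,
                     rows_changed: int = 0, undo_ms_per_row: int = 0) -> int:
    """Worst-case time from KILL to the thread being gone, in this toy model.

    The thread first finishes the block it is in (the flag is not checked
    before that), then undoes the rows it has already changed.
    """
    wait_for_next_check = rows_per_block * ms_per_row
    rollback = rows_changed * undo_ms_per_row
    return wait_for_next_check + rollback


# Same query on an idle server and on a CPU-starved one (hypothetical costs).
idle_select = ms_until_stopped(rows_per_block=1000, ms_per_row=1)
busy_select = ms_until_stopped(rows_per_block=1000, ms_per_row=20)
# A large DML statement that had already changed two million rows.
big_delete = ms_until_stopped(rows_per_block=1000, ms_per_row=1,
                              rows_changed=2_000_000, undo_ms_per_row=1)
print(idle_select, busy_select, big_delete)
```

Under these assumed costs the killed select lingers for about 1 second on the idle server and 20 seconds on the busy one, while the killed DML statement lingers for more than half an hour because of the rollback.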
Simulating the problem
Here the innodb_thread_concurrency parameter is used to simulate a scenario in which normal SQL execution is blocked:
Defines the maximum number of threads permitted inside of InnoDB. A value of 0 (the default) is interpreted as infinite concurrency (no limit). This variable is intended for performance tuning on high concurrency systems.
According to the documentation, when this parameter is set low, InnoDB queries beyond the limit are blocked. For this simulation it was therefore set to a very low value.
mysql> show variables like '%innodb_thread_concurrency%';
+---------------------------+-------+
| Variable_name | Value |
+---------------------------+-------+
| innodb_thread_concurrency | 1 |
+---------------------------+-------+
1 row in set (0.00 sec)
Next, open two database connections (Session 1 and Session 2) and run select sleep(3600) from sbtest.sbtest1 in each, then kill the Session 2 query from a third connection:
Session 1:
mysql> select sleep(3600) from sbtest.sbtest1;
Session 2:
mysql> select sleep(3600) from sbtest.sbtest1;
ERROR 2013 (HY000): Lost connection to MySQL server during query
mysql>
Session 3:
mysql> show processlist;
+----+------+--------------------+------+---------+------+--------------+----------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+----+------+--------------------+------+---------+------+--------------+----------------------------------------+
| 44 | root | 172.16.64.10:39290 | NULL | Query | 17 | User sleep | select sleep(3600) from sbtest.sbtest1 |
| 45 | root | 172.16.64.10:39292 | NULL | Query | 0 | starting | show processlist |
| 46 | root | 172.16.64.10:39294 | NULL | Query | 5 | Sending data | select sleep(3600) from sbtest.sbtest1 |
+----+------+--------------------+------+---------+------+--------------+----------------------------------------+
3 rows in set (0.00 sec)
mysql> kill 46;
Query OK, 0 rows affected (0.00 sec)
mysql> show processlist;
+----+------+--------------------+------+---------+------+--------------+----------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+----+------+--------------------+------+---------+------+--------------+----------------------------------------+
| 44 | root | 172.16.64.10:39290 | NULL | Query | 26 | User sleep | select sleep(3600) from sbtest.sbtest1 |
| 45 | root | 172.16.64.10:39292 | NULL | Query | 0 | starting | show processlist |
| 46 | root | 172.16.64.10:39294 | NULL | Killed | 14 | Sending data | select sleep(3600) from sbtest.sbtest1 |
+----+------+--------------------+------+---------+------+--------------+----------------------------------------+
3 rows in set (0.00 sec)
mysql>
As shown, the Session 2 connection is dropped as soon as kill is executed, but the query that Session 2 started is still left behind in MySQL. If innodb_thread_concurrency is what caused the problem, raising the limit with set global, or setting it to 0, resolves it; a change to this parameter takes effect immediately for all connections.
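The experiment can be mirrored by a tiny Python model. It is a sketch only: tick_when_kill_takes_effect is a hypothetical name, time is reduced to discrete ticks, and the assumption that a waiting thread looks at its kill flag only after it has been admitted is a simplification chosen to match the behavior observed above, not a description of InnoDB's internals.

```python
from typing import Optional, Sequence


def tick_when_kill_takes_effect(limit_per_tick: Sequence[int],
                                threads_inside: int) -> Optional[int]:
    """Return the first tick at which a killed, waiting thread can exit.

    limit_per_tick[i] is the concurrency limit in force at tick i, where 0
    means "no limit", as with innodb_thread_concurrency. threads_inside is
    the number of other threads holding a slot for the whole simulation.
    The waiting thread has had its kill flag set since tick 0.
    """
    for tick, limit in enumerate(limit_per_tick):
        admitted = limit == 0 or threads_inside < limit
        if admitted:
            return tick  # it reaches its next flag check and aborts
    return None  # still shown as Killed when the simulation ends


# Limit stays at 1 while Session 1 holds the only slot: the kill never lands.
print(tick_when_kill_takes_effect([1, 1, 1, 1], threads_inside=1))
# The limit is set to 0 at tick 2: the killed thread exits at that tick.
print(tick_when_kill_takes_effect([1, 1, 0, 0], threads_inside=1))
```

The first call prints None and the second prints 2, which corresponds to the killed query disappearing once the limit is raised or removed.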
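Summary
A kill in MySQL does not forcibly terminate the database connection on the spot, as one might assume; it only sends a termination signal. If the SQL itself runs too slowly, or is affected by other factors (high server load, a large rollback), kill may well fail to stop the problem queries promptly. Worse, once the application-side connections are dropped, the application may reconnect and issue even more inefficient queries, dragging the database down further.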