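Problem description:
Today I logged in to the host of a MySQL slave node and found a large number of mysqld-relay-bin files under /var/lib/mysql, the oldest of them created back in 2018. I remembered that a slave deletes these files once it has applied the master's logged operations. (With MySQL's default setting, relay_log_purge=ON, it does delete them after they are applied, so a pile-up like this suggests the SQL thread has stopped applying them.) I therefore checked the slave's status and found the following error: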
mysql> show slave status\G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: *.*.*.*
Master_User: dbsync
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000095
Read_Master_Log_Pos: 869242147
Relay_Log_File: mysqld-relay-bin.000146
Relay_Log_Pos: 871280529
Relay_Master_Log_File: mysql-bin.000075
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB: cdb,cdb_admin
Replicate_Ignore_DB: mysql
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1594
Last_Error: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
Skip_Counter: 0
Exec_Master_Log_Pos: 871280384
Relay_Log_Space: 19994786573
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1594
Last_SQL_Error: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
1 row in set (0.00 sec)
ERROR:
No query specified
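The fields that matter in the output above are Slave_IO_Running, Slave_SQL_Running, Last_SQL_Errno and Relay_Master_Log_File. As a minimal sketch, assuming the `\G` output is piped in as text, the helpers below pull out one field and check whether both replication threads are running. The names `slave_field` and `replication_ok` are hypothetical helpers for illustration, not part of MySQL.

```shell
#!/bin/sh
# slave_field NAME: print the value of one field from `SHOW SLAVE STATUS\G`
# output read on stdin. (Hypothetical helper, not part of MySQL.)
slave_field() {
    awk -v key="$1" '
        {
            line = $0
            sub(/^[ \t]+/, "", line)          # \G output is usually indented
            prefix = key ":"
            if (index(line, prefix) == 1) {    # line must start with the field name
                value = substr(line, length(prefix) + 1)
                sub(/^[ \t]+/, "", value)
                print value
                exit
            }
        }'
}

# replication_ok: succeed only if both replication threads report "Yes".
replication_ok() {
    status=$(cat)
    [ "$(printf '%s\n' "$status" | slave_field Slave_IO_Running)" = "Yes" ] &&
    [ "$(printf '%s\n' "$status" | slave_field Slave_SQL_Running)" = "Yes" ]
}
```

A possible use would be `mysql -u root -p -e 'SHOW SLAVE STATUS\G' | slave_field Relay_Master_Log_File`, which for the status shown above would report the binary log the SQL thread is still working on.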
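Cause:
On the master node I had deleted the binary log files whose names begin with mysql-bin.00007, including mysql-bin.000075. The slave therefore could not find that file and could not continue replicating.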
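If the cause really is a missing binary log, it can be confirmed by comparing the file the slave still needs (Relay_Master_Log_File) with the list the master reports for `SHOW BINARY LOGS`. The sketch below assumes that list is fetched in tab-separated form (for example with `mysql -N -B -e 'SHOW BINARY LOGS'`) and piped in; `binlog_present` is a hypothetical helper name.

```shell
#!/bin/sh
# binlog_present NAME: succeed if NAME appears in the first column of
# `SHOW BINARY LOGS` output (log name, then file size) read from stdin.
# (Hypothetical helper for illustration.)
binlog_present() {
    awk -v want="$1" '
        $1 == want { found = 1 }
        END { exit (found ? 0 : 1) }'
}
```

Run on the master as `mysql -N -B -u root -p -e 'SHOW BINARY LOGS' | binlog_present mysql-bin.000075`, a non-zero exit status would mean the file the slave is waiting for is no longer available.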
Solutions:
1. Re-specify the replication position on the slave. (Did not work)
STOP SLAVE;
CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000095', MASTER_LOG_POS=869242147; -- a position that already exists in mysql-bin.000095 on the master
START SLAVE;
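On the slave node, show slave status still reported an error. I did not copy the exact message; I only remember that the errno was 1236, that Slave_IO_Running was not running while Slave_SQL_Running was, and that the description was roughly that a certain table in a certain database had a problem.
I tried several other positions (the position from the error, and the position the master had just written in mysql-bin.000095), but the error persisted.
In fact, the table data was already inconsistent. Taking the table named in the error description as an example, the slave held about 1,200 rows while the master had more than 1,900. Because the logs recording those operations were gone (I had deleted them), there was no recent consistent position to resume from unless the missing data was filled in by hand.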
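2. Rebuild the slave.
The data had diverged too far, and I suspected that more than one table was affected, so the clean option was to rebuild the slave from scratch.
1) Compare the database configuration on the master and slave nodes and make sure they are consistent. (I do not know why dual-master mode had been configured; in practice I only have one instance, running on the master node.)
2) Check the traffic on both the master and slave nodes (show processlist) and make sure no application traffic is reaching the slave that is about to be rebuilt.
3) Stop the slave threads on the master node. (I have not started them again since; I do not know whether that causes problems, still to be observed.)
4) Record the binary log position on the master node, then back up the databases: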
mysql> show master status;
+------------------+-----------+-------------------------------+------------------+
| File             | Position  | Binlog_Do_DB                  | Binlog_Ignore_DB |
+------------------+-----------+-------------------------------+------------------+
| mysql-bin.000095 | 871760173 | cdb,cdb_admin                 | mysql            |
+------------------+-----------+-------------------------------+------------------+
1 row in set (0.01 sec)
mysqldump -u root -p --databases cdb cdb_admin > bak.master.sql
5) To be safe, back up the databases on the slave node as well:
mysqldump -u root -p --databases cdb cdb_admin > bak.slave.sql
6) Start the rebuild: copy the master's backup file to the slave node and import it:
mysql -u root -p < bak.master.sql
7) On the slave node, re-specify the position from which to read the master's log:
STOP SLAVE;
CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000095', MASTER_LOG_POS=871760173; -- the master log position recorded earlier
START SLAVE;
8) Run show slave status on the slave node. Slave_IO_Running and Slave_SQL_Running are both running now, and on refreshing the status the Read_Master_Log_Pos value keeps increasing: replication has resumed.
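One caveat with the steps above: the position is read with show master status and the dump is taken by a separate command, so any write that lands on the master in between leaves the recorded position out of step with the dump. A common alternative, sketched here as an assumption rather than what was done in this incident, is to let mysqldump record the coordinates itself with `--master-data=2` (adding `--single-transaction` for a consistent snapshot, which assumes InnoDB tables). The dump then contains a commented CHANGE MASTER TO line near the top, and the hypothetical helper `dump_coordinates` below extracts the file name and position from it.

```shell
#!/bin/sh
# Taking the dump (not run here; shown as an assumed alternative to step 4):
#   mysqldump -u root -p --single-transaction --master-data=2 \
#       --databases cdb cdb_admin > bak.master.sql

# dump_coordinates: read a mysqldump file on stdin and print
# "<binlog file> <position>" from the commented CHANGE MASTER TO line
# that --master-data=2 writes into the dump. (Hypothetical helper.)
dump_coordinates() {
    sed -n "s/^-- CHANGE MASTER TO MASTER_LOG_FILE='\([^']*\)', MASTER_LOG_POS=\([0-9][0-9]*\);.*/\1 \2/p" |
        head -n 1
}
```

After importing the dump on the slave, `dump_coordinates < bak.master.sql` would give the two values to use in the CHANGE MASTER TO statement of step 7.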
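Summary:
When cleaning up files, pay attention to the positions at which the master writes and the slave reads the mysql-bin files! Before deleting anything, confirm that the log has already been read past on both the master and the slave side. Do not delete at random; otherwise the slave can no longer replicate, and even if you force a master log position on the slave or skip the error, data loss on the slave cannot be ruled out.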
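Rather than removing mysql-bin files by hand, the master can be asked to drop them with `PURGE BINARY LOGS TO '<file>'`, which deletes only the logs older than the named file. The sketch below assumes you have collected the Relay_Master_Log_File value from show slave status on every slave, one per line, and that all names share the same base name with zero-padded numbers so that they sort correctly as text. `purge_statement` is a hypothetical helper that only prints the statement; it does not connect to any server.

```shell
#!/bin/sh
# purge_statement: read one Relay_Master_Log_File value per line (one line
# per slave) and print a PURGE statement that keeps the oldest file any
# slave still needs. Fails if no input was given. (Hypothetical helper.)
purge_statement() {
    oldest=$(LC_ALL=C sort | head -n 1)   # zero-padded names sort as text
    [ -n "$oldest" ] || return 1
    printf "PURGE BINARY LOGS TO '%s';\n" "$oldest"
}
```

The printed statement is meant to be reviewed and then run on the master. In the situation described in this article it would have kept mysql-bin.000075 and everything after it.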