[MySQL] Failures caused by 5.6/5.7 parallel replication bugs: ERROR 1755/1756

Published: 2020-08-09 10:57:08  Source: ITPUB blog  Views: 579  Author: 麻花vodka  Category: MySQL database

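I recently set up many master-slave groups using parallel replication (MTS). Most were 5.6 -> 5.7, and a few were 5.6 -> 5.6.
The groups were configured almost identically, yet some of them (not all) intermittently hit the errors below: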

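〇 ERROR 1755:
Scenario:
    Master (5.6) -> Slave (5.6/5.7)

Relevant configuration:
    Parallel replication enabled on the slave:
    slave_parallel_workers >= 1

Error reported on the slave (this output is from a 5.7 slave; 5.6 is similar, but the reason differs):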

……
Slave_IO_Running: Yes
Slave_SQL_Running: No
……
Last_Errno: 1755
Last_Error: Cannot execute the current event group in the parallel mode. Encountered event Gtid, relay-log name {directory}/relaylog/mysql-relay.000002, position 280408 which prevents execution of this event group in parallel mode. Reason: The master event is logically timestamped incorrectly..
……
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1755
Last_SQL_Error: Cannot execute the current event group in the parallel mode. Encountered event Gtid, relay-log name {directory}/relaylog/mysql-relay.000002, position 280408 which prevents execution of this event group in parallel mode. Reason: The master event is logically timestamped incorrectly..
Replicate_Ignore_Server_Ids:
……

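The error message is explicit:
Cannot execute the current event group in the parallel mode
That is, the current event group cannot be applied in parallel mode.

A 5.6 slave can run into this problem as well.

Since the message and its reason are clear, turning parallel replication off works around it: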

STOP SLAVE;
SET GLOBAL slave_parallel_workers=0;
START SLAVE;

The same 1755 error can carry different reasons. So far I have collected these three from the logs:

① Reason: The master event is logically timestamped incorrectly (this one may also be related to setting slave_parallel_type="LOGICAL_CLOCK" on 5.7)
② Reason: possible malformed group of events from an old master
③ Reason: the event is a part of a group that is unsupported in the parallel execution mode.
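When triaging many slaves, it can help to pull the reason out of Last_SQL_Error programmatically. The following is a minimal sketch, assuming the message ends with a "Reason: ..." clause as in the output quoted above; the function name extract_mts_reason is hypothetical and not part of any MySQL tooling.

```python
import re
from typing import Optional

# Matches the trailing "Reason: ..." clause of a 1755 message.
# An ASCII or a full-width colon is accepted after "Reason".
_REASON_RE = re.compile(r"Reason\s*[:：]\s*(.+?)\.*\s*$", re.DOTALL)


def extract_mts_reason(last_sql_error: str) -> Optional[str]:
    """Return the text after 'Reason:' without trailing periods, or None."""
    match = _REASON_RE.search(last_sql_error)
    return match.group(1).strip() if match else None


if __name__ == "__main__":
    message = (
        "Cannot execute the current event group in the parallel mode. "
        "Reason: The master event is logically timestamped incorrectly.."
    )
    print(extract_mts_reason(message))
```

The returned string can then be compared against the three reasons listed above to decide which case a given slave is in.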

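The cause can be summarized as follows:
in an older 5.6 -> newer 5.6 or 5.7 replication topology, the master's events do not carry the information that parallel replication needs.

This can happen with both 5.6 and 5.7 slaves and has been acknowledged as a bug; see:
https://bugs.mysql.com/bug.php?id=71495
https://bugs.mysql.com/bug.php?id=72537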

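〇 ERROR 1756
Scenario:
    Master (5.6) -> Slave (5.7)

Relevant configuration:
    Parallel replication enabled on the slave:
    slave_parallel_workers >= 1
    slave_parallel_type = 'LOGICAL_CLOCK'

Unlike 1755, error 1756 appears to have a wider range of possible causes.

Error reported on the slave: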

……
Slave_IO_Running: Yes
Slave_SQL_Running: No
……
Last_Errno: 1756
Last_Error: … The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state. A restart should restore consistency automatically, although using non-transactional storage for data or info tables or DDL queries could lead to problems. In such cases you have to examine your data (see documentation for details).
……
Last_SQL_Errno: 1756
Last_SQL_Error: … The slave coordinator and worker threads are stopped, possibly leaving data in inconsistent state. A restart should restore consistency automatically, although using non-transactional storage for data or info tables or DDL queries could lead to problems. In such cases you have to examine your data (see documentation for details).
……

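The cause in this case:
the slave's parallel dispatch policy is set to LOGICAL_CLOCK, but a 5.6 master supports only database-level parallel replication.

So why does LOGICAL_CLOCK on 5.7 trigger this problem?

Because 5.7 adds two new fields to binlog events: last_committed and sequence_number.
The former records, at the time a transaction commits, the number of the most recently committed transaction. Transactions that share the same last_committed belong to the same group and can be applied in parallel.
These two fields are the foundation of the LOGICAL_CLOCK parallel replication introduced in 5.7.
They are written whether GTID is enabled or not.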
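The grouping rule just described can be illustrated with a small sketch. It is a simplified model of "same last_committed means same group", not the actual 5.7 applier logic. The sample lines are made up for illustration, and the assumption that the two fields appear as `last_committed=N sequence_number=M` in decoded binlog text should be checked against your own mysqlbinlog output.

```python
import re
from collections import OrderedDict

# Assumed textual form of the two fields in decoded binlog output.
_CLOCK_RE = re.compile(r"last_committed=(\d+)\s+sequence_number=(\d+)")


def group_by_last_committed(lines):
    """Map last_committed -> list of sequence_number, in order of appearance."""
    groups = OrderedDict()
    for line in lines:
        match = _CLOCK_RE.search(line)
        if match is None:
            continue  # line carries no logical-clock fields
        last_committed = int(match.group(1))
        sequence_number = int(match.group(2))
        groups.setdefault(last_committed, []).append(sequence_number)
    return groups


if __name__ == "__main__":
    # Hypothetical sample lines, not taken from a real binlog.
    sample = [
        "GTID last_committed=0 sequence_number=1",
        "GTID last_committed=1 sequence_number=2",
        "GTID last_committed=1 sequence_number=3",
        "GTID last_committed=1 sequence_number=4",
        "GTID last_committed=4 sequence_number=5",
    ]
    for last_committed, members in group_by_last_committed(sample).items():
        print(last_committed, members)
```

In this model, transactions 2, 3 and 4 share last_committed=1 and would be candidates for parallel apply, while lines without the fields yield no groups at all.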

In the 5.7 source, MYSQL_BIN_LOG defines two Logical_clock members:

class MYSQL_BIN_LOG: public TC_LOG
{
...
public:
/* Committed transactions timestamp */
Logical_clock max_committed_transaction;
/* "Prepared" transactions timestamp */
Logical_clock transaction_counter;
...

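max_committed_transaction: records the logical_clock of the last committed transaction, i.e. last_committed.
transaction_counter: records the logical_clock of each transaction in the current group commit, i.e. sequence_number.

Binlogs produced by 5.6 contain neither field, so a 5.7 slave has nothing on which to base LOGICAL_CLOCK parallel replication.

In this case the fix is straightforward: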

STOP SLAVE;

SET GLOBAL slave_parallel_type="DATABASE";

START SLAVE;

Alternatively, disable parallel replication as for 1755 by setting slave_parallel_workers=0.
Unfortunately, this is not the only cause of error 1756.
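The two workarounds described so far can be summarized in a small helper. This is only a sketch of the decision made in this article; the function name remediation_statements is hypothetical, and it deliberately covers nothing beyond the two cases discussed above (other causes of 1756 need separate diagnosis).

```python
def remediation_statements(errno, slave_parallel_type):
    """Return the SQL statements this article suggests for errors 1755/1756.

    errno: value of Last_SQL_Errno on the slave.
    slave_parallel_type: current setting on the slave, e.g. "LOGICAL_CLOCK".
    """
    if errno == 1756 and slave_parallel_type.upper() == "LOGICAL_CLOCK":
        # A 5.6 master writes no logical-clock fields: fall back to DATABASE.
        change = 'SET GLOBAL slave_parallel_type="DATABASE";'
    elif errno in (1755, 1756):
        # Otherwise turn parallel replication off entirely.
        change = "SET GLOBAL slave_parallel_workers=0;"
    else:
        raise ValueError("only errors 1755 and 1756 are covered by this sketch")
    return ["STOP SLAVE;", change, "START SLAVE;"]


if __name__ == "__main__":
    for statement in remediation_statements(1756, "LOGICAL_CLOCK"):
        print(statement)
```

The helper only builds the statements; running them against a slave is left to whatever client you already use.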

For more cases, see:
https://bugs.mysql.com/bug.php?id=69369
https://bugs.mysql.com/bug.php?id=77239
………………

One interesting case: Yoshinori Matsunobu, the author of MHA, also ran into ERROR 1756:
https://bugs.mysql.com/bug.php?id=68465
The cause was that parallel replication does not support slave_transaction_retries.

He found the following in rpl_slave.cc:
----
/* MTS technical limitation no support of trans retry */
if (mi->rli->opt_slave_parallel_workers != 0 && slave_trans_retries != 0
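The truncated condition quoted above can be read as a simple predicate over the two settings. The sketch below only mirrors that condition; the function name is hypothetical, and what the server does once the condition holds is not shown in the excerpt, so it is not modeled here.

```python
def mts_retry_limitation_applies(slave_parallel_workers: int,
                                 slave_transaction_retries: int) -> bool:
    """Mirror the quoted check: both settings are non-zero at the same time."""
    return slave_parallel_workers != 0 and slave_transaction_retries != 0


if __name__ == "__main__":
    # Hypothetical values: 4 workers with a non-zero retry count.
    print(mts_retry_limitation_applies(4, 10))
```

A slave on an affected version for which this returns True is relying on transaction retries that the quoted comment says MTS does not support.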


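Steps to reproduce:
1. Set slave_transaction_retries to a fairly high value.
2. Enable parallel replication: slave_parallel_workers >= 1.
3. On the slave, hold an InnoDB lock on table t1 for a long time, e.g. BEGIN; UPDATE t1 SET a=100;
4. On the master, run and commit a conflicting statement so that it is replicated to the slave, e.g. UPDATE t1 SET a=100 WHERE id=1;

The ERROR 1756 caused by this bug was fixed in 5.7.5.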

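To sum up ERROR 1755/1756:
Avoid parallel replication across versions.
Upgrade to a later 5.6.x release and avoid parallel replication on early 5.6 releases.
Upgrade to a later 5.7.x release and avoid early 5.7 releases.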
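The first recommendation can be checked mechanically by comparing the master's and slave's version strings (for example, the output of SELECT VERSION() on each side). This is a minimal sketch: the function names are hypothetical, the version strings in the example are made up, and "cross-version" is interpreted here as a difference in the major.minor series.

```python
import re


def parse_version(version: str):
    """Parse a string such as '5.6.16-log' into (5, 6, 16)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    return tuple(int(part) for part in match.groups())


def is_cross_version(master_version: str, slave_version: str) -> bool:
    """True when master and slave differ in their major.minor series."""
    return parse_version(master_version)[:2] != parse_version(slave_version)[:2]


if __name__ == "__main__":
    # Hypothetical version strings for illustration.
    print(is_cross_version("5.6.16-log", "5.7.9"))
    print(is_cross_version("5.6.16-log", "5.6.30"))
```

A True result flags a topology where, per the summary above, parallel replication is best avoided.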

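〇 References

MySQL 5.7并行復制實現原理與調優 (How MySQL 5.7 parallel replication works, and how to tune it) by 姜承堯
從MySQL 5.6到5.7復制錯誤解決 (Resolving replication errors from MySQL 5.6 to 5.7), author unknown
https://bugs.mysql.com/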
