Mysql 5.7 gtid主从复制错误处理

Posted by Sunday on 2018-04-24

环境

主库IP:192.168.11.31
从库IP:192.168.11.32
数据库:MySQL 5.7 gtid 主从复制
原因:由于从库没有限制好权限,开发人员在从库插入数据从而导致主从同步错误。

在主库31 新建库

1
2
3
4
5
6
7
8
CREATE DATABASE test;
CREATE TABLE `test`.`info_test` (
`id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'ID',
`name` varchar(20) NOT NULL DEFAULT '' COMMENT '姓名',
`country` varchar(20) NOT NULL DEFAULT '' COMMENT '国家/城市',
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='测试表';
`

在主库31插入数据

1
2
3
INSERT INTO `test`.`info_test` (`id`, `name`, `country`) VALUES ('001', '李幸倪', '马来西亚');
INSERT INTO `test`.`info_test` (`id`, `name`, `country`) VALUES ('002', '陈奕迅', '中国香港');
INSERT INTO `test`.`info_test` (`id`, `name`, `country`) VALUES ('003', 'Taylor Swift', '美国');

查看同步状态

主库31查看表数据

1
2
3
4
5
6
7
8
9
mysql> select * from test.info_test;
+----+--------------+--------------+
| id | name | country |
+----+--------------+--------------+
| 1 | 李幸倪 | 马来西亚 |
| 2 | 陈奕迅 | 中国香港 |
| 3 | Taylor Swift | 美国 |
+----+--------------+--------------+
3 rows in set (0.00 sec)

从库32查看表数据

1
2
3
4
5
6
7
8
9
mysql> select * from test.info_test;
+----+--------------+--------------+
| id | name | country |
+----+--------------+--------------+
| 1 | 李幸倪 | 马来西亚 |
| 2 | 陈奕迅 | 中国香港 |
| 3 | Taylor Swift | 美国 |
+----+--------------+--------------+
3 rows in set (0.00 sec)

模拟故障

从库32插入id 4

1
mysql> INSERT INTO `test`.`info_test` (`id`, `name`, `country`) VALUES ('004', '罗志祥', '中国台湾');

主库31插入id 4

1
mysql> INSERT INTO `test`.`info_test` (`id`, `name`, `country`) VALUES ('004', '叶巧琳', '中国香港');

从库报错1062

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
mysql> show slave status\G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.11.31
Master_User: slave
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: master-bin.000001
Read_Master_Log_Pos: 3175
Relay_Log_File: db2-relay-bin.000003
Relay_Log_Pos: 2917
Relay_Master_Log_File: master-bin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1062
Last_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 4 failed executing transaction '834449ff-4487-11e8-8b27-000c294b06ca:11' at master log master-bin.000001, end_log_pos 3144. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
Skip_Counter: 0
Exec_Master_Log_Pos: 2762
Relay_Log_Space: 3898
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1062
Last_SQL_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 4 failed executing transaction '834449ff-4487-11e8-8b27-000c294b06ca:11' at master log master-bin.000001, end_log_pos 3144. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
Replicate_Ignore_Server_Ids:
Master_Server_Id: 31
Master_UUID: 834449ff-4487-11e8-8b27-000c294b06ca
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State:
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp: 180424 12:09:16
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 834449ff-4487-11e8-8b27-000c294b06ca:1-11
Executed_Gtid_Set: 68303133-4489-11e8-84e9-000c293eaee6:1,
834449ff-4487-11e8-8b27-000c294b06ca:1-10
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)

ERROR:
No query specified

直接跳过错误的GTID事务

查看出错内容,ID

GTID的复制对于错误信息的可读性不是很好,但可以通过错误代码(1060)从监控表replication_applier_status_by_worker查看:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
mysql> select * from performance_schema.replication_applier_status_by_worker where LAST_ERROR_NUMBER=1062\G;
*************************** 1. row ***************************
CHANNEL_NAME:
WORKER_ID: 4
THREAD_ID: NULL
SERVICE_STATE: OFF
LAST_SEEN_TRANSACTION: 834449ff-4487-11e8-8b27-000c294b06ca:11
LAST_ERROR_NUMBER: 1062
LAST_ERROR_MESSAGE: Worker 4 failed executing transaction '834449ff-4487-11e8-8b27-000c294b06ca:11' at master log master-bin.000001, end_log_pos 3144; Could not execute Write_rows event on table test.info_test; Duplicate entry '4' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log master-bin.000001, end_log_pos 3144
LAST_ERROR_TIMESTAMP: 2018-04-24 12:09:16
1 row in set (0.00 sec)

ERROR:
No query specified

使用GTID跳过错误的方法:找到错误的GTID跳过(通过Exec_Master_Log_Pos去binlog里找GTID,或则通过上面监控表找到GTID,也可以通过Executed_Gtid_Set算出GTID),这里使用监控表来找到错误的GTID。找到GTID之后,跳过错误的步骤:

跳过事务操作

1
2
3
4
5
6
mysql> stop slave;   #停止同步
mysql> set gtid_next='834449ff-4487-11e8-8b27-000c294b06ca:11'; #跳过错误的GTID
mysql> begin; #提交一个空事务
mysql> commit;
mysql> set gtid_next='automatic'; 设置回自动模式
mysql> start slave;
1
2
3
4
传统方式
mysql> stop slave;
mysql> SET GLOBAL SQL_SLAVE_SKIP_COUNTER = n;
mysql> start slave;

恢复正常 但数据不一致

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
mysql> show slave status\G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.11.31
Master_User: slave
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: master-bin.000001
Read_Master_Log_Pos: 3175
Relay_Log_File: db2-relay-bin.000004
Relay_Log_Pos: 457
Relay_Master_Log_File: master-bin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 3175
Relay_Log_Space: 3838
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 31
Master_UUID: 834449ff-4487-11e8-8b27-000c294b06ca
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 834449ff-4487-11e8-8b27-000c294b06ca:1-11
Executed_Gtid_Set: 68303133-4489-11e8-84e9-000c293eaee6:1,
834449ff-4487-11e8-8b27-000c294b06ca:1-11
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)

ERROR:
No query specified

同步恢复正常,但会导致主从的数据不一致

主库31

1
2
3
4
5
6
7
8
9
10
11
12
mysql> select * from info_test;
+----+--------------+--------------+
| id | name | country |
+----+--------------+--------------+
| 1 | 李幸倪 | 马来西亚 |
| 2 | 陈奕迅 | 中国香港 |
| 3 | Taylor Swift | 美国 |
| 4 | 叶巧琳 | 中国香港 |
| 5 | Zayn | 英国 |
+----+--------------+--------------+
5 rows in set (0.00 sec)
·

从库32

1
2
3
4
5
6
7
8
9
10
11
mysql> select * from info_test;
+----+--------------+--------------+
| id | name | country |
+----+--------------+--------------+
| 1 | 李幸倪 | 马来西亚 |
| 2 | 陈奕迅 | 中国香港 |
| 3 | Taylor Swift | 美国 |
| 4 | 罗志祥 | 中国台湾 |
| 5 | Zayn | 英国 |
+----+--------------+--------------+
5 rows in set (0.00 sec)

一致性解决

http://www.cnblogs.com/zhoujinyi/p/5704567.html