MySQL Replication is asynchronous and one-way. (Even master-master replication is 2 one-way relationships, not a single two-way relationship.)
The slave server copies events from the master's binary log into its own relay log, then applies events from its relay log to its own live database. There is no feedback mechanism from the slave to the master if applying events to the slave's database causes an unrecoverable error; the slave stops replicating and waits for help from an administrator.
This procedure will solve a data corruption problem that halts a master-master replication.
Specifically, we'll handle a primary key collision. The table important.stuff
has a typical AUTO_INCREMENT
primary key. Both servers in this master-master replication accept INSERTs to the same table. Because MySQL replication is asynchronous, it is possible for both masters to write a new row that uses the same primary key. When these new rows replicate, the receiving master will have a duplicate key on a column that must be unique, and replication will stop in both directions.
At the beginning of this lesson, replication should be working correctly.
Log in to a.example.com
From A, insert a new record in the important.stuff
table.
a ~ $ mysql -u root mysql> INSERT INTO important.stuff SET details='Gift from A to B'; Query OK, 1 row affected (0.03 sec)
Log in to b.example.com
Now on B, check that the record has replicated.
b ~ $ mysql -u root mysql> SELECT * FROM important.stuff ORDER BY id DESC LIMIT 1\G *************************** 1. row *************************** id: 7 details: Gift from A to B happened: 2013-03-27 04:09:40 1 rows in set (0.00 sec)
important.stuff
Now test that changes to B also replicate to A.
Still on B, insert a new record in the important.stuff
table.
mysql> INSERT INTO important.stuff SET details='Gift from B to A'; Query OK, 1 row affected (0.03 sec) mysql>
Now on A, check that the record has replicated.
mysql> SELECT * FROM important.stuff ORDER BY id DESC LIMIT 1\G *************************** 1. row *************************** id: 9 details: Gift from B to A happened: 2013-03-27 04:09:40 1 rows in set (0.00 sec)
important.stuff
In this step, we're going to force a collision by slowing down replication.
MySQL replication uses two threads on the slave machine. The I/O Thread copies binary log data from the master to the relay log on the slave. Then the SQL Thread applies changes from the relay log to the slave database.
Stopping the slave I/O thread is no different than having a short network outage between the two servers. Both servers are healthy, but they're unable to tell each other about changes. Because MySQL replication is asynchronous, neither server stops accepting client traffic.
On both servers, stop the I/O thread.
mysql> STOP SLAVE IO_THREAD;
Query OK, 0 rows affected (0.01 sec)
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State:
Master_Host: a.example.com
Master_User: replicator
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000001
Read_Master_Log_Pos: 3415
Relay_Log_File: mysql-relay-bin.000002
Relay_Log_Pos: 803
Relay_Master_Log_File: mysql-bin.000001
Slave_IO_Running: No
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 3415
Relay_Log_Space: 959
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 20
1 row in set (0.00 sec)
mysql>
SHOW SLAVE STATUS
on A shows Slave_IO_Running
is No
SHOW SLAVE STATUS
on B shows Slave_IO_Running
is No
Insert a new row on A
mysql> INSERT INTO important.stuff SET details='Unique data from A'; Query OK, 1 row affected (0.03 sec)
Insert something complicated on B
mysql> INSERT INTO important.stuff SET details='Unique data from B'; Query OK, 1 row affected (0.03 sec)
Restart the replication I/O Thread on both servers.
mysql> START SLAVE; Query OK, 0 rows affected (0.00 sec)
Now check replication status on both servers. The INSERTs have caused replication to halt.
mysql> show slave status\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: b.example.com Master_User: replicator Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysql-bin.000001 Read_Master_Log_Pos: 4127 Relay_Log_File: mysql-relay-bin.000003 Relay_Log_Pos: 253 Relay_Master_Log_File: mysql-bin.000001 Slave_IO_Running: Yes Slave_SQL_Running: No Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 1062 Last_Error: Error 'Duplicate entry '11' for key 'PRIMARY'' on query. Default database: ''. Query: 'INSERT INTO important.stuff SET details='Unique data from B'' Skip_Counter: 0 Exec_Master_Log_Pos: 3884 Relay_Log_Space: 1585 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: NULL Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 1062 Last_SQL_Error: Error 'Duplicate entry '11' for key 'PRIMARY'' on query. Default database: ''. Query: 'INSERT INTO important.stuff SET details='Unique data from B'' Replicate_Ignore_Server_Ids: Master_Server_Id: 20 1 row in set (0.00 sec) mysql>
SHOW SLAVE STATUS
on A shows Slave_IO_Running
is Yes
SHOW SLAVE STATUS
on A shows Slave_SQL_Running
is No
SHOW SLAVE STATUS
on A shows Last_Errno
is 1062
SHOW SLAVE STATUS
on B shows Slave_IO_Running
is Yes
SHOW SLAVE STATUS
on B shows Slave_SQL_Running
is No
SHOW SLAVE STATUS
on B shows Last_Errno
is 1062
MySQL has two settings that can help master-master pairs avoid duplicate auto-increment values.
auto_increment_increment
tells MySQL how much to increase the next AUTO_INCREMENT
value. The default increment is 1: the next value is the last value +1. Changing the increment to 2 will make MySQL use every-other value (e.g., 1,3,5,7 instead of 1,2,3,4).
auto_increment_offset
tells MySQL where to start numbering AUTO_INCREMENT values. The default offset is 1.
The formula for what AUTO_INCREMENT value to assign the Nth row is auto_increment_offset + (N * auto_increment_increment)
We're going to use these settings so that A uses odd numbers (offset 1, increment 2) and B uses even numbers (offset 2, increment 2). This will prevent future auto_increment key collisions.
On server A, edit /etc/my.cnf
mysql> exit a ~ $ sudoedit /etc/my.cnf
Add these two lines to directly under [mysqld]
/etc/my.cnf[mysqld] auto_increment_increment = 2 auto_increment_offset = 1 relay_log = mysql-relay-bin log_slave_updates = 1 log_bin = mysql-bin server_id = 10 ......some content not shown
Now restart MySQL:
a ~ $ sudo service mysqld restart Stopping mysqld: [ OK ] Starting mysqld: [ OK ]
On server B, edit /etc/my.cnf
mysql> exit b ~ $ sudoedit /etc/my.cnf
Add these two lines to directly under [mysqld]
/etc/my.cnf[mysqld] auto_increment_increment = 2 auto_increment_offset = 2 relay_log = mysql-relay-bin log_slave_updates = 1 log_bin = mysql-bin server_id = 20 ......some content not shown
Now restart MySQL:
b ~ $ sudo service mysqld restart Stopping mysqld: [ OK ] Starting mysqld: [ OK ]
/etc/my.cnf
sets auto_increment_increment
to 2
/etc/my.cnf
sets auto_increment_offset
to 1
auto_increment_increment
has value 2
auto_increment_offset
has value 1
/etc/my.cnf
sets auto_increment_increment
to 2
/etc/my.cnf
sets auto_increment_offset
to 2
auto_increment_increment
has value 2
auto_increment_offset
has value 1
The new AUTO_INCREMENT settings will help prevent this problem with future INSERTs, but it won't fix the rows that are blocking replication right now.
Look at the Last_Error
in the slave status to see the id of the row causing the problem:
b ~ $ mysql -u root mysql> show slave status\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: a.example.com ......some content not shown Replicate_Wild_Ignore_Table: Last_Errno: 1062 Last_Error: Error 'Duplicate entry '11' for key 'PRIMARY'' on query. Default database: ''. Query: 'INSERT INTO important.stuff SET details='Unique data from B'' Skip_Counter: 0 Exec_Master_Log_Pos: 3882 ......some content not shown
Note the value of the duplicate entry. (It's 11 in this example.) Look up that record:
mysql> select * from important.stuff where id=11;
+----+--------------------+---------------------+
| id | details | happened |
+----+--------------------+---------------------+
| 11 | Unique data from B | 2013-04-06 19:33:44 |
+----+--------------------+---------------------+
1 row in set (0.00 sec)
mysql>
Now delete the row with that ID, only from B. It's important that the act of deleting the row never replicate to A, or it will delete the unique data with the same id. By turning off binary logging, this change will apply to the local database, but won't be seen by the I/O thread on A.
mysql> SET sql_log_bin="OFF";
Query OK, 0 rows affected (0.00 sec)
mysql> show variables like "sql_log_bin";
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| sql_log_bin | OFF |
+---------------+-------+
1 row in set (0.00 sec)
mysql> delete from important.stuff where id=11;
Query OK, 1 row affected (0.03 sec)
Now turn binary logging back on, then re-INSERT the row. We want this version of the INSERT to replicate to A (and of course everything after this.) Providing a NULL value for the id will let us make sure the new auto-increment settings work.
mysql> SET sql_log_bin="ON"; Query OK, 0 rows affected (0.00 sec) mysql> insert into important.stuff VALUES(NULL, 'Unique data from B', '2013-04-06 19:33:44'); Query OK, 1 row affected (0.02 sec)
Select the last row from the table to see what id was assigned to the new row.
mysql> select * from important.stuff order by id DESC LIMIT 1;
+----+--------------------+---------------------+
| id | details | happened |
+----+--------------------+---------------------+
| 12 | Unique data from B | 2013-04-06 19:33:44 |
+----+--------------------+---------------------+
1 row in set (0.00 sec)
mysql>
Note the new id. It should be higher than the old value and, because it was inserted on B, it should be even.
Start replication on B.
mysql> slave start; Query OK, 0 rows affected (0.00 sec) mysql> show slave status\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: a.example.com Master_User: replicator Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysql-bin.000002 Read_Master_Log_Pos: 107 Relay_Log_File: mysql-relay-bin.000006 Relay_Log_Pos: 253 Relay_Master_Log_File: mysql-bin.000002 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 107 Relay_Log_Space: 555 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 10 1 row in set (0.00 sec)
The formerly conflicting row from A now happily coexists with the new row on B.
mysql> select * from important.stuff where id >= 11; +----+--------------------+---------------------+ | id | details | happened | +----+--------------------+---------------------+ | 11 | Unique data from A | 2013-04-06 19:33:44 | | 12 | Unique data from B | 2013-04-06 19:33:44 | +----+--------------------+---------------------+ 2 rows in set (0.00 sec) mysql>
SHOW SLAVE STATUS
on B shows Slave_IO_Running
is Yes
SHOW SLAVE STATUS
on B shows Slave_SQL_Running
is Yes
SHOW SLAVE STATUS
on B shows Last_Errno
is 0
on A, we still need to skip the insert from B with the bad index.
a ~ $ mysql -u root mysql> SET GLOBAL sql_slave_skip_counter = 1; Query OK, 0 rows affected (0.00 sec) mysql>
Now start replication on A.
mysql> slave start; Query OK, 0 rows affected (0.00 sec) mysql>
You should see the row from B with the new id.
mysql> select * from important.stuff where id >= 11; +----+--------------------+---------------------+ | id | details | happened | +----+--------------------+---------------------+ | 11 | Unique data from A | 2013-04-06 19:33:44 | | 12 | Unique data from B | 2013-04-06 19:33:44 | +----+--------------------+---------------------+ 2 rows in set (0.00 sec) mysql>
SHOW SLAVE STATUS
on A shows Slave_IO_Running
is Yes
SHOW SLAVE STATUS
on A shows Slave_SQL_Running
is Yes
SHOW SLAVE STATUS
on A shows Last_Errno
is 0
Re-test normal replication, following the instructions from step 1 and step 2.
important.stuff
important.stuff
Re-test paused replication, following the instructions from step 3 and step 4.
This time, the changes we made in step 5 will prevent a duplicate key, and replication will not halt.
important.stuff
important.stuff
SHOW SLAVE STATUS
shows Last_Errno
is 0
SHOW SLAVE STATUS
shows Last_Errno
is 0