Promote a Slave to Replace a Failed Master

Background

When a master server fails, a replicated slave can be "promoted" to take over for it.

If you have several slaves, you need to pick the best slave to promote, and then reconfigure the remaining slaves to replicate from the promoted slave.

Because MySQL Replication is asynchronous, on systems under heavy load, some slaves may not have received every transaction the master committed before it failed. (This can be mitigated in a planned outage.)

So it is necessary before promoting a slave to make sure that it has the most complete transaction history available. Then when changing the other slaves to replicate from the new master, each needs to start replicating from the last transaction that slave already knows about.

Steps

Determine Which Slave to Promote

First, review the replication status on each slave to see which has received the most data from the now-dead master.

The most up-to-date slave will have the highest Master_Log_File number. If both slaves were reading from the same file, the one with the highest Read_Master_Log_Pos is most up-to-date.

Connect to slave_a.example.com

slave_a ~ $ mysql -u root
mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Connecting to master
                  Master_Host: master.example.com
                  Master_User: replicator
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000001
          Read_Master_Log_Pos: 84840
               Relay_Log_File: mysql-relay-bin.000004
                Relay_Log_Pos: 4
        Relay_Master_Log_File: mysql-bin.000001
             Slave_IO_Running: Connecting
            Slave_SQL_Running: Yes
......some content not shown

Connect to slave_b.example.com

slave_b ~ $ mysql -u root
mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Connecting to master
                  Master_Host: master.example.com
                  Master_User: replicator
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000001
          Read_Master_Log_Pos: 90308
               Relay_Log_File: mysql-relay-bin.000004
                Relay_Log_Pos: 4
        Relay_Master_Log_File: mysql-bin.000001
             Slave_IO_Running: Connecting
            Slave_SQL_Running: Yes
......some content not shown

Here, slave_b is further ahead than slave_a.

Promote the New Master

Because slave_b is more up-to-date, it will become the new master.

First, make sure this server can't be interrupted accidentally by any efforts to recover the old master, by stopping replication:

mysql> slave stop;
Query OK, 0 rows affected (0.02 sec)

mysql> reset slave all;
Query OK, 0 rows affected (0.06 sec)

Now that slave_b will become a writeable master instead of a read slave, we need to un-set the read_only configuration.

On slave_b edit /etc/my.cnf

mysql> exit
slave_b ~ $ sudoedit /etc/my.cnf

Delete the line read_only = true

/etc/my.cnf
[mysqld] read_only = true relay_log = mysql-relay-bin log_slave_updates = 1 log_bin = mysql-bin server_id = 30 ......some content not shown

Now restart MySQL:

slave_b ~ $ sudo service mysqld restart
Stopping mysqld:                                           [  OK  ]
Starting mysqld:                                           [  OK  ]
After this step is completed:
  1. On slave_b, SHOW SLAVE STATUS shows no replication configuration.
  2. On slave_b, /etc/my.cnf does not set read_only
  3. On slave_b, the MySQL variable read_only has value OFF
Start Remaining Slaves Replicating from the New Master

First, you need to find the last data slave_a received from the master. Begin by finding the binary log file slave_a is currently writing to.

mysql> show master status;
+------------------+----------+--------------+------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------+----------+--------------+------------------+
| mysql-bin.000003 |    38755 |              |                  |
+------------------+----------+--------------+------------------+
1 row in set (0.00 sec)

Now we can use mysqlbinlog to read the binary log file and find the last statement this server replicated from the old master.

MySQL's data directory is /var/lib/mysql/ in this course. This is set by the datadir setting in /etc/my.cnf

mysql> exit
Bye
slave_a ~ $ sudo mysqlbinlog /var/lib/mysql/mysql-bin.000003 | tail -n20
#130410 20:38:59 server id 10  end_log_pos 82400     Query    thread_id=7    exec_time=0    error_code=0
SET TIMESTAMP=1365626339/*!*/;
BEGIN
/*!*/;
# at 82400
#130410 20:38:59 server id 10  end_log_pos 82428     Intvar
SET INSERT_ID=298/*!*/;
# at 82428
#130410 20:38:59 server id 10  end_log_pos 82581     Query    thread_id=7    exec_time=0    error_code=0
SET TIMESTAMP=1365626339/*!*/;
INSERT INTO stuff VALUES (NULL, 'sing zone wrap swap', '2013-04-09 03:02:38')
/*!*/;
# at 82581
#130410 20:38:59 server id 10  end_log_pos 82608     Xid = 608
COMMIT/*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;
        

Now on the new master, slave_b find the corresponding event. The central problem now is that when an event replicates from the master, where it is stored in the slave's binary log is entirely up to the slave. If slaves are rebooted or upgraded at different times, their binary log layout can be completely different.

Start by seeing how many binary log files we need to look through.

slave_b ~ $ cd /var/lib/mysql
slave_b /var/lib/mysql $ ls mysql-bin.*
mysql-bin.000001  mysql-bin.000003  mysql-bin.index
mysql-bin.000002  mysql-bin.000004

Now search these files for the exact statement that was the last to replicate to slave_a. When you find it, you want to note the log location of the event that BEGINs next.

slave_b /var/lib/mysql $ sudo mysqlbinlog mysql-bin.000001 | grep "INSERT INTO stuff VALUES (NULL, 'shin noon yard him', '2013-04-09 07:00:08')" --context 10
slave_b /var/lib/mysql $ sudo mysqlbinlog mysql-bin.000002 | grep "INSERT INTO stuff VALUES (NULL, 'shin noon yard him', '2013-04-09 07:00:08')" --context 10
#130411  1:59:03 server id 10  end_log_pos 18811     Query    thread_id=8    exec_time=0    error_code=0
SET TIMESTAMP=1365645543/*!*/;
BEGIN
/*!*/;
# at 18811
#130411  1:59:03 server id 10  end_log_pos 18839     Intvar
SET INSERT_ID=296/*!*/;
# at 18839
#130411  1:59:03 server id 10  end_log_pos 18991     Query    thread_id=8    exec_time=0    error_code=0
SET TIMESTAMP=1365645543/*!*/;
INSERT INTO stuff VALUES (NULL, 'shin noon yard him', '2013-04-09 07:00:08')
/*!*/;
# at 18991
#130411  1:59:03 server id 10  end_log_pos 19018     Xid = 606
COMMIT/*!*/;
# at 19018
#130411  1:59:03 server id 10  end_log_pos 19085     Query    thread_id=9    exec_time=0    error_code=0
SET TIMESTAMP=1365645543/*!*/;
BEGIN
/*!*/;
# at 19085
slave_b /var/lib/mysql $ 

In the output above, we found the last statement to replicate to slave_a, in file mysql-bin.000002. The next statement (that will be news to slave_a) begins at position 19085

Now restart replication on slave_a pulling events after the most recent from slave_b. Be sure you're using the file and position you just collected from slave_b.

slave_a ~ $ mysql -u root
mysql> slave stop;
Query OK, 0 rows affected (0.03 sec)

mysql> CHANGE MASTER TO MASTER_HOST='slave_b.example.com', MASTER_LOG_FILE='mysql-bin.000002', MASTER_LOG_POS=19085;
Query OK, 0 rows affected (0.05 sec)

mysql> slave start;
Query OK, 0 rows affected (0.00 sec)

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: slave_b.example.com
                  Master_User: replicator
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000004
          Read_Master_Log_Pos: 107
               Relay_Log_File: mysql-relay-bin.000004
                Relay_Log_Pos: 253
        Relay_Master_Log_File: mysql-bin.000004
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 107
              Relay_Log_Space: 555
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 30
1 row in set (0.00 sec)
After this step is completed:
  1. On slave_a, SHOW SLAVE STATUS shows Master_Host is slave_b.example.com
  2. On slave_a, SHOW SLAVE STATUS shows Slave_IO_Running is Yes
  3. On slave_a, SHOW SLAVE STATUS shows Slave_SQL_Running is Yes
  4. On slave_a, SHOW SLAVE STATUS shows Last_Errno is 0

Verify the New Slave has Complete Data

No process is foolproof. How can you ensure that slave_a now contains exactly the same data slave_b does?

Because both servers are lightly loaded and the data set is small, we can make a complete backup of our application data (the important database) and get an MD5 digest of the backup. If both are identical, the MD5 will be identical. If they differ, you could diff the backups or otherwise work with your application to find and mitigate problems.

On both servers:

slave_a ~ $ mysqldump -u root important --skip-dump-date | md5sum
38f3cbbdb073619c744465b4c48c0463  -
slave_b ~ $ mysqldump -u root important --skip-dump-date | md5sum
38f3cbbdb073619c744465b4c48c0463  -

In this example, the MD5 digests match, so the data must be identical.

Test restored replication

Now, insert some new data on the new master, slave_b.

slave_b ~ $ mysql -u root
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 6
Server version: 5.5.29-log MySQL Community Server (GPL)

Copyright (c) 2000, 2012, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> insert into important.stuff set details='Now I am the master';
Query OK, 1 row affected (0.03 sec)

mysql>

The row you just inserted should show up on the slave_a server almost immediately.

mysql> select * from important.stuff;
+-----+---------------------+---------------------+
| id  | details             | happened            |
+-----+---------------------+---------------------+
| 412 | Now I am the master | 2013-04-12 19:29:33 |
+-----+---------------------+---------------------+
1 row in set (0.00 sec)

mysql> 
After this step is completed:
  1. On slave_b, INSERT a new record in the table important.stuff
  2. Within 1 second, that record is replicated to slave_a