搭建MySQL高可用架构MHA v1.0
MHA简介
MHA的主要目的是自动化master故障转移和slave自动提升为master,在较短时间(一般为10-30秒)的停机时间,可以避免复制和一致性问题,不用花钱购买新的服务器,没有性能上的损失,不复杂的(容易安装),并且不用改变现有的部署。
MHA还提供了一个在线预定master开关的方式:改变当前正在运行的master安全的转移到到新的mater,在几秒钟内的停机时间(0.5-2秒)(只支持块写入操作)。
MHA提供了以下功能,并且可以满足许多部署的要求,如高可用性,数据完整性,不中断master的维护
1. 准备测试环境
Cloud1 (192.168.100.133) Cloud2(192.168.100.216) Cloud3(192.168.100.14)
Cloud2 和Cloud3为MHA node Cloud1为MHA Manager
2. 建立3台测试机的信任登录关系
#cloud1(MHA manger) ssh-keygen ssh-copy-id -i ~/.ssh/id_rsa.pub cloud1 ssh-copy-id -i ~/.ssh/id_rsa.pub cloud2 ssh-copy-id -i ~/.ssh/id_rsa.pub cloud3 #cloud2 ssh-keygen ssh-copy-id -i ~/.ssh/id_rsa.pub cloud1 ssh-copy-id -i ~/.ssh/id_rsa.pub cloud2 #cloud3 ssh-keygen ssh-copy-id -i ~/.ssh/id_rsa.pub cloud1 ssh-copy-id -i ~/.ssh/id_rsa.pub cloud2 ssh-copy-id -i ~/.ssh/id_rsa.pub cloud3 |
3. 下载MHA并安装,mha4mysql-node包在所有机器上都要安装,mha4mysql-manager包只需要在管理节点上安装
#MHA node(cloud1 cloud2 cloud3) rpm -ivh http://dl.fedoraproject.org/pub/epel/5/i386/epel-release-5-4.noarch.rpm wget http://mysql-master-ha.googlecode.com/files/mha4mysql-node-0.53-0.noarch.rpm rpm -ivh mha4mysql-node-0.53-0.noarch.rpm
#MHA manager(cloud1) yum -y install perl-Config-Tiny perl-Params-Validate\ perl-Log-Dispatch perl-Parallel-ForkManager perl-Config-IniFiles
rpm -ivh mha4mysql-manager-0.53-0.noarch.rpm |
4. 配置MHA 配置文件
#MHA manager(cloud1) mkdir /etc/masterha mkdir -p /masterha/app1 vi /etc/masterha/app1.cnf
[server default] user=mhauser password=mhauser123 manager_workdir=/masterha/app1 manager_log=/masterha/app1/manager.log remote_workdir=/masterha/app1 ssh_user=root repl_user=rep repl_password=rep123 ping_interval=1
[server1] hostname=192.168.100.133 ssh_port=9999 master_binlog_dir=/data no_master=1
[server2] hostname=192.168.100.216 ssh_port=9999 master_binlog_dir=/data
[server3] hostname=192.168.100.14 ssh_port=9999 master_binlog_dir=/data candidate_master=1
|
5. 验证ssh 信任登录是否成功
masterha_check_ssh --conf=/etc/masterha/app1.cnf
Tue Jan 15 15:36:38 2013 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping. Tue Jan 15 15:36:38 2013 - [info] Reading application default configurations from /etc/masterha/app1.cnf.. Tue Jan 15 15:36:38 2013 - [info] Reading server configurations from /etc/masterha/app1.cnf.. Tue Jan 15 15:36:38 2013 - [info] Starting SSH connection tests.. …. Tue Jan 15 15:36:39 2013 - [debug] ok. Tue Jan 15 15:36:39 2013 - [info] All SSH connection tests passed 看到successfully.表明ssh 测试成功 |
6. 首先部署3台mysql ,建立测试库和表,并建立起主从复制
#cloud3 为master,cloud2为slave1,cloud3为slave2 #yum 安装mysql yum –y install mysql
#设置/etc/my.cnf [client] port = 3306 socket = /var/lib/mysql/mysql.sock [mysqld] port = 3306 socket = /var/lib/mysql/mysql.sock datadir = /data skip-locking key_buffer = 16M max_allowed_packet = 1M table_cache = 64 sort_buffer_size = 512K net_buffer_length = 8K read_buffer_size = 256K read_rnd_buffer_size = 512K myisam_sort_buffer_size = 8M skip-federated log-bin=mysql-bin server-id = 1 #master id=1 ;slave 1 id =2; slave 2 id =3 [mysqldump] quick max_allowed_packet = 16M [mysql] no-auto-rehash [isamchk] key_buffer = 20M sort_buffer_size = 20M read_buffer = 2M write_buffer = 2M [myisamchk] key_buffer = 20M sort_buffer_size = 20M read_buffer = 2M write_buffer = 2M [mysqlhotcopy] interactive-timeout
#建立测试库dbtest create database dbtest; #建立测试表tb1 use dbtest;
CREATE TABLE `tb1` ( `id` int(11) NOT NULL auto_increment, `name` varchar(32) default NULL, PRIMARY KEY (`id`) );
#设置的复制权限帐号 GRANT REPLICATION SLAVE,REPLICATION CLIENT ON *.* TO 'rep'@'%' IDENTIFIED BY 'rep123'; FLUSH PRIVILEGES; #建立主从复制关系 reset master; show master status;
master_host='192.168.100.14', master_port=3306, master_user='rep', master_password='rep123', master_log_file='mysql-bin.000001', master_log_pos=98;
slave start; show slave status\G; show full processlists; #在master插入数据查看binlog是否同步 insert into tb1(name) values ('123');
#建立mha使用的帐号并设置权限 grant all on *.* to mhauser@'cloud2' identified by 'mhauser123'; grant all on *.* to mhauser@'cloud3' identified by 'mhauser123';
|
7. 验证mysql复制是否成功
masterha_check_repl --conf=/etc/masterha/app1.cnf Tue Jan 15 16:15:22 2013 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping. Tue Jan 15 16:15:22 2013 - [info] Reading application default configurations from /etc/masterha/app1.cnf.. Tue Jan 15 16:15:22 2013 - [info] Reading server configurations from /etc/masterha/app1.cnf.. Tue Jan 15 16:15:22 2013 - [info] MHA::MasterMonitor version 0.53. Tue Jan 15 16:15:22 2013 - [info] Dead Servers: … MySQL Replication Health is OK. 看到is OK,表面mysql复制成功 |
8. 启动并验证MHA manager
#启动MHA manager
#验证MHA状态 masterha_check_status --conf=/etc/masterha/app1.cnf app1 (pid:5605) is running(0:PING_OK), master:192.168.100.14
#关闭MHA manager
|
9. 测试master宕机后是否自动切换
#MHA node(cloud3) /etc/init.d/mysqld stop #MHA manager(cloud1) tail -f /masterha/app1/manager.log … ----- Failover Report ----- app1: MySQL Master failover 192.168.100.14 to 192.168.100.216 succeeded Master 192.168.100.14 is down! Check MHA Manager logs at cloud1:/masterha/app1/manager.log for details. Started automated(non-interactive) failover. The latest slave 192.168.100.133(192.168.100.133:3306) has all relay logs for recovery. Selected 192.168.100.216 as a new master. 192.168.100.216: OK: Applying all logs succeeded. 192.168.100.133: This host has the latest relay log events. Generating relay diff files from the latest slave succeeded. 192.168.100.133: OK: Applying all logs succeeded. Slave started, replicating from 192.168.100.216. 192.168.100.216: Resetting slave info succeeded. Master failover to 192.168.100.216(192.168.100.216:3306) completed successfully. 看到successfully.表明切换成功 |
10. 恢复原来的主数据库为master
#MHA node(cloud3) #建立到新master (cloud2)的同步关系 change master to master_host='192.168.100.216', master_port=3306, master_user='rep', master_password='rep123', master_log_file='mysql-bin.000002', master_log_pos=98;
#MHA manager(cloud1) ps auxf|grep masterha_manager|grep -v grep|awk '{print $2}'|xargs kill masterha_master_switch --master_state=alive --conf=/etc/masterha/app1.cnf … It is better to execute FLUSH NO_WRITE_TO_BINLOG TABLES on the master before switching. Is it ok to execute on 192.168.100.216(192.168.100.216:3306)? (YES/no):yes … Starting master switch from 192.168.100.216(192.168.100.216:3306) to 192.168.100.14(192.168.100.14:3306)? (yes/NO): yes … Wed Jan 16 13:58:10 2013 - [info] 192.168.100.14: Resetting slave info succeeded. Wed Jan 16 13:58:10 2013 - [info] Switching master to 192.168.100.14(192.168.100.14:3306) completed successfully.
#就此最简单的master 高可用测试成功 |