HA Setup - (5) MHA: Errors

2024. 1. 19. 01:12 MySQL/Class


# This post collects the errors I ran into while testing after finishing the MHA setup.

$ su - mhauser
$ /usr/local/bin/masterha_check_repl --conf=/etc/mha.cnf

 

 

(1) Redundant argument in sprintf at NodeUtil.pm

# Fix

$ vi /usr/local/share/perl5/5.32/MHA/NodeUtil.pm

 - Add the lines marked "# added" below, and change the line marked "# changed".

sub parse_mysql_version($) {
  my $str = shift;
  ($str) = $str =~ m/^[^-]*/g;    # added: drop suffixes such as "-log"
  my $result = sprintf( '%03d%03d%03d', $str =~ m/(\d+)/g );
  return $result;
}

sub parse_mysql_major_version($) {
  my $str = shift;
  ($str) = $str =~ m/^[^-]*/g;    # added: drop suffixes such as "-log"
  # changed: pass only major and minor to the two-field format
  my $result = sprintf( '%03d%03d', ( $str =~ m/(\d+)/g )[ 0, 1 ] );
  return $result;
}
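
Perl prints "Redundant argument in sprintf" when the version string yields more numbers than the format has placeholders. To make that easier to see, here is a small Python sketch of the same parsing idea. It is only an illustration and not MHA code: the two parse functions mirror the names of the Perl subs, and count_numeric_fields is a hypothetical helper I added for the demonstration.

```python
import re


def count_numeric_fields(version: str) -> int:
    """Count the numbers the unpatched regex m/(\\d+)/g would hand to sprintf."""
    return len(re.findall(r"\d+", version))


def parse_mysql_version(version: str) -> str:
    """Build a zero-padded 9-digit key from major, minor and patch."""
    base = version.split("-", 1)[0]           # drop suffixes such as "-log"
    parts = [int(n) for n in re.findall(r"\d+", base)][:3]
    parts += [0] * (3 - len(parts))           # pad when a component is missing
    return "".join(f"{n:03d}" for n in parts)


def parse_mysql_major_version(version: str) -> str:
    """Build a zero-padded 6-digit key from major and minor only."""
    return parse_mysql_version(version)[:6]


if __name__ == "__main__":
    for v in ("8.0.33", "5.7.21-20-log"):
        print(v, count_numeric_fields(v), parse_mysql_version(v),
              parse_mysql_major_version(v))
```

A string like "5.7.21-20-log" contains four numbers, one more than the three-field format accepts, and "8.0.33" already has one more than the two-field major-version format accepts.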

 

(2) Failed to get master_ip_failover_script status with return code

# Fix

$ su -
$ vi /mha/scripts/master_ip_failover

Comment out the relevant lines in the script.

 

(3) replicates is not defined in the configuration file!

[root@test03 scripts]# /usr/local/bin/masterha_check_repl --conf=/etc/mha.cnf

Tue Jan 16 20:51:27 2024 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Tue Jan 16 20:51:27 2024 - [info] Reading application default configuration from /etc/mha.cnf..
Tue Jan 16 20:51:27 2024 - [info] Reading server configuration from /etc/mha.cnf..
Tue Jan 16 20:51:27 2024 - [info] MHA::MasterMonitor version 0.57.
Tue Jan 16 20:51:28 2024 - [error][/usr/local/share/perl5/5.32/MHA/ServerManager.pm, ln675] Master 172.16.173.130:3306 from which slave 172.16.173.134(172.16.173.134:3306) replicates is not defined in the configuration file!
Tue Jan 16 20:51:28 2024 - [error][/usr/local/share/perl5/5.32/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations.  at /usr/local/share/perl5/5.32/MHA/MasterMonitor.pm line 329.
Tue Jan 16 20:51:28 2024 - [error][/usr/local/share/perl5/5.32/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Tue Jan 16 20:51:28 2024 - [info] Got exit code 1 (Not master dead).

MySQL Replication Health is NOT OK!

# Fix

Check whether any server is missing from mha.cnf. In my case the error occurred because I had left out the master server.

Add the missing server.

[server1]
hostname=172.16.173.130
candidate_master=1

[server2]
hostname=172.16.173.132
candidate_master=1

[server3]
hostname=172.16.173.133
candidate_master=1

[server4]
hostname=172.16.173.134
candidate_master=1
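
If you want to check this without reading the file by eye, a minimal sketch along these lines may help. It assumes mha.cnf can be read by Python's configparser and that every server lives in a section whose name starts with "server" and has a hostname key, as in the config above. The function names and the SAMPLE text are my own for illustration.

```python
import configparser


def configured_hosts(cnf_text: str) -> set:
    """Collect the hostname of every [serverN] section in the config text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cnf_text)
    return {
        parser[section]["hostname"]
        for section in parser.sections()
        if section.startswith("server") and "hostname" in parser[section]
    }


def missing_hosts(cnf_text: str, replication_hosts: list) -> list:
    """Return the hosts that take part in replication but are not configured."""
    known = configured_hosts(cnf_text)
    return [host for host in replication_hosts if host not in known]


# Hypothetical config that forgot the master 172.16.173.130.
SAMPLE = """
[server2]
hostname=172.16.173.132
candidate_master=1

[server4]
hostname=172.16.173.134
candidate_master=1
"""

if __name__ == "__main__":
    print(missing_hosts(SAMPLE, ["172.16.173.130", "172.16.173.134"]))
```

The list of replication hosts has to come from you (for example, the master address reported on each replica); the sketch only compares it with the file.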

 

(4) Error happened on checking configurations. SSH Configuration Check Failed!

$ /usr/local/bin/masterha_check_repl --conf=/etc/mha.cnf

Tue Jan 16 20:54:15 2024 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Tue Jan 16 20:54:15 2024 - [info] Reading application default configuration from /etc/mha.cnf..
Tue Jan 16 20:54:15 2024 - [info] Reading server configuration from /etc/mha.cnf..
Tue Jan 16 20:54:15 2024 - [info] MHA::MasterMonitor version 0.57.
Tue Jan 16 20:54:16 2024 - [info] GTID failover mode = 0
Tue Jan 16 20:54:16 2024 - [info] Dead Servers:
Tue Jan 16 20:54:16 2024 - [info]   172.16.173.132(172.16.173.132:3306)
Tue Jan 16 20:54:16 2024 - [info]   172.16.173.133(172.16.173.133:3306)
Tue Jan 16 20:54:16 2024 - [info] Alive Servers:
Tue Jan 16 20:54:16 2024 - [info]   172.16.173.130(172.16.173.130:3306)
Tue Jan 16 20:54:16 2024 - [info]   172.16.173.134(172.16.173.134:3306)
Tue Jan 16 20:54:16 2024 - [info] Alive Slaves:
Tue Jan 16 20:54:16 2024 - [info]   172.16.173.134(172.16.173.134:3306)  Version=8.0.33 (oldest major version between slaves) log-bin:enabled
Tue Jan 16 20:54:16 2024 - [info]     Replicating from 172.16.173.130(172.16.173.130:3306)
Tue Jan 16 20:54:16 2024 - [info]     Primary candidate for the new Master (candidate_master is set)
Tue Jan 16 20:54:16 2024 - [info] Current Alive Master: 172.16.173.130(172.16.173.130:3306)
Tue Jan 16 20:54:16 2024 - [info] Checking slave configurations..
Tue Jan 16 20:54:16 2024 - [warning]  relay_log_purge=0 is not set on slave 172.16.173.134(172.16.173.134:3306).
Tue Jan 16 20:54:16 2024 - [info] Checking replication filtering settings..
Tue Jan 16 20:54:16 2024 - [info]  binlog_do_db= , binlog_ignore_db= 
Tue Jan 16 20:54:16 2024 - [info]  Replication filtering check ok.
Tue Jan 16 20:54:16 2024 - [info] GTID (with auto-pos) is not supported
Tue Jan 16 20:54:16 2024 - [info] Starting SSH connection tests..
Tue Jan 16 20:54:17 2024 - [error][/usr/local/share/perl5/5.32/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. SSH Configuration Check Failed!
 at /usr/local/share/perl5/5.32/MHA/MasterMonitor.pm line 373.
Tue Jan 16 20:54:17 2024 - [error][/usr/local/share/perl5/5.32/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Tue Jan 16 20:54:17 2024 - [info] Got exit code 1 (Not master dead).

MySQL Replication Health is NOT OK!

 

# Fix

Check that the log path in mha.cnf is set correctly.
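
A rough way to check the paths is sketched below. I am assuming here that the relevant settings are manager_workdir and manager_log in the [server default] section; which key actually caused the failure in your setup may differ. The function name missing_log_paths is hypothetical.

```python
import configparser
import os


def missing_log_paths(cnf_text: str) -> list:
    """Return (key, directory) pairs whose directory does not exist."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cnf_text)
    if not parser.has_section("server default"):
        return []
    defaults = parser["server default"]
    problems = []

    # manager_workdir is a directory, so it must exist as given.
    workdir = defaults.get("manager_workdir")
    if workdir and not os.path.isdir(workdir):
        problems.append(("manager_workdir", workdir))

    # manager_log is a file, so only its parent directory must exist.
    log_file = defaults.get("manager_log")
    if log_file:
        log_dir = os.path.dirname(log_file) or "."
        if not os.path.isdir(log_dir):
            problems.append(("manager_log", log_dir))
    return problems


if __name__ == "__main__":
    with open("/etc/mha.cnf") as handle:
        print(missing_log_paths(handle.read()))
```

An empty list only means the directories exist on the machine where the sketch runs; ownership and write permission for mhauser still need a separate look.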

 

(5) Server is dead, but must be alive! Check server settings.

$ /usr/local/bin/masterha_check_repl --conf=/etc/mha.cnf
Tue Jan 16 21:48:00 2024 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Tue Jan 16 21:48:00 2024 - [info] Reading application default configuration from /etc/mha.cnf..
Tue Jan 16 21:48:00 2024 - [info] Reading server configuration from /etc/mha.cnf..
Tue Jan 16 21:48:00 2024 - [info] MHA::MasterMonitor version 0.57.
Tue Jan 16 21:48:01 2024 - [info] GTID failover mode = 0
Tue Jan 16 21:48:01 2024 - [info] Dead Servers:
Tue Jan 16 21:48:01 2024 - [info]   172.16.173.132(172.16.173.132:3306)
Tue Jan 16 21:48:01 2024 - [info]   172.16.173.133(172.16.173.133:3306)
Tue Jan 16 21:48:01 2024 - [info] Alive Servers:
Tue Jan 16 21:48:01 2024 - [info]   172.16.173.130(172.16.173.130:3306)
Tue Jan 16 21:48:01 2024 - [info]   172.16.173.134(172.16.173.134:3306)
Tue Jan 16 21:48:01 2024 - [info] Alive Slaves:
Tue Jan 16 21:48:01 2024 - [info]   172.16.173.134(172.16.173.134:3306)  Version=8.0.33 (oldest major version between slaves) log-bin:enabled
Tue Jan 16 21:48:01 2024 - [info]     Replicating from 172.16.173.130(172.16.173.130:3306)
Tue Jan 16 21:48:01 2024 - [info]     Primary candidate for the new Master (candidate_master is set)
Tue Jan 16 21:48:01 2024 - [info] Current Alive Master: 172.16.173.130(172.16.173.130:3306)
Tue Jan 16 21:48:01 2024 - [info] Checking slave configurations..
Tue Jan 16 21:48:01 2024 - [warning]  relay_log_purge=0 is not set on slave 172.16.173.134(172.16.173.134:3306).
Tue Jan 16 21:48:01 2024 - [info] Checking replication filtering settings..
Tue Jan 16 21:48:01 2024 - [info]  binlog_do_db= , binlog_ignore_db= 
Tue Jan 16 21:48:01 2024 - [info]  Replication filtering check ok.
Tue Jan 16 21:48:01 2024 - [info] GTID (with auto-pos) is not supported
Tue Jan 16 21:48:01 2024 - [info] Starting SSH connection tests..
Tue Jan 16 21:48:02 2024 - [info] All SSH connection tests passed successfully.
Tue Jan 16 21:48:02 2024 - [info] Checking MHA Node version..
Tue Jan 16 21:48:02 2024 - [info]  Version check ok.
Tue Jan 16 21:48:02 2024 - [error][/usr/local/share/perl5/5.32/MHA/ServerManager.pm, ln492]  Server 172.16.173.132(172.16.173.132:3306) is dead, but must be alive! Check server settings.
Tue Jan 16 21:48:02 2024 - [error][/usr/local/share/perl5/5.32/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations.  at /usr/local/share/perl5/5.32/MHA/MasterMonitor.pm line 402.
Tue Jan 16 21:48:02 2024 - [error][/usr/local/share/perl5/5.32/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Tue Jan 16 21:48:02 2024 - [info] Got exit code 1 (Not master dead).

 

# Fix

Port 3306/tcp was not open in firewalld, so I added it and reloaded the firewall.

[root@centOS09-02 log]# firewall-cmd --list-all
public (active)
  target: default
  icmp-block-inversion: no
  interfaces: ens160
  sources: 
  services: cockpit dhcpv6-client ssh
  ports: 
  protocols: 
  forward: yes
  masquerade: no
  forward-ports: 
  source-ports: 
  icmp-blocks: 
  rich rules: 
[root@centOS09-02 log]# firewall-cmd --permanent --zone=public --add-port=3306/tcp
success
[root@centOS09-02 log]# firewall-cmd --reload
success
[root@centOS09-02 log]# firewall-cmd --list-all
public (active)
  target: default
  icmp-block-inversion: no
  interfaces: ens160
  sources: 
  services: cockpit dhcpv6-client ssh
  ports: 3306/tcp
  protocols: 
  forward: yes
  masquerade: no
  forward-ports: 
  source-ports: 
  icmp-blocks: 
  rich rules:
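
To confirm from the manager host that the port is now reachable, a plain TCP connect test is enough. This is a minimal sketch using only the standard library; port_is_open is a name I made up, and the address in the usage line is simply the server from the log above.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    print(port_is_open("172.16.173.132", 3306))
```

A True result only shows that something accepts connections on the port. It does not prove that the MHA user can log in to MySQL.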

 

(6) Binlog setting check failed

Tue Jan 16 22:36:03 2024 - [info] Starting SSH connection tests..
Tue Jan 16 22:36:06 2024 - [info] All SSH connection tests passed successfully.
Tue Jan 16 22:36:06 2024 - [info] Checking MHA Node version..
Tue Jan 16 22:36:06 2024 - [info]  Version check ok.
Tue Jan 16 22:36:06 2024 - [info] Checking SSH publickey authentication settings on the current master..
Tue Jan 16 22:36:07 2024 - [info] HealthCheck: SSH to 172.16.173.130 is reachable.
Tue Jan 16 22:36:07 2024 - [info] Master MHA Node version is 0.57.
Tue Jan 16 22:36:07 2024 - [info] Checking recovery script configurations on 172.16.173.130(172.16.173.130:3306)..
Tue Jan 16 22:36:07 2024 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/home/mhauser --output_file=/home/mhauer/save_binary_logs_test --manager_version=0.57 --start_file=binlog.000003 
Tue Jan 16 22:36:07 2024 - [info]   Connecting to mhauser@172.16.173.130(172.16.173.130:22).. 
Failed to save binary log: Binlog not found from /home/mhauser! If you got this error at MHA Manager, please set "master_binlog_dir=/path/to/binlog_directory_of_the_master" correctly in the MHA Manager's configuration file and try again.
 at /usr/local/bin/save_binary_logs line 123.
        eval {...} called at /usr/local/bin/save_binary_logs line 70
        main::main() called at /usr/local/bin/save_binary_logs line 66
Tue Jan 16 22:36:07 2024 - [error][/usr/local/share/perl5/5.32/MHA/MasterMonitor.pm, ln161] Binlog setting check failed!
Tue Jan 16 22:36:07 2024 - [error][/usr/local/share/perl5/5.32/MHA/MasterMonitor.pm, ln408] Master configuration failed.
Tue Jan 16 22:36:07 2024 - [error][/usr/local/share/perl5/5.32/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations.  at /usr/local/bin/masterha_check_repl line 48.
Tue Jan 16 22:36:07 2024 - [error][/usr/local/share/perl5/5.32/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Tue Jan 16 22:36:07 2024 - [info] Got exit code 1 (Not master dead).

MySQL Replication Health is NOT OK!

# Fix

Check that master_binlog_dir in mha.cnf points to the directory that actually holds the master's binary logs.
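
One quick way to confirm the directory is to list the binary log files in it on the master. The sketch below assumes the files are named like binlog.000003, as in the log above; find_binlogs and its basename parameter are hypothetical names for this example.

```python
import os
import re


def find_binlogs(binlog_dir: str, basename: str = "binlog") -> list:
    """Return the sorted binary log file names found in binlog_dir."""
    if not os.path.isdir(binlog_dir):
        return []
    # Match names such as binlog.000003 but not binlog.index.
    pattern = re.compile(re.escape(basename) + r"\.\d+")
    return sorted(
        name for name in os.listdir(binlog_dir) if pattern.fullmatch(name)
    )


if __name__ == "__main__":
    print(find_binlogs("/data/mysql_data"))
```

If the list comes back empty for the path set in master_binlog_dir, the setting most likely points at the wrong directory.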

 

(7) Slaves settings check failed

Tue Jan 16 22:40:02 2024 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/data/mysql_data --output_file=/home/mhauser/save_binary_logs_test --manager_version=0.57 --start_file=binlog.000003 
Tue Jan 16 22:40:02 2024 - [info]   Connecting to mhauser@172.16.173.130(172.16.173.130:22).. 
  Creating /home/mhauser if not exists..    ok.
  Checking output directory is accessible or not..
   ok.
  Binlog found at /data/mysql_data, up to binlog.000003
Tue Jan 16 22:40:03 2024 - [info] Binlog setting check done.
Tue Jan 16 22:40:03 2024 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Tue Jan 16 22:40:03 2024 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mhauser' --slave_host=172.16.173.132 --slave_ip=172.16.173.132 --slave_port=3306 --workdir=/home/mhauser --target_version=8.0.33 --manager_version=0.57 --relay_dir=/data/mysql_data --current_relay_log=relay.000002  --slave_pass=xxx
Tue Jan 16 22:40:03 2024 - [info]   Connecting to mhauser@172.16.173.132(172.16.173.132:22).. 
mysqlbinlog: error while loading shared libraries: libcrypto.so.1.1: cannot open shared object file: No such file or directory
mysqlbinlog version command failed with rc 127:0, please verify PATH, LD_LIBRARY_PATH, and client options
 at /usr/local/bin/apply_diff_relay_logs line 493.
Tue Jan 16 22:40:03 2024 - [error][/usr/local/share/perl5/5.32/MHA/MasterMonitor.pm, ln208] Slaves settings check failed!
Tue Jan 16 22:40:03 2024 - [error][/usr/local/share/perl5/5.32/MHA/MasterMonitor.pm, ln416] Slave configuration failed.
Tue Jan 16 22:40:03 2024 - [error][/usr/local/share/perl5/5.32/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations.  at /usr/local/bin/masterha_check_repl line 48.
Tue Jan 16 22:40:03 2024 - [error][/usr/local/share/perl5/5.32/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Tue Jan 16 22:40:03 2024 - [info] Got exit code 1 (Not master dead).

# Fix

Check that the symbolic links for mysql and mysqlbinlog are in place. In my case the error occurred because the links had not been created.

[root@centOS09-02 bin]# ll
total 44
-r-xr-xr-x. 1 root root 16381 Jan 13 23:46 apply_diff_relay_logs
-r-xr-xr-x. 1 root root  4807 Jan 13 23:46 filter_mysqlbinlog
lrwxrwxrwx. 1 root root    33 Jan 16 23:01 mysql -> /mysql/local/mysql_8033/bin/mysql
lrwxrwxrwx. 1 root root    39 Jan 16 23:01 mysqlbinlog -> /mysql/local/mysql_8033/bin/mysqlbinlog
-r-xr-xr-x. 1 root root  8261 Jan 13 23:46 purge_relay_logs
-r-xr-xr-x. 1 root root  7525 Jan 13 23:46 save_binary_logs
[root@mha01 mha]# ln -s /mysql/local/mysql_8033/bin/mysqlbinlog /usr/local/bin/mysqlbinlog
[root@mha01 mha]# ln -s /mysql/local/mysql_8033/bin/mysql /usr/local/bin/mysql
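
To see whether both client tools can be found on a given search path, something like the following sketch can be run on each node. missing_tools is a hypothetical helper; it relies on shutil.which from the standard library, and the path in the usage line is just the directory used above.

```python
import shutil


def missing_tools(tools=("mysql", "mysqlbinlog"), path=None) -> list:
    """Return the tools that cannot be found as executables on the search path.

    path=None means the current PATH environment variable is used.
    """
    return [tool for tool in tools if shutil.which(tool, path=path) is None]


if __name__ == "__main__":
    print(missing_tools(path="/usr/local/bin"))
```

Keep in mind that the PATH of an interactive shell can differ from the one a command gets over SSH, so it is worth passing the directory explicitly, as in the usage line.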