Slave Not Promoting On New Install 5 years 4 months ago #1704

I have the same problem...

Two-node pool deployed for R&D, set up as per the documentation. No problems encountered during deployment. iSCSI and DRBD are both working as expected.

Management interfaces are connected to the LAN. Replication interfaces connected via crossover cable.

When I pull the plug on the master to simulate a disaster scenario, the slave does not react or self-promote. HA Enabled just shows TIMEOUT.

At this point, if I run ha-cfg log, I get only the status screen.

ha-lizard Version: 2.1.4
Operating Mode: Mode [ 1 ] Managing Appliances
Host Role: slave
Pool UUID:
Host UUID:
Master UUID:
Daemon Status: ha-lizard is running [ OK ]
Watchdog Status: ha-lizard-watchdog is running [ OK ]
HA Enabled: TIMEOUT


Last edit: by Nathan Scannell.

Slave Not Promoting On New Install 5 years 4 months ago #1705

I just turned the master back on and checked the slave:

ha-lizard Version: 2.1.4
Operating Mode: Mode [ 2 ] Managing All VMs in Pool
Host Role: slave
Pool UUID: 0d97c8b2-2aae-c1a5-3903-9ffda8713d63
Host UUID:
Master UUID:
Daemon Status: ha-lizard is running [ OK ]
Watchdog Status: ha-lizard-watchdog is running [ OK ]
HA Enabled: true


Now ha-cfg log works, but I can see this in the log:


Nov 26 13:11:51 ks2 ha-lizard: master_ip: Pool Master IP Address = 192.168.69.91
Nov 26 13:11:51 ks2 ha-lizard: Validating master is still a master
Nov 26 13:11:51 ks2 ha-lizard: [ /etc/ha-lizard/scripts/timeout 1 /etc/ha-lizard/scripts/host_is_slave 192.168.69.91 ]
Nov 26 13:11:52 ks2 ha-lizard: This slave - ks2: selected as allowed to become master: setting ALLOW_PROMOTE_MASTER=1
Nov 26 13:11:52 ks2 ha-lizard: check_xapi: Pool Host 192.168.69.91 xapi status = 0
Nov 26 13:11:52 ks2 ha-lizard: Mail Spool Directory Found /dev/shm/ha-lizard-mail
Nov 26 13:11:52 ks2 ha-lizard: check_email_enabled: Email enabled for check_xapi
Nov 26 13:11:52 ks2 ha-lizard: email: Duplicate message - not sending. Content = check_xapi: Pool Host on Server: 192.168.69.91 not responding to HTTP - manual intervention may be required
Nov 26 13:11:52 ks2 ha-lizard: email: Message barred for 60 minutes
Nov 26 13:11:52 ks2 ha-lizard: Pool Master NOT OK - Checking if ha-lizard is enabled in latest state file
Nov 26 13:11:52 ks2 ha-lizard: Checking if ha-lizard is enabled
Nov 26 13:11:52 ks2 ha-lizard: Statefile /etc/ha-lizard/state/ha_lizard_enabled found: checking if ha-lizard is enabled
Nov 26 13:11:52 ks2 ha-lizard-ERROR-/etc/ha-lizard/init/ha-lizard.mon: /etc/ha-lizard/ha-lizard.sh: line 369: [: =: unary operator expected
Nov 26 13:11:52 ks2 ha-lizard: ha-lizard is disabled - exiting
Nov 26 13:11:55 ks2 ha-lizard: ha-lizard Watchdog: ha-lizard running - OK
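
For readers hitting the same log line: `[: =: unary operator expected` is the message bash prints when an unquoted variable expands to nothing inside `[ ... ]`, leaving the `=` without a left-hand operand. The test then fails, which would explain why the very next line reports ha-lizard as disabled. The sketch below only reproduces that failure mode. The variable name `state` and the value `true` are assumptions for illustration, not the actual contents of line 369 of ha-lizard.sh.

```shell
#!/bin/bash
# Illustration only: "state" stands in for whatever ha-lizard.sh reads
# from the state file (hypothetical name and value).

is_enabled_unquoted() {
    local state=$1
    # With an empty $state this expands to:  [ = true ]
    # The test errors out, so the else branch runs.
    if [ $state = true ]; then echo enabled; else echo disabled; fi
}

is_enabled_quoted() {
    local state=$1
    # Quoting keeps an empty value as an (empty) operand, so the
    # comparison is well-formed and simply evaluates to false.
    if [ "$state" = true ]; then echo enabled; else echo disabled; fi
}

# Example:
#   is_enabled_unquoted ""   # prints an error on stderr, then "disabled"
#   is_enabled_quoted ""     # prints "disabled" with no error
```

Both variants end up on the "disabled" branch when the value is empty, so the error is most likely a symptom of an empty or unreadable state value rather than the root cause.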


Slave Not Promoting On New Install 5 years 4 months ago #1706

Sorry Phillipe, don't mean to hijack your thread... B)


It looks like my slave became fenced? There was never any notice of this and ha-cfg looked normal... now I'm lost.


[root@ks1 ~]# ls -l /etc/ha-lizard/state/
total 0
Nov 26 11:27 autopromote_uuid
Nov 26 13:20 ha_lizard_enabled
Nov 26 13:20 host.3b96dc7d-9290-4a61-818c-d4f7d08790f8.vmlist.uuid_array
Nov 26 11:27 host.ea71227c-1037-4c4f-9e1d-358d5f70bdc1.fenced
Nov 26 13:20 host.ea71227c-1037-4c4f-9e1d-358d5f70bdc1.vmlist.uuid_array
Nov 26 13:20 host_ip_list
Nov 26 13:20 host_uuid_ip_list
Nov 26 13:20 local_host_uuid
Nov 26 13:20 master_uuid
Nov 26 13:20 pool_num_hosts
Nov 26 13:20 status_report


[root@ks2 ~]# ls -l /etc/ha-lizard/state/
total 0
Nov 25 21:18 autopromote_uuid
Nov 25 21:18 ha_lizard_enabled
Nov 25 21:18 local_host_uuid
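
The `host.<uuid>.fenced` entry in the ks1 listing is what the post reads as a fence marker. A small sketch for pulling the UUIDs out of such file names is shown here. It assumes only the naming pattern visible in the listing above, and the function name `list_fenced_hosts` is made up for this example.

```shell
#!/bin/bash
# Sketch: print the UUID of every "host.<uuid>.fenced" file in a directory.
# The default directory matches the path used in the listings above.
list_fenced_hosts() {
    local dir=${1:-/etc/ha-lizard/state} f name
    for f in "$dir"/host.*.fenced; do
        [ -e "$f" ] || continue      # glob matched nothing
        name=${f##*/}                # strip the directory part
        name=${name#host.}           # strip the "host." prefix
        echo "${name%.fenced}"       # strip the ".fenced" suffix
    done
}

# Example:
#   list_fenced_hosts                # uses /etc/ha-lizard/state
#   list_fenced_hosts /some/copy     # or any other directory
```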


Last edit: by Nathan Scannell.

Slave Not Promoting On New Install 5 years 4 months ago #1707

  • Salvatore Costantino
OK. This is starting to look like an issue with HTTP access between the hosts. When the hosts check each other's management interfaces, two validations are performed: HTTP and ping.

Can you reply with the version of xcp or xenserver you are testing with and also whether you can ping each of the management interfaces?

I'm guessing you are on 7.6 and something has changed in the latest release. If that is the case, we have a new release almost ready and can add a fix to address this issue.
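
To test the ping and HTTP validations by hand, something along these lines can be run from each host against the other's management IP. This is a rough sketch, not ha-lizard's own check: the function names are invented, and the URL probed (`http://<ip>/`) is a guess, since the thread does not say which port or path ha-lizard requests.

```shell
#!/bin/bash
# Hypothetical manual check; NOT taken from ha-lizard.

# One ICMP echo with a 1-second timeout (Linux ping syntax).
check_ping() { ping -c 1 -W 1 "$1" >/dev/null 2>&1; }

# Plain HTTP GET with a 2-second limit; the URL is an assumption.
check_http() { curl --silent --max-time 2 --output /dev/null "http://$1/"; }

# Report both results on one line for the given IP.
host_status() {
    local ip=$1 ping_ok=down http_ok=down
    check_ping "$ip" && ping_ok=up
    check_http "$ip" && http_ok=up
    echo "ping=$ping_ok http=$http_ok"
}

# Example:
#   host_status 192.168.69.92
```

A result of `ping=up http=down` would match the symptom suspected in this thread.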


Slave Not Promoting On New Install 5 years 4 months ago #1709

Using XCP-ng 7.6.0

Hosts have no problem pinging each other, so we might be seeing an HTTP problem with this flavour of XenServer.

What is the HTTP interaction for? What port?

I can confirm that port 80 is working for its own web page.


Slave Not Promoting On New Install 5 years 4 months ago #1710

FYI... I just uninstalled and re-installed on the same system, and the slave is getting fenced immediately after install.

This is one iteration of ha-cfg log on the master:



Nov 26 14:37:45 ks1 ha-lizard: This iteration is count 75
Nov 26 14:37:45 ks1 ha-lizard: Checking if this host is a Pool Master or Slave
Nov 26 14:37:45 ks1 ha-lizard: This host's pool status = master
Nov 26 14:37:45 ks1 ha-lizard: Checking if ha-lizard is enabled for this pool
Nov 26 14:37:45 ks1 ha-lizard: check_ha_enabled: Checking if ha-lizard is enabled for pool: 0d97c8b2-2aae-c1a5-3903-9ffda8713d63
Nov 26 14:37:45 ks1 ha-lizard: check_ha_enabled: ha-lizard is enabled
Nov 26 14:37:45 ks1 ha-lizard: check_ha_enabled: checking whether maintenance mode enabled
Nov 26 14:37:46 ks1 ha-lizard: ha-lizard is enabled
Nov 26 14:37:46 ks1 ha-lizard: check_xs_ha: Checking XenServer HA status
Nov 26 14:37:46 ks1 ha-lizard: update_global_conf_params: Successfully updated global pool configuration settings in /etc/ha-lizard/ha-lizard.pool.conf.
Nov 26 14:37:46 ks1 ha-lizard: update_global_conf_params: DISABLED_VAPPS=()#012ENABLE_LOGGING=1#012FENCE_ACTION=stop#012FENCE_ENABLED=1#012FENCE_FILE_LOC=/etc/ha-lizard/fence#012FENCE_HA_ONFAIL=0#012FENCE_HEURISTICS_IPS=192.168.69.254#012FENCE_HOST_FORGET=0#012FENCE_IPADDRESS=#012FENCE_METHOD=POOL#012FENCE_MIN_HOSTS=2#012FENCE_PASSWD=#012FENCE_QUORUM_REQUIRED=1#012FENCE_REBOOT_LONE_HOST=0#012FENCE_USE_IP_HEURISTICS=1#012GLOBAL_VM_HA=1#012HOST_SELECT_METHOD=0#012MAIL_FROM="root@localhost"#012MAIL_ON=1#012MAIL_SUBJECT="SYSTEM_ALERT-FROM_HOST:$HOSTNAME"#012MAIL_TO="root@localhost"#012MGT_LINK_LOSS_TOLERANCE=5#012MONITOR_DELAY=15#012MONITOR_KILLALL=1#012MONITOR_MAX_STARTS=20#012MONITOR_SCANRATE=10#012OP_MODE=2#012PROMOTE_SLAVE=1#012SLAVE_HA=1#012SLAVE_VM_STAT=0#012SMTP_PASS=""#012SMTP_PORT="25"#012SMTP_SERVER="127.0.0.1"#012SMTP_USER=""#012XAPI_COUNT=2#012XAPI_DELAY=10#012XC_FIELD_NAME='ha-lizard-enabled'#012XE_TIMEOUT=10
Nov 26 14:37:46 ks1 ha-lizard: check_master_mgt_link_state: Checking management interface link state
Nov 26 14:37:46 ks1 ha-lizard: check_master_mgt_link_state: Link State = [ true ] for management interface IP [ 192.168.69.91 ]
Nov 26 14:37:46 ks1 ha-lizard: check_master_mgt_link_state: Link [ xenbr0 ] state UP
Nov 26 14:37:46 ks1 ha-lizard: Master management link OK - checking prior link state
Nov 26 14:37:46 ks1 ha-lizard: This host detected as pool Master
Nov 26 14:37:46 ks1 ha-lizard: Found 2 hosts in pool
Nov 26 14:37:46 ks1 ha-lizard: validate_vm_ha_state: Validating VM HA-state
Nov 26 14:37:46 ks1 ha-lizard: validate_vm_ha_state: VM [ 37f79cd3-aeb5-217a-cfe2-d9b1ed86d72b ] state [ false ] = OK
Nov 26 14:37:46 ks1 ha-lizard: Calling function write_pool_state
Nov 26 14:37:46 ks1 ha-lizard: 3488 Calling function autoselect_slave
Nov 26 14:37:46 ks1 ha-lizard: 3493 Calling function check_slave_status
Nov 26 14:37:46 ks1 ha-lizard: 3493 check_master_mgt_link_state: Checking management interface link state
Nov 26 14:37:46 ks1 ha-lizard: 3488 autoselect_slave: This host UUID found: 3b96dc7d-9290-4a61-818c-d4f7d08790f8
Nov 26 14:37:46 ks1 ha-lizard: 3488 autoselect_slave: MASTER host UUID found: 3b96dc7d-9290-4a61-818c-d4f7d08790f8
Nov 26 14:37:46 ks1 ha-lizard: get_vms_on_host: Returned 37f79cd3-aeb5-217a-cfe2-d9b1ed86d72b
Nov 26 14:37:46 ks1 ha-lizard: 3488 autoselect_slave: 3b96dc7d-9290-4a61-818c-d4f7d08790f8 is Master UUID - excluding from list of available slaves
Nov 26 14:37:46 ks1 ha-lizard: 3493 check_master_mgt_link_state: Link State = [ true ] for management interface IP [ 192.168.69.91 ]
Nov 26 14:37:46 ks1 ha-lizard: 3493 check_master_mgt_link_state: Link [ xenbr0 ] state UP
Nov 26 14:37:46 ks1 ha-lizard: 3493 check_slave_status: Management link OK - continue
Nov 26 14:37:46 ks1 ha-lizard: get_vms_on_host: No VMs found on host: ea71227c-1037-4c4f-9e1d-358d5f70bdc1
Nov 26 14:37:46 ks1 ha-lizard: 3488 autoselect_slave: 1 available Slave UUIDs found: ea71227c-1037-4c4f-9e1d-358d5f70bdc1
Nov 26 14:37:46 ks1 ha-lizard: 3493 get_pool_host_list: returned 3b96dc7d-9290-4a61-818c-d4f7d08790f8#012ea71227c-1037-4c4f-9e1d-358d5f70bdc1
Nov 26 14:37:46 ks1 ha-lizard: 3493 check_slave_status: Removing Master UUID from list of Hosts
Nov 26 14:37:46 ks1 ha-lizard: 3488 autoselect_slave: Selected Slave: ea71227c-1037-4c4f-9e1d-358d5f70bdc1 = Current slave: ea71227c-1037-4c4f-9e1d-358d5f70bdc1 - ignoring update
Nov 26 14:37:46 ks1 ha-lizard: 3493 get_pool_ip_list: returned 192.168.69.92
Nov 26 14:37:46 ks1 ha-lizard: check_ha_enabled: Checking if ha-lizard is enabled for pool: 0d97c8b2-2aae-c1a5-3903-9ffda8713d63
Nov 26 14:37:46 ks1 ha-lizard: check_ha_enabled: ha-lizard is enabled
Nov 26 14:37:46 ks1 ha-lizard: check_ha_enabled: checking whether maintenance mode enabled
Nov 26 14:37:46 ks1 ha-lizard: 3493 check_xapi: Pool Host 192.168.69.92 xapi status = 0
Nov 26 14:37:46 ks1 ha-lizard: 3493 Mail Spool Directory Found /dev/shm/ha-lizard-mail
Nov 26 14:37:46 ks1 ha-lizard: 3493 check_email_enabled: Email enabled for check_xapi
Nov 26 14:37:46 ks1 ha-lizard: 3493 email: Duplicate message - not sending. Content = check_xapi: Pool Host on Server: 192.168.69.92 not responding to HTTP - manual intervention may be required
Nov 26 14:37:46 ks1 ha-lizard: 3493 email: Message barred for 60 minutes
Nov 26 14:37:46 ks1 ha-lizard: 3493 check_slave_status: Slave host [ ea71227c-1037-4c4f-9e1d-358d5f70bdc1 ] health status = [ failed ] - break
Nov 26 14:37:46 ks1 ha-lizard: 3493 check_slave_status: Host IP Address check Status Array for Slaves = (0)
Nov 26 14:37:46 ks1 ha-lizard: 3493 check_slave_status: Quorum check called
Nov 26 14:37:46 ks1 ha-lizard: 3493 check_quorum: Checking host IPs: 192.168.69.91 192.168.69.92
Nov 26 14:37:46 ks1 ha-lizard: 3493 check_quorum: Host IP: 192.168.69.91 Response = OK
Nov 26 14:37:46 ks1 ha-lizard: 3493 check_quorum: LIVE HOSTs = 1
Nov 26 14:37:46 ks1 ha-lizard: 3493 check_quorum: Host IP: 192.168.69.92 Response = OK
Nov 26 14:37:46 ks1 ha-lizard: 3493 check_quorum: LIVE HOSTs = 2
Nov 26 14:37:46 ks1 ha-lizard: 3493 check_quorum: Using network points: 192.168.69.254 as possible additional vote
Nov 26 14:37:46 ks1 ha-lizard: 3493 check_quorum: Heuristic IP: 192.168.69.254 Response = OK
Nov 26 14:37:46 ks1 ha-lizard: 3493 check_quorum: Successful Replies = 1
Nov 26 14:37:46 ks1 ha-lizard: 3493 Total enpoints checked = 1 with total successful replies = 1
Nov 26 14:37:46 ks1 ha-lizard: 3493 check_quorum: Additional heuristic vote success. Incremeting vote by 1
Nov 26 14:37:46 ks1 ha-lizard: 3493 check_quorum: Minimum number of hosts needed to allow fencing = 1 + 1
Nov 26 14:37:46 ks1 ha-lizard: 3493 check_quorum: 3 Hosts found. Minimum needed = 1 + 1. Fencing allowed
Nov 26 14:37:46 ks1 ha-lizard: get_pool_host_list: enabled flag set - returning only hosts with enabled=true
Nov 26 14:37:46 ks1 ha-lizard: 3493 check_slave_status: Failed slave count = 1
Nov 26 14:37:46 ks1 ha-lizard: 3493 check_slave_status: Processing failed slave: ea71227c-1037-4c4f-9e1d-358d5f70bdc1 on this iteration
Nov 26 14:37:46 ks1 ha-lizard: 3493 Mail Spool Directory Found /dev/shm/ha-lizard-mail
Nov 26 14:37:46 ks1 ha-lizard: 3493 check_email_enabled: Email enabled for check_slave_status
Nov 26 14:37:47 ks1 ha-lizard: get_pool_host_list: returned 3b96dc7d-9290-4a61-818c-d4f7d08790f8#012ea71227c-1037-4c4f-9e1d-358d5f70bdc1
Nov 26 14:37:47 ks1 ha-lizard: 3493 email: Duplicate message - not sending. Content = check_slave_status: Server ks1: Some Pool Slaves not not responding , ea71227c-1037-4c4f-9e1d-358d5f70bdc1
Nov 26 14:37:47 ks1 ha-lizard: 3493 email: Message barred for 60 minutes
Nov 26 14:37:47 ks1 ha-lizard: 3493 check_slave_status: Some Pool Slaves not not responding , ea71227c-1037-4c4f-9e1d-358d5f70bdc1
Nov 26 14:37:47 ks1 ha-lizard: 3493 check_slave_status: Calling function get_vms_on_host for UUID(s) ea71227c-1037-4c4f-9e1d-358d5f70bdc1
Nov 26 14:37:47 ks1 ha-lizard: 3493 check_slave_status: Calling function fence_host to remove unresponsive host from pool. Failed Host(s) = ea71227c-1037-4c4f-9e1d-358d5f70bdc1
Nov 26 14:37:47 ks1 ha-lizard: 3493 check_slave_status: fence_host ea71227c-1037-4c4f-9e1d-358d5f70bdc1 executed on prior iteration - host already fenced
Nov 26 14:37:47 ks1 ha-lizard: 3493 Function check_slave_status Host Power = Off, calling vm_mon
Nov 26 14:37:47 ks1 ha-lizard: 3493 vm_mon: ha-lizard is operating mode 2 - managing pool VMs
Nov 26 14:37:47 ks1 ha-lizard: get_pool_ip_list: returned 192.168.69.91
Nov 26 14:37:47 ks1 ha-lizard: 3493 vm_mon: Retrived list of VMs for this poll: 37f79cd3-aeb5-217a-cfe2-d9b1ed86d72b
Nov 26 14:37:47 ks1 ha-lizard: 3493 vm_mon: Removing Control Domains from VM list
Nov 26 14:37:47 ks1 ha-lizard: 3493 vm_mon: VM list returned = 37f79cd3-aeb5-217a-cfe2-d9b1ed86d72b
Nov 26 14:37:47 ks1 ha-lizard: get_pool_ip_list: returned 192.168.69.91 192.168.69.92
Nov 26 14:37:47 ks1 ha-lizard: write_status_report: Writing status report
Nov 26 14:37:47 ks1 ha-lizard: 3493 vm_state: Machine state for 37f79cd3-aeb5-217a-cfe2-d9b1ed86d72b returned: running
Nov 26 14:37:47 ks1 ha-lizard: 3493 vm_mon: VM 37f79cd3-aeb5-217a-cfe2-d9b1ed86d72b state = running
Nov 26 14:37:47 ks1 ha-lizard: 3493 vm_mon: 0 Eligible Halted VMs found
Nov 26 14:37:52 ks1 ha-lizard: ha-lizard Watchdog: ha-lizard running - OK
Nov 26 14:38:00 ks1 ha-lizard: 3306 Spawning new instance of ha-lizard
Nov 26 14:38:00 ks1 ha-lizard: Mail Spool Directory Found /dev/shm/ha-lizard-mail
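
Reading the check_quorum lines in this iteration: both host IPs answer (LIVE HOSTs = 2), the heuristic IP adds one vote (3 in total), and the minimum is logged as "1 + 1". So quorum passes and fencing is allowed, even though the slave is only failing the HTTP check and is still reachable. The arithmetic can be sketched as below. Treating "1 + 1" as (pool hosts / 2) + 1 is an assumption drawn from this one log, and `fencing_allowed` is a made-up name, not an ha-lizard function.

```shell
#!/bin/bash
# Sketch of the vote arithmetic suggested by the check_quorum log lines.
# Assumption: the logged minimum "1 + 1" is (pool hosts / 2) + 1.
fencing_allowed() {
    local pool_hosts=$1     # hosts in the pool (2 in this thread)
    local live_hosts=$2     # host IPs that responded
    local heuristic_ok=$3   # 1 if the heuristic IP responded, else 0
    local votes=$(( live_hosts + heuristic_ok ))
    local needed=$(( pool_hosts / 2 + 1 ))
    echo "votes=$votes needed=$needed"
    # Exit status 0 means fencing would be allowed.
    [ "$votes" -ge "$needed" ]
}

# Example with the values from the log above:
#   fencing_allowed 2 2 1
```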
