ha-cfg commands not working after slave fenced 7 years 4 months ago #1081

  • Andrew Foster
  • Topic Author
  • Posts: 15
Hi there,
We have set up a two-node pool with XenServer 7.0 and the recommended version of the noSAN auto-installer. Everything worked well until we started testing failure scenarios.

Unplugged management network from slave host. Slave fenced itself (POOL mode) and rebooted. VMs running on slave started up again automatically on master.

The slave host is now back up with network restored. DRBD did not start automatically. Slave does appear to have re-entered the pool automatically and logs look normal.

Running "ha-cfg get" on master now returns no output. Running it on slave is OK.

Any ideas what went wrong with DRBD, and how I can now return "ha-cfg get" functionality to my master?

This is the output of ha-cfg get on slave:
DISABLED_VAPPS=()
ENABLE_LOGGING=1
FENCE_ACTION=stop
FENCE_ENABLED=1
FENCE_FILE_LOC=/etc/ha-lizard/fence
FENCE_HA_ONFAIL=0
FENCE_HEURISTICS_IPS=172.16.14.165
FENCE_HOST_FORGET=0
FENCE_IPADDRESS=
FENCE_METHOD=POOL
FENCE_MIN_HOSTS=2
FENCE_PASSWD=
FENCE_QUORUM_REQUIRED=1
FENCE_REBOOT_LONE_HOST=0
FENCE_USE_IP_HEURISTICS=1
GLOBAL_VM_HA=0
HOST_SELECT_METHOD=0
MAIL_FROM=ha-lizard@removed.co.uk
MAIL_ON=1
MAIL_SUBJECT="SYSTEM_ALERT-FROM_HOST:$HOSTNAME"
MAIL_TO=removed@removed.co.uk
MGT_LINK_LOSS_TOLERANCE=5
MONITOR_DELAY=15
MONITOR_KILLALL=1
MONITOR_MAX_STARTS=20
MONITOR_SCANRATE=10
OP_MODE=2
PROMOTE_SLAVE=1
SLAVE_HA=1
SLAVE_VM_STAT=0
SMTP_PASS=""
SMTP_PORT="25"
SMTP_SERVER=msa.removed.co.uk
SMTP_USER=""
XAPI_COUNT=2
XAPI_DELAY=10
XC_FIELD_NAME='ha-lizard-enabled'
XE_TIMEOUT=10

ha-cfg commands not working after slave fenced 7 years 4 months ago #1083

  • Salvatore Costantino
  • Posts: 722
If DRBD did not auto start on the recovered slave, it could be that the iscsi-ha service is not running (it is responsible for starting DRBD and placing it in the correct role). Can you check "service iscsi-ha status" and make sure it is running?

Also, what version of ha-lizard and iscsi-ha do you have installed?

On the master which fails "ha-cfg get" - does it hang or does it immediately return empty results? Are you logged in as root on the master when actioning "ha-cfg get"?
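The hang-versus-empty question above can be answered mechanically. The following is a minimal sketch, not part of ha-lizard: `classify_cmd` is a hypothetical helper name, and it assumes the coreutils `timeout` command is available on the host. It runs a command under a time limit and reports whether it hung, returned nothing, or produced output.

```shell
# Hypothetical helper: run a command with a time limit and classify the result.
# Usage: classify_cmd SECONDS COMMAND [ARGS...]
classify_cmd() {
    local limit="$1"; shift
    local out status
    # Capture stdout and stderr together, so an error message counts as output.
    out="$(timeout "$limit" "$@" 2>&1)"
    status=$?
    if [ "$status" -eq 124 ]; then
        # coreutils timeout exits with 124 when the time limit is reached
        echo "hang"
    elif [ -z "$out" ]; then
        echo "empty (exit $status)"
    else
        echo "output (exit $status)"
    fi
}
```

On the master this might be called as `classify_cmd 30 ha-cfg get` (the 30 second limit is an arbitrary choice), and `id -u` prints 0 when the shell is running as root.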

ha-cfg commands not working after slave fenced 7 years 4 months ago #1087

  • Andrew Foster
  • Topic Author
  • Posts: 15
ha-lizard 2.1.2
iscsi-ha 2.1.1

[root@xen-01 ~]# service iscsi-ha status
iscsi-ha running: 3599                    [  OK  ]

ha-cfg functionality has now come back to life and I haven't seen the problem since, so let's put that aside for now.

I've tried the same test the other way around and unplugged the management network from the master. The slave did not promote itself to master, despite logs stating that it knew it had lost contact with the master and was allowed to promote itself.

Rebooted master and everything except DRBD came back. Now failed VMs are attempting to start on the master but can't because DRBD is not running and slave is still in secondary mode.

Is this software tested on XenServer 7.0?

ha-cfg commands not working after slave fenced 7 years 4 months ago #1089

  • Salvatore Costantino
  • Posts: 722
Your test scenarios have been extensively tested on XenServer 7. Our official release, which will be posted on our website for download, has a couple of small bug fixes that are not in the version you are testing.

You can try deleting this file (it contains a syntax error). Deleting it should improve recovery time.
/etc/systemd/system/tgtd.service.d/local.conf

Also, without logs it is hard to tell what is happening in your test.
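For the tgtd drop-in file mentioned above, a slightly more cautious variant of deleting it is to rename it, so the faulty contents remain available for inspection. This is only a sketch: `disable_dropin` is a hypothetical helper, and it relies on the fact that systemd reads only files ending in `.conf` from a `*.d` drop-in directory.

```shell
# Hypothetical helper: park a systemd drop-in instead of deleting it.
# The renamed copy no longer ends in ".conf", so systemd ignores it.
disable_dropin() {
    local path="$1"
    if [ ! -f "$path" ]; then
        echo "not found: $path" >&2
        return 1
    fi
    mv -- "$path" "$path.disabled"
}

# On the affected host (as root), one might then run:
#   disable_dropin /etc/systemd/system/tgtd.service.d/local.conf
#   systemctl daemon-reload   # make systemd re-read its unit files
```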

ha-cfg commands not working after slave fenced 7 years 4 months ago #1090

  • Andrew Foster
  • Topic Author
  • Posts: 15
I've updated to 2.1.3 for both components.

After removing management link from Master, failover did work correctly.

I brought everything back online, then put slave into maintenance mode before rebooting slave.

Now slave is back up again but DRBD is still not starting automatically.

iscsi-ha service is running.

I can bring up DRBD resource with "drbdadm up iscsi1"
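To record what state DRBD is in before and after a manual "drbdadm up", the status text can be condensed to one line per device. A minimal sketch follows, assuming a DRBD 8.x installation that exposes `/proc/drbd` in the usual `cs:`/`ro:`/`ds:` layout; `drbd_summary` is a hypothetical helper name, not part of DRBD or iscsi-ha.

```shell
# Hypothetical helper: summarise /proc/drbd-style text read from stdin.
# Prints "minor connection-state roles disk-states" for each device line;
# a field that is absent (e.g. on an unconfigured device) is shown as "-".
drbd_summary() {
    awk '/^ *[0-9]+: +cs:/ {
        minor = $1
        sub(/:$/, "", minor)
        cs = ro = ds = "-"
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^cs:/)      cs = substr($i, 4)
            else if ($i ~ /^ro:/) ro = substr($i, 4)
            else if ($i ~ /^ds:/) ds = substr($i, 4)
        }
        print minor, cs, ro, ds
    }'
}
```

Typical use would be `drbd_summary < /proc/drbd` on each host. The same information is available per resource from `drbdadm cstate iscsi1`, `drbdadm role iscsi1` and `drbdadm dstate iscsi1`.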

Which logs would you like to see?

ha-cfg commands not working after slave fenced 7 years 4 months ago #1091

  • Salvatore Costantino
  • Posts: 722
/var/log/user.log

You can trim the file down to the time window of the test. Also, let me know the approximate timestamp of the event.
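Trimming the log to the test window can be done with a short awk filter. This is a sketch under one assumption: that /var/log/user.log uses the traditional syslog timestamp ("Mon DD HH:MM:SS host ..."), which should be checked against the actual file. `trim_log` is a hypothetical helper name.

```shell
# Hypothetical helper: keep only syslog lines from one day within a time range.
# Usage: trim_log MONTH DAY FROM TO < logfile
#   MONTH is the three-letter month as logged (e.g. Oct),
#   FROM and TO are inclusive HH:MM:SS bounds.
trim_log() {
    # HH:MM:SS strings are fixed width, so plain string comparison orders them.
    awk -v mon="$1" -v day="$2" -v from="$3" -v to="$4" \
        '$1 == mon && $2 == day && $3 >= from && $3 <= to'
}
```

For example, `trim_log Oct 3 14:00:00 14:30:00 < /var/log/user.log > user-trimmed.log` would keep a half-hour window; the date and times here are placeholders for the actual test time.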