TOPIC:

How to properly change the pool master ? 8 years 9 months ago #461

Hi,
I installed ha-lizard with iscsi-ha in the default configuration on two computers, each with a single SATA drive.
Host list:
xenserver2 master
xenserver1 slave
Everything works well.

I need to promote xenserver1 from slave to master.
I ran:
ha-cfg status disable yes
xe host-list
xe pool-designate-new-master host-uuid=eaaab0bc-2d57... (xenserver1).
ha-cfg status enable yes

After that, XenCenter shows xenserver1 as the pool master,
but both hosts report:
Scanning for Volume Group -> No Volume Group found
and fdisk -l lists only /dev/sda1.
The sdb disk has disappeared.

Running /etc/iscsi-ha/scripts/replug_pbd did not help.

After rebooting both hosts, everything works again.

How do I properly change the pool master without rebooting?

How to properly change the pool master ? 8 years 9 months ago #462

  • Salvatore Costantino
Your procedure looks correct. To be sure, we tested it on our development environment and successfully switched roles without any issues.

Before re-enabling HA, we first reconnected to the pool with XenCenter and waited for the transition to complete.
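
The "wait for the transition to complete" step can also be scripted. The following is only a rough sketch, not something taken from ha-lizard itself: the function name wait_for_master and its timeout defaults are made up for illustration. It assumes that xe pool-list params=master --minimal prints the UUID of the current pool master.

```shell
# Sketch only: poll xapi until it reports the expected pool master.
# Usage: wait_for_master <new-master-uuid> [timeout-seconds] [poll-interval-seconds]
# The poll interval must be at least 1, otherwise the loop never times out.
wait_for_master() {
    local want="$1" timeout="${2:-300}" interval="${3:-5}" waited=0 current
    while [ "$waited" -lt "$timeout" ]; do
        # xapi can be briefly unreachable during the switch, so errors are ignored.
        current="$(xe pool-list params=master --minimal 2>/dev/null)"
        if [ "$current" = "$want" ]; then
            return 0
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    # Timed out: the pool never reported the requested master.
    return 1
}
```

The idea would be to call it right after xe pool-designate-new-master host-uuid=<uuid> and to re-enable HA only once it returns successfully. A matching master UUID does not by itself prove that storage has settled on both hosts, so checking the SR afterwards is still worthwhile.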

How to properly change the pool master ? 8 years 9 months ago #487

Hi, I am back to testing.
I added secondary disks to both hosts and reinstalled XenServer and ha-lizard.

Test 1:
xenserver1 is master
xenserver2 is slave
xe pool-designate-new-master host-uuid=3655d4fb..(xenserver2)
After 120 s everything works fine! :)

After 10 minutes Test 2:
xenserver1 is slave
xenserver2 is master
xe pool-designate-new-master host-uuid=b02e355d..(xenserver1)

This time it does not work :(
[root@xenserver1 ~]# pvdisplay
File descriptor 3 (/dev/tty) leaked on pvdisplay invocation. Parent PID 10745: bash
File descriptor 7 (pipe:[33485]) leaked on pvdisplay invocation. Parent PID 10745: bash

/var/log/messages
Jul 24 15:19:24 xenserver1 iscsi-ha-NOTICE-/etc/iscsi-ha/iscsi-ha.sh:   No volume groups found
Jul 24 15:19:24 xenserver1 iscsi-ha-18981-INFO-/etc/iscsi-ha/iscsi-ha.sh: Scanning for Volume Group -> iscsi-sr: 2b1f74e3-75ff-5c4d-ce6c-a93bb03d83af
Jul 24 15:19:24 xenserver1 iscsi-ha: 18998 DRBD Running on this host: version: 8.4.3 (api:1/proto:86-101) srcversion: 19422058F8A2D4AC0C8EF09 1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----- ns:2272744 nr:316 dw:2273060 dr:243692 al:250 bm:1 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
Jul 24 15:19:24 xenserver1 iscsi-ha: 18998 check_drbd_resource_state: DRBD Resource: iscsi1 in Primary mode
Jul 24 15:19:24 xenserver1 iscsi-ha: 18998 DRBD Resource: iscsi1 in Connected state
Jul 24 15:19:24 xenserver1 iscsi-ha: 18998 iSCSI target: /etc/init.d/tgtd status = OK. [tgtd (pid 11472 11470) is running...]
Jul 24 15:19:24 xenserver1 iscsi-ha: 18998 local_ip_list: Local IP list returned 127.0.0.1  192.168.26.31  10.10.10.1 10.10.10.3
Jul 24 15:19:24 xenserver1 iscsi-ha: 18998 Virtual IP: 10.10.10.3 discovered on host xenserver1
Jul 24 15:19:25 xenserver1 kernel: [ 5120.253433]  connection1:0: detected conn error (1020)
Jul 24 15:19:25 xenserver1 iscsi-ha: 3963 iscsi-ha Watchdog: iscsi-ha running - OK
Jul 24 15:19:25 xenserver1 ha-lizard: 3817 ha-lizard Watchdog: ha-lizard running - OK
Jul 24 15:19:25 xenserver1 tgtd: conn_close(101) connection closed, 0x15e3218 1
Jul 24 15:19:25 xenserver1 tapdisk[9222]: ERROR: errno -5 at vhd_complete: /dev/VG_XenStorage-2b1f74e3-75ff-5c4d-ce6c-a93bb03d83af/VHD-cd813361-15f5-4e43-b7f2-5906929b728e: op: 2, lsec: 2048, secs: 8, nbytes: 4096, blk: 0, blk_offset: 12559
Jul 24 15:19:25 xenserver1 tapdisk[9222]: ERROR: errno -5 at vhd_complete: /dev/VG_XenStorage-2b1f74e3-75ff-5c4d-ce6c-a93bb03d83af/VHD-cd813361-15f5-4e43-b7f2-5906929b728e: op: 2, lsec: 4458496, secs: 8, nbytes: 4096, blk: 1088, blk_offset: 33079
Jul 24 15:19:25 xenserver1 kernel: [ 5120.899088] sd 6:0:0:10: rejecting I/O to offline device
Jul 24 15:19:25 xenserver1 kernel: [ 5120.899117] sd 6:0:0:10: rejecting I/O to offline device
Jul 24 15:19:25 xenserver1 iscsid: conn 0 login rejected: initiator error - target not found (02/03)
Jul 24 15:19:26 xenserver1 tapdisk[9222]: tap-err:/dev/VG_XenStorage-2b1f74e3-75ff-5c4d-ce6c-a93bb03d83af/VHD-cd813361-15f5-4e43-b7f2-5906929b728e: vhd_complete: too many errors, dropped.
Jul 24 15:19:26 xenserver1 kernel: [ 5121.900302] sd 6:0:0:10: rejecting I/O to offline device
Jul 24 15:19:26 xenserver1 kernel: [ 5121.900322] sd 6:0:0:10: rejecting I/O to offline device
Jul 24 15:19:27 xenserver1 tgtd: conn_close(101) connection closed, 0x15e3218 1
Jul 24 15:19:27 xenserver1 kernel: [ 5122.901450] sd 6:0:0:10: rejecting I/O to offline device
Jul 24 15:19:27 xenserver1 kernel: [ 5122.901464] sd 6:0:0:10: rejecting I/O to offline device
Jul 24 15:19:28 xenserver1 kernel: [ 5123.257473]  connection1:0: detected conn error (1020)
Jul 24 15:19:28 xenserver1 tgtd: conn_close(101) connection closed, 0x15e3218 1
Jul 24 15:19:28 xenserver1 ha-lizard: 18734 Spawning new instance of ha-lizard
Jul 24 15:19:28 xenserver1 kernel: [ 5123.855337] sd 6:0:0:10: rejecting I/O to offline device
Jul 24 15:19:28 xenserver1 kernel: [ 5123.855366] sd 6:0:0:10: rejecting I/O to offline device
Jul 24 15:19:28 xenserver1 ha-lizard: 19115 Mail Spool Directory Found /dev/shm/ha-lizard-mail
Jul 24 15:19:28 xenserver1 ha-lizard: 19115 This iteration is count 339
Jul 24 15:19:28 xenserver1 ha-lizard: 19115 Checking if this host is a Pool Master or Slave
Jul 24 15:19:28 xenserver1 ha-lizard: 19115 This host's pool status = master
Jul 24 15:19:28 xenserver1 ha-lizard: 19115 Checking if ha-lizard is enabled for this pool
Jul 24 15:19:28 xenserver1 xapi: [ info|xenserver1|750 UNIX /var/xapi/xapi|session.login_with_password D:8d34663555d7|xapi] Session.create trackid=29f28f5e79d6074fc66f6f29f344a376 pool=false uname=root originator=cli is_local_superuser=true auth_user_sid= parent=trackid=9834f5af41c964e225f24279aefe4e49
Jul 24 15:19:28 xenserver1 xapi: [ info|xenserver1|750 UNIX /var/xapi/xapi|session.logout D:90de6813c6ba|xapi] Session.destroy trackid=29f28f5e79d6074fc66f6f29f344a376
Jul 24 15:19:28 xenserver1 xapi: [ info|xenserver1|752 UNIX /var/xapi/xapi|session.login_with_password D:ce0f3488a7d9|xapi] Session.create trackid=900404fc44aa5c448e2e3b7b34bfb204 pool=false uname=root originator=cli is_local_superuser=true auth_user_sid= parent=trackid=9834f5af41c964e225f24279aefe4e49
Jul 24 15:19:28 xenserver1 xapi: [ info|xenserver1|752 UNIX /var/xapi/xapi|session.logout D:6d9cc791d933|xapi] Session.destroy trackid=900404fc44aa5c448e2e3b7b34bfb204
Jul 24 15:19:28 xenserver1 ha-lizard: 19115 check_ha_enabled: Checking if ha-lizard is enabled for pool: 294f0840-9541-cfff-dd72-4c115f8e0a02
Jul 24 15:19:28 xenserver1 ha-lizard: 19115 check_ha_enabled: ha-lizard is disabled
Jul 24 15:19:28 xenserver1 ha-lizard: 19115 ha-lizard is disabled
Jul 24 15:19:28 xenserver1 iscsid: conn 0 login rejected: initiator error - target not found (02/03)

Something is not working at the network layer.
How can I fix it without restarting the hosts?
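
For anyone comparing their own logs with the excerpt above: the lines that seem to matter are the iSCSI connection errors, the rejected logins, the offline-device messages and the tapdisk errors. A small helper can count them. This is only a sketch. The function name is made up, and the patterns are copied from this log excerpt, so other versions may word the messages differently.

```shell
# Sketch only: count iSCSI-related error lines in a syslog file.
# Usage: summarize_iscsi_errors [logfile]   (defaults to /var/log/messages)
summarize_iscsi_errors() {
    local logfile="${1:-/var/log/messages}" pattern
    for pattern in \
        "detected conn error" \
        "login rejected" \
        "rejecting I/O to offline device" \
        "vhd_complete"
    do
        # grep -c prints the number of matching lines (0 if there are none).
        printf '%s: %s\n' "$pattern" "$(grep -c -- "$pattern" "$logfile")"
    done
}
```

Running it before and after a master switch shows whether the counters jump at the moment the roles change.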

OK, I found it:
service tgtd restart
resolves the problem, and the iSCSI SR appears again:
[root@xenserver1 log]# pvdisplay
File descriptor 3 (/dev/tty) leaked on pvdisplay invocation. Parent PID 10745: bash
File descriptor 7 (pipe:[33485]) leaked on pvdisplay invocation. Parent PID 10745: bash
  --- Physical volume ---
  PV Name               /dev/sdc
  VG Name               VG_XenStorage-2b1f74e3-75ff-5c4d-ce6c-a93bb03d83af
  PV Size               232.83 GB / not usable 10.28 MB
  Allocatable           yes
  PE Size (KByte)       4096
  Total PE              59602
  Free PE               57547
  Allocated PE          2055
  PV UUID               U7eQnS-M1iG-FBAx-qs0M-R4X2-h7oD-0NAtjV

Unfortunately, I still have to restart the running VMs, because their file systems have gone read-only.

By the way, I also found some errors in the iscsi-ha scripts:
Jul 24 17:47:14 xenserver1 iscsi-ha-NOTICE-/etc/iscsi-ha/iscsi-ha.sh: /etc/iscsi-ha/iscsi-ha.func: line 283: [: -eq: unary operator expected
Jul 24 17:47:14 xenserver1 iscsi-ha: 19999 email Sending ALERT email to root@localhost: check_ip_health: 10.10.10.3 response = FAIL
Jul 24 17:47:14 xenserver1 iscsi-ha-NOTICE-/etc/iscsi-ha/iscsi-ha.sh: Traceback (most recent call last):
Jul 24 17:47:14 xenserver1 iscsi-ha-NOTICE-/etc/iscsi-ha/iscsi-ha.sh:   File "/etc/iscsi-ha/scripts/email_alert.py", line 114, in ?
Jul 24 17:47:14 xenserver1 iscsi-ha-19986-INFO-/etc/iscsi-ha/iscsi-ha.sh: Sending email from: root@localhost
Jul 24 17:47:14 xenserver1 iscsi-ha-NOTICE-/etc/iscsi-ha/iscsi-ha.sh:     message = smtplib.SMTP(smtp_server, smtp_port, hostname)
Jul 24 17:47:14 xenserver1 iscsi-ha-19986-INFO-/etc/iscsi-ha/iscsi-ha.sh: Sending email to: root@localhost
Jul 24 17:47:14 xenserver1 iscsi-ha-19986-INFO-/etc/iscsi-ha/iscsi-ha.sh: Email Alert Subject: HA-Lizard noSAN SYSTEM ALERT - FROM HOST: xenserver1
Jul 24 17:47:14 xenserver1 iscsi-ha-NOTICE-/etc/iscsi-ha/iscsi-ha.sh:   File "/usr/lib64/python2.4/smtplib.py", line 244, in __init__
Jul 24 17:47:14 xenserver1 iscsi-ha-NOTICE-/etc/iscsi-ha/iscsi-ha.sh:     (code, msg) = self.connect(host, port)
Jul 24 17:47:14 xenserver1 iscsi-ha-NOTICE-/etc/iscsi-ha/iscsi-ha.sh:   File "/usr/lib64/python2.4/smtplib.py", line 306, in connect
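
A note on the first message: "[: -eq: unary operator expected" is what bash prints when the left-hand side of a numeric test expands to nothing, usually because of an empty, unquoted variable. Line 283 of iscsi-ha.func is not shown in this thread, so the snippet below only illustrates that class of bug. The variable name response and the helper is_zero are hypothetical and are not taken from the iscsi-ha source.

```shell
# Hypothetical illustration; `response` is NOT a name taken from iscsi-ha.func.
response=""

# Broken form: with an empty, unquoted variable this expands to `[ -eq 0 ]`
# and bash reports "[: -eq: unary operator expected".
#   if [ $response -eq 0 ]; then ...

# Defensive form: quote the operand and fall back to a non-zero default,
# so an empty result counts as a failure instead of a syntax error.
is_zero() {
    [ "${1:-1}" -eq 0 ]
}

if is_zero "$response"; then
    echo "check passed"
else
    echo "check failed or returned nothing"
fi
```

With response empty, the defensive form takes the else branch without printing an error.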

Last edit: by Alfred.

How to properly change the pool master ? 8 years 8 months ago #489

  • Salvatore Costantino
It looks like the storage IP address may be getting dropped when transitioning the host roles.

When you performed your test, was iscsi-ha in manual mode?
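
One way to check whether the storage IP really disappears is to watch for it on the host while the roles change. The sketch below is an assumption-laden example, not part of iscsi-ha: the function name has_ip is made up, and 10.10.10.3 is simply the virtual IP that appears in the log posted earlier in this thread.

```shell
# Sketch only: succeed if the given IPv4 address is bound on this host.
# Usage: has_ip 10.10.10.3
has_ip() {
    local addr="$1"
    # `ip -o -4 addr show` prints one line per address, with the
    # address/prefix in the fourth field, for example (interface name
    # is hypothetical):  "3: xenbr1    inet 10.10.10.3/24 ..."
    ip -o -4 addr show | awk -v want="$addr" '
        { split($4, a, "/"); if (a[1] == want) found = 1 }
        END { exit found ? 0 : 1 }'
}
```

Something like `while sleep 1; do has_ip 10.10.10.3 && echo present || echo MISSING; done` in a second shell during the switch would show whether the address drops and for how long.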

How to properly change the pool master ? 8 years 8 months ago #490

No, manual mode was disabled on iscsi-ha (manual-mode-disable).

How to properly change the pool master ? 8 years 4 months ago #635

Hello,

I have the same problem with my servers.
I converted the disks to HA storage during installation.

The suspicious part is that the converted storage appears as sdb, while the disk it was converted from is sda (with sda1 containing the root file system).

Yours sincerely

CadilLACi
