
TOPIC:

HA-Lizard supported PV not available 5 years 11 months ago #1586

  • Devin Dorminey
  • Posts: 2
Hi,

I have a problem with my XenServer 6.5 HA-Lizard cluster. Due, I think, to an NFS server lock-up, one of my VMs (Windows 2008) pegged CPU 0 and refused to respond to input. I force-killed it, eventually using the 'destroy domain' trick to fully shut it down, but it would not clear its 'amber' status. Google suggested that a reboot of the physical host would clear the issue, so I rebooted the physical host.
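For reference, the 'destroy domain' trick looks roughly like the sketch below. The path of the debug helper is from memory and may differ by XenServer release, so treat it as an assumption; the commands are echoed rather than executed so nothing gets killed by accident.

```shell
# Find the domain ID of the hung VM (list_domains runs in the XenServer dom0)
echo list_domains

# Placeholder -- use the id reported by list_domains for your hung VM
DOMID=12

# Force-destroy the domain by ID; the helper path is an assumption from memory
echo /opt/xensource/debug/destroy_domain -domid "$DOMID"
```

Drop the leading `echo` only once you are sure the domain ID is the right one.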

After a series of reboots, some very fervent prayers and a lot more time than seemed reasonable, the cluster came back up. In the interim, I had rebooted the NFS server and it and its SR were now available. Unfortunately, my HA-iscsi SR was not available. I was able to force it to repair and it mounted up.

However, when I tried to start the VM, I got 'the VDI is not available.' I can see the SR and its listing of VHDs, but a rescan fails as well.

At this point, some more googling led me to try pvscan, with the following results...
[root@xen01 ~]# pvscan
  Couldn't find device with uuid Pf96EV-vSny-eQpw-SLlK-0uzf-Npqa-Uxmdc9.
  PV unknown device   VG VG_XenStorage-905cf1e0-a955-b220-feaf-e4151896e6e0   lvm2 [2.18 TB / 233.72 GB free]
  Total: 1 [2.18 TB] / in use: 1 [2.18 TB] / in no VG: 0 [0   ]

and vgscan reported...
[root@xen01 ~]# vgscan
  Reading all physical volumes.  This may take a while...
  Couldn't find device with uuid Pf96EV-vSny-eQpw-SLlK-0uzf-Npqa-Uxmdc9.
  Found volume group "VG_XenStorage-905cf1e0-a955-b220-feaf-e4151896e6e0" using metadata type lvm2

I've had absolutely no luck finding a way to recover the access to this data.

iSCSI-HA status says...
| VIRTUAL IP:             10.10.10.3 is not local                                |
| ISCSI TARGET:           tgtd is stopped [expected stopped]                     |
| DRBD ROLE:              iscsi1=Secondary                                       |
| DRBD CONNECTION:        iscsi1 in Connected state                              |
----------------------------------------------------------------------------------
Control + C to exit


---------------
| DRBD Status |
---------------
-------------------------------------------------------------------------
| version: 8.4.3 (api:1/proto:86-101)                                   |
| srcversion: 19422058F8A2D4AC0C8EF09                                   |
|  1: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----   |
|     ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0 |
-------------------------------------------------------------------------

DRBD seems happy...

[root@xen01 ~]# drbd-overview
  1:iscsi1/0  Connected Secondary/Primary UpToDate/UpToDate C r-----

I've got a restore from backup running in parallel but would love to recover the data intact since there was work done since the last backup.

Any suggestions or guidance would be most welcome.

---Devin


HA-Lizard supported PV not available 5 years 11 months ago #1588

  • Devin Dorminey
  • Posts: 2
*** Issue resolved ***

I reached out to Sal via the support email and arranged for him to remote into my system. After quite a bit of muttering and exclamations of 'I have never seen THIS happen before...', Sal was able to determine that the metadata for the physical volume housing the HA-iSCSI volume was missing. All of the data was there; CentOS had just lost the instructions on how to find it.

With no small amount of trepidation, and after warnings of 'I'm not sure how this is going to go,' I authorized Sal to try to recreate the PV. As I understand it, he destroyed and recreated the physical volume (it's iSCSI, so nothing was ACTUALLY destroyed) using the original UUID. That allowed the rest of the LVM stack to find and address the data, and restored my access to it.

I wouldn't wish this sort of failure on anyone, but I am very grateful to Sal for his efforts and persistence in troubleshooting and repairing this issue.

---Devin


HA-Lizard supported PV not available 4 years 3 months ago #1948

Hi

I have the same problem. How did you resolve it?

Best regards Marcin



HA-Lizard supported PV not available 4 years 3 months ago #1949

  • Salvatore Costantino
  • Posts: 722
In this case the LVM metadata was missing from the iSCSI backing storage device. I was able to restore the LVM metadata from a backup to resolve the issue. Use caution when restoring metadata: the operation can be destructive if done incorrectly or on the wrong disk/partition.

You should find at least one LVM backup in /etc/lvm/backup
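A sketch of what that restore typically looks like, using the PV UUID and VG name from the pvscan output earlier in this thread, with the backing device as an assumption you must verify on your own system. The commands are echoed rather than executed, since pvcreate against the wrong device is destructive; drop the `echo` only after the `--test` dry run looks right.

```shell
# Values taken from the pvscan/vgscan output earlier in this thread
PV_UUID="Pf96EV-vSny-eQpw-SLlK-0uzf-Npqa-Uxmdc9"
VG="VG_XenStorage-905cf1e0-a955-b220-feaf-e4151896e6e0"
DEV="/dev/sdc"   # assumption: the device that should back the PV

# Dry run first: --test makes no on-disk changes
echo pvcreate --test --uuid "$PV_UUID" --restorefile "/etc/lvm/backup/$VG" "$DEV"

# If the dry run is clean: recreate the PV label, restore the VG metadata, activate
echo pvcreate --uuid "$PV_UUID" --restorefile "/etc/lvm/backup/$VG" "$DEV"
echo vgcfgrestore "$VG"
echo vgchange -ay "$VG"
```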


HA-Lizard supported PV not available 5 months 2 hours ago #3023

After applying a series of patches on both members of my two-node HA-Lizard pool based on XCP-ng 8.2.1, I have lost my PV configuration too. I am prepared to try to restore the LVM backup, as this seems to be the only way to get back a running pool.

Is there any detailed advice on how to do this? Should I do it only on the primary iscsi-ha node? Should I disable the DRBD services or TGT?
Any help today is welcome, as production starts on Monday morning at 6 am, and without this pool nothing can be produced.

In the meantime I was trying to restore the metadata from the LVM backup file, but the underlying /dev/sdc is not found.
[16:59 IT2XCP-NG-MASTER1 ~]# pvcreate --test --uuid "3cxgfb-3W4w-Dx2Q-l1p2-4KpH-kjn7-mfAyjS" --restorefile /etc/lvm/backup/VG_XenStorage-3e6b208e-bbbc-170b-402c-f272c612be0a /dev/sdc
  TEST MODE: Metadata will NOT be updated and volumes will not be (de)activated.
  Couldn't find device with uuid 3cxgfb-3W4w-Dx2Q-l1p2-4KpH-kjn7-mfAyjS.
  Device /dev/sdc not found.
[16:59 IT2XCP-NG-MASTER1 ~]# blkid 
/dev/sda1: LABEL="root-hlgnmj" UUID="f52447ce-cea3-44e8-86a4-1d2ab98da098" TYPE="ext3" PARTUUID="b3b6b29b-814a-4564-8dc5-88d6e556f290" 
/dev/sda3: UUID="h0N0HV-wfbn-WpSf-Cc03-Dp2f-qRPz-j29723" TYPE="LVM2_member" PARTUUID="6118b5ae-d778-4973-9927-61777d596806" 
/dev/sda5: LABEL="logs-hlgnmj" UUID="824debcc-f29d-4081-99d0-ed92933b01fc" TYPE="ext3" PARTUUID="989c2b87-c8ea-4633-92cb-81307eec547d" 
/dev/sda6: LABEL="swap-hlgnmj" UUID="b042269f-0a4e-491d-8eec-93b53ce55868" TYPE="swap" PARTUUID="01b71e1c-2020-4ec4-a4c2-2ba99427c29f" 
/dev/sdb: UUID="a8457c76650ccb45" TYPE="drbd" 
/dev/drbd1: UUID="3cxgfb-3W4w-Dx2Q-l1p2-4KpH-kjn7-mfAyjS" TYPE="LVM2_member" 
/dev/sda2: PARTUUID="41bcab07-fa08-49ec-b738-4c1c9c7559ca" 
/dev/sda4: PARTUUID="d4f4703d-d47a-494c-bafd-e263214f0c6b" 
[16:59 IT2XCP-NG-MASTER1 ~]# lsblk 
NAME                                                            MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sdb                                                               8:16   0   2.2T  0 disk 
└─drbd1                                                         147:1    0   2.2T  0 disk 
sda                                                               8:0    0 223.1G  0 disk 
├─sda4                                                            8:4    0   512M  0 part 
├─sda2                                                            8:2    0    18G  0 part 
├─sda5                                                            8:5    0     4G  0 part /var/log
├─sda3                                                            8:3    0 181.6G  0 part 
│ └─VG_XenStorage--70b3650b--7c44--83ef--717c--9753f920e7ee-MGT 253:0    0     4M  0 lvm  
├─sda1                                                            8:1    0    18G  0 part /
└─sda6                                                            8:6    0     1G  0 part [SWAP]
[16:59 IT2XCP-NG-MASTER1 ~]# 

/dev/sdb and /dev/drbd1 are there, but they are filtered out in /etc/lvm/lvm.conf.

/dev/drbd1 has the same UUID as the previously available /dev/sdc mentioned in /etc/lvm/backup:
creation_host = "IT2XCP-NG-MASTER1"	# Linux IT2XCP-NG-MASTER1 4.19.0+1 #1 SMP Wed Aug 9 11:41:08 CEST 2023 x86_64
creation_time = 1696952771	# Tue Oct 10 17:46:11 2023

VG_XenStorage-3e6b208e-bbbc-170b-402c-f272c612be0a {
	id = "u3c28Z-mYJ0-YuYh-DaX7-s3Ns-mKJK-R475bT"
	seqno = 42
	format = "lvm2"			# informational
	status = ["RESIZEABLE", "READ", "WRITE"]
	flags = []
	extent_size = 8192		# 4 Megabytes
	max_lv = 0
	max_pv = 0
	metadata_copies = 0

	physical_volumes {

		pv0 {
			id = "3cxgfb-3W4w-Dx2Q-l1p2-4KpH-kjn7-mfAyjS"
			device = "/dev/sdc"	# Hint only

			status = ["ALLOCATABLE"]
			flags = []
			dev_size = 4684108112	# 2.18121 Terabytes
			pe_start = 22528
			pe_count = 571787	# 2.18119 Terabytes
		}
	}

So how do I get back /dev/sdc in order to restore the LVM metadata?

Or is there any other approach available?
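For reference, here is a read-only way to confirm what LVM is actually allowed to see before touching anything (the commands are echoed so nothing changes; the config path is the stock dom0 location and an assumption for other setups):

```shell
LVM_CONF="/etc/lvm/lvm.conf"   # assumption: stock XCP-ng dom0 location

# Show the device filters that hide sdb/drbd1 from LVM
echo grep -E "filter" "$LVM_CONF"
echo lvmconfig devices/filter

# List the PVs LVM can currently see, with their UUIDs
echo pvs -o pv_name,pv_uuid,vg_name
```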

I have also sent you a professional support request e-mail via the contact form.

BR Andreas



HA-Lizard supported PV not available 4 months 4 weeks ago #3024

  • Salvatore Costantino
  • Posts: 722
Are you certain that TGT is running? Once the iSCSI target starts, it should expose/create the missing sdc.
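A quick way to check, sketched under the assumption that the target daemon is tgtd and that the initiator connects via the iscsi-ha virtual IP (10.10.10.3 in the first post; substitute your own). The commands are echoed so nothing is restarted or rescanned by accident.

```shell
VIP="10.10.10.3"   # assumption: your iscsi-ha virtual IP

# Is the target daemon up, and is it exporting the DRBD device?
echo systemctl status tgtd
echo tgtadm --lld iscsi --mode target --op show

# Is the local initiator logged in? A (re)login is what recreates sdc.
echo iscsiadm -m session
echo iscsiadm -m discovery -t sendtargets -p "$VIP"
```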
