TOPIC:

Failed to locate PBD for SR 9 years 10 months ago #231

Hi All,

Looking for some help with a “Failed to locate PBD for SR” error. It doesn’t seem to be causing much of an issue currently, but I’m worried it may one day just fail, so could anyone shed some light on the situation?

Environment:
2-node HA XenServer pool with shared local storage.
iSCSI-HA 1.3.7
XCP 1.6
HA-Lizard 1.7.6
DRBD 8.3

We have successfully implemented the design in the HA-Lizard reference guide. We have been running 7 VMs on this system for 3 months now without issue.

In the past week we have begun to receive this alert:

iscsi-ha failed to spawn new instance after 6 attempts. MAX_STARTS is set to 5. Check Host

I have checked the iSCSI services using the “iscsi-cfg status” command and all are running fine.
All our VMs are running as expected at the moment as well.
However, the real-time log (“iscsi-cfg log”) is showing the messages below.

Volume group for iscsi-sr – not found attempting to re-plug
Failed to locate PBD for SR


Log below:

Jun 9 11:34:42 HOST02 iscsi-ha: 2795 iscsi-ha Watchdog: iscsi-ha running - OK
Jun 9 11:34:49 HOST02 iscsi-ha: 2116 Spawning new instance of iscsi-ha
Jun 9 11:34:49 HOST02 iscsi-ha: 2116 check_logger_processes Checking logger processes
Jun 9 11:34:50 HOST02 iscsi-ha: 2116 check_logger_processes No processes to clear
Jun 9 11:34:50 HOST02 iscsi-ha: 2601 Mail Spool Directory Found /dev/shm/iscsi-ha-mail
Jun 9 11:34:50 HOST02 iscsi-ha: 2601 This iteration is count 729
Jun 9 11:34:50 HOST02 iscsi-ha: 2601 Checking if this host is a Pool Master or Slave
Jun 9 11:34:50 HOST02 iscsi-ha: 2601 This host's pool status = master
Jun 9 11:34:50 HOST02 iscsi-ha: 2609 auto_plug_pbd: Found LVMoISCSI SR List: 213e4416-6582-310b-0e71-0f3d55849588,c5ef58d0-c42d-8060-99bc-f32fef25ebf1
Jun 9 11:34:50 HOST02 iscsi-ha: 2609 DRBD Running on this host: version: 8.3.15 (api:88/proto:86-97) GIT-hash: 0ce4d235fc02b5c53c1c52c53433d11a694eab8c build by root@XCP-HA1, 2013-05-20 21:36:35 1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r
ns:1148469720 nr:976700164 dw:2125169692 dr:1052178120 al:8299213 bm:58809 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
Jun 9 11:34:50 HOST02 iscsi-ha: 2609 check_drbd_resource_state: DRBD Resource: iscsi1 in Primary mode
Jun 9 11:34:50 HOST02 iscsi-ha: 2609 DRBD Resource: iscsi1 in Connected state
Jun 9 11:34:50 HOST02 iscsi-ha: 2609 iSCSI target: /etc/init.d/tgtd status = OK. [tgtd (pid 3158 3157) is running...]
Jun 9 11:34:50 HOST02 iscsi-ha: 2609 local_ip_list: Local IP list returned
Jun 9 11:34:50 HOST02 iscsi-ha: 2609 Virtual IP: x.x.x.x discovered on host LCG-LHC-vSERV-HOST02
Jun 9 11:34:51 HOST02 iscsi-ha-2596-INFO-/etc/iscsi-ha/iscsi-ha.sh: Scanning for Volume Group -> iscsi-sr: 213e4416-6582-310b-0e71-0f3d55849588,c5ef58d0-c42d-8060-99bc-f32fef25ebf1
Jun 9 11:34:51 HOST02 iscsi-ha-2596-INFO-/etc/iscsi-ha/iscsi-ha.sh: Volume Group for iSCSI-SR 213e4416-6582-310b-0e71-0f3d55849588,c5ef58d0-c42d-8060-99bc-f32fef25ebf1 not found - attemping to re-plug
Jun 9 11:34:51 HOST02 iscsi-ha-2596-INFO-/etc/iscsi-ha/iscsi-ha.sh: Failed to locate PBD for SR: 213e4416-6582-310b-0e71-0f3d55849588,c5ef58d0-c42d-8060-99bc-f32fef25ebf1 on HOST: dc737d74-33ff-44d9-8848-16d3eab73696


Failed to locate PBD for SR 9 years 10 months ago #232

Can you reply with the following:

1 - how frequently do you receive the email alert?

2 - has your configuration changed since the cluster was deployed? Possibly an additional iSCSI SR was added? The replug_pbd process appears to have a malformed UUID for the iSCSI SR. It looks like there are 2 UUIDs being treated as a single entity. This could be a bug.

3 - run this script and post the output here:
/etc/iscsi-ha/scripts/replug_pbd


Failed to locate PBD for SR 9 years 10 months ago #233

Hi,

Yes, you are correct. I have just noticed that a second SR has been added in the last few weeks.

The UUIDs (213e4416 & c5ef58d0) in the previous logs match the two attached SRs.

We get the alert every few days.

I will post the output from the script you mentioned later today.

Thanks
Mal


Failed to locate PBD for SR 9 years 10 months ago #234

OK - that is what I thought. This is a bug in our replug_pbd script, which is not properly parsing the 2 UUIDs and is instead treating them as a single string. I will send you a patch shortly to try, and will roll it into a patch release.
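
For anyone hitting the same log message, here is a rough sketch of the kind of parsing involved. This is not the actual patch and not the contents of replug_pbd. The helper names split_sr_list and find_pbds_for_host are made up for illustration, and the lookup assumes the standard xe CLI (xe pbd-list with sr-uuid, host-uuid and --minimal) is available on the host. The idea is simply to split the comma-separated SR list first and query each UUID on its own.

```shell
# Hypothetical helper (not part of iscsi-ha): print each UUID from a
# comma-separated list on its own line, dropping empty entries.
split_sr_list() {
    printf '%s\n' "$1" | tr ',' '\n' | sed '/^$/d'
}

# Hypothetical helper: look up the PBD for each SR separately instead of
# passing the whole comma-separated list as if it were a single UUID.
# Assumes the xe CLI is on PATH; it is only called when this function runs.
find_pbds_for_host() {
    sr_list="$1"
    host_uuid="$2"
    for sr_uuid in $(split_sr_list "$sr_list"); do
        pbd_uuid=$(xe pbd-list sr-uuid="$sr_uuid" host-uuid="$host_uuid" --minimal)
        if [ -z "$pbd_uuid" ]; then
            echo "Failed to locate PBD for SR: $sr_uuid on HOST: $host_uuid" >&2
        else
            echo "$sr_uuid $pbd_uuid"
        fi
    done
}
```

Calling find_pbds_for_host with the SR list and host UUID from the log above would then run one xe pbd-list query per SR rather than one query against the fused string.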


Failed to locate PBD for SR 9 years 10 months ago #235

Great. Thanks very much.

We'll deploy the patch ASAP and let you know the result.

Thanks
Mal


Failed to locate PBD for SR 9 years 10 months ago #236

Please replace your current /etc/iscsi-ha/scripts/replug_pbd with the attached. We duplicated the issue by adding a second SR to our dev environment and tested this updated file.

A patch release, version 1.4.3, will be posted shortly. It resolves this bug and another display bug that occurs when calling status from the CLI.


Last edit: by Pulse Supply. Reason: resolved