TOPIC: Login I/O error failed to receive a PDU

Login I/O error failed to receive a PDU 3 weeks 6 days ago #1631

Hi,

After a power failure, the servers lost connectivity to the SR, all VM VDIs disappeared, and it is now impossible to reconnect the SR. One server lost all of its NICs, so we had to reinstall XenServer to recover the network. We are using v7.1.

We found that there had been a small split brain, but the drives are now synced. Both servers see each other, the SR IP responds to ping, and even telnet connects, but the connection is closed immediately.


All configuration seems OK, but we cannot reconnect the iSCSI LUN, which we need before we can try to recover the VDIs.

Here is some info:

# xe sr-probe type=lvmoiscsi device-config:target=10.10.10.3
Error code: SR_BACKEND_FAILURE_68
Error parameters: , ISCSI login failed, verify CHAP credentials,

# iscsiadm -m discovery -t st -p 10.10.10.3
iscsiadm: Connection to Discovery Address 10.10.10.3 failed
iscsiadm: Login I/O error, failed to receive a PDU
iscsiadm: connection login retries (reopen_max) 5 exceeded
iscsiadm: Could not perform SendTargets discovery: encountered iSCSI login failure

# telnet 10.10.10.3 3260
Trying 10.10.10.3...
Connected to 10.10.10.3.
Escape character is '^]'.
Connection closed by foreign host.

# service drbd status
drbd driver loaded OK; device status:
version: 8.4.5 (api:1/proto:86-101)
srcversion: D496E56BBEBA8B1339BB34A
m:res cs ro ds p mounted fstype
1:iscsi1 Connected Secondary/Primary UpToDate/UpToDate C


# service iscsi-ha status
iscsi-ha running: 16262 [ OK ]

# ha-cfg status
| ha-lizard Version: 2.1.4 |
| Operating Mode: Mode [ 2 ] Managing All VMs in Pool |
| Host Role: master |
| Pool UUID: e1dc6ec5-9fdd-a78d-e526-175461da4829 |
| Host UUID: cb0e5938-3432-4121-b317-f14ea1a92676 |
| Master UUID: cb0e5938-3432-4121-b317-f14ea1a92676 |
| Daemon Status: ha-lizard is running [ OK ] |
| Watchdog Status: ha-lizard-watchdog is running [ OK ] |
| HA Enabled: false |
Pool HA Status: DISABLED
ha-lizard is disabled. Enable? <yes or Enter to quit>

# iscsi-cfg status:

| iSCSI-HA Status: Running 16262 |
| Last Updated: Sat Jul 21 10:51:27 CEST 2018 |
| HOST ROLE: MASTER |
| DRBD ROLE: iscsi1=Primary |
| DRBD CONNECTION: iscsi1 in Connected state |
| ISCSI TARGET: Running [expected running] |
| VIRTUAL IP: 10.10.10.3 is local |

| DRBD Status |

| version: 8.4.5 (api:1/proto:86-101) |
| srcversion: D496E56BBEBA8B1339BB34A |
| 1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r
|
| ns:0 nr:0 dw:0 dr:228 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
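
For anyone comparing their own output, the DRBD status line above can be checked mechanically. This is only a minimal sketch: drbd_field and drbd_ready_for_iscsi are hypothetical helper names, and the "healthy" criteria (Connected, local Primary, both sides UpToDate) are an assumption based on the status shown in this post, not an official HA-Lizard check.

```shell
#!/bin/sh
# Hypothetical helper: pull one field (cs, ro or ds) out of a DRBD status line.
# $1 = field name, $2 = status line
drbd_field() {
    printf '%s\n' "$2" | tr ' ' '\n' | sed -n "s/^$1://p" | head -n 1
}

# Hypothetical check (assumed criteria): connected, Primary on this node,
# and both replicas UpToDate.
drbd_ready_for_iscsi() {
    [ "$(drbd_field cs "$1")" = "Connected" ] &&
    [ "$(drbd_field ro "$1")" = "Primary/Secondary" ] &&
    [ "$(drbd_field ds "$1")" = "UpToDate/UpToDate" ]
}

# Status line copied from the iscsi-cfg output in this post.
line='1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r'

if drbd_ready_for_iscsi "$line"; then
    echo "DRBD state is healthy on this node"
else
    echo "DRBD state needs attention before looking at iSCSI"
fi
```

If this passes while iscsiadm still fails, DRBD replication is less likely to be the cause and the iSCSI target configuration is the next place to look.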

Login I/O error failed to receive a PDU 3 weeks 5 days ago #1632

Is the server that was rebuilt the master? If so, did you install using the nosan installer script?

Login I/O error failed to receive a PDU 3 weeks 4 days ago #1633

Hi Salvatore,

We managed to get the LUNs back by reinstalling HA-Lizard using halizard_nosan_installer_2.1.4 and booting both servers afterwards. Is there a script available that only reinstalls HA-Lizard? I mean, one that does not ask to convert the local disk and just remaps the current SR?

It seems that during the split-brain sync, the VM metadata was overwritten or lost; XenCenter does not find any information and wants to format the SR before attaching it.

Is there a way to reattach the SR without formatting it, and then find and remap the VDIs of each machine?

Login I/O error failed to receive a PDU 3 weeks 3 days ago #1634

That installer is only meant to be used for new installations and should not be used on an existing cluster with data. Did you re-install on both the master and the slave?

Did you reformat the storage when re-introducing the iscsi SR back into the pool?

Login I/O error failed to receive a PDU 3 weeks 3 days ago #1635

Hi,

There was no other option. I followed the instructions on how to install on an existing system, but it did not work. The disks were not reformatted, and the SR was not reintroduced through XenServer because it wanted to format the LUN.

Now we have recreated the volume from the backup using the same IDs as before. We managed to connect the servers to iSCSI, and it shows the same available and used space as before. However, the VDIs are not visible; if we click Rescan it shows “The VDI is not available”:

# xe sr-scan uuid=8de2894b-4d64-36c3-2018-16c3b2583cc6
Error code: SR_BACKEND_FAILURE_46
Error parameters: , The VDI is not available [opterr=Error scanning VDI 5000a8fa-95a1-4a66-a671-0fe90271ea94],


Using ls -l /dev/disk/by-id/, we can see all disks from the recovered SR. The problem is how to make them visible in the SR and mount them back, or recover them onto a secondary SR.
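
To narrow down which volume the scan is failing on, the UUIDs from the error can be mapped to LVM names. The sketch below assumes the usual XenServer layout for LVM-based SRs (volume group VG_XenStorage-<SR UUID>, one logical volume VHD-<VDI UUID> per VHD-format VDI); sr_vg_name and vdi_lv_path are hypothetical helper names, and the layout should be verified on the host before relying on it.

```shell
#!/bin/sh
# Assumption: LVM-based SRs use a volume group named VG_XenStorage-<SR UUID>.
# $1 = SR UUID
sr_vg_name() {
    printf 'VG_XenStorage-%s\n' "$1"
}

# Assumption: each VHD-format VDI is a logical volume named VHD-<VDI UUID>.
# $1 = SR UUID, $2 = VDI UUID
vdi_lv_path() {
    printf '/dev/%s/VHD-%s\n' "$(sr_vg_name "$1")" "$2"
}

# UUIDs copied from the sr-scan error in this post.
SR_UUID=8de2894b-4d64-36c3-2018-16c3b2583cc6
VDI_UUID=5000a8fa-95a1-4a66-a671-0fe90271ea94

echo "Volume group to inspect: $(sr_vg_name "$SR_UUID")"
echo "Logical volume behind the failing VDI: $(vdi_lv_path "$SR_UUID" "$VDI_UUID")"

# Read-only inspection on the host (left commented so the sketch has no side effects):
# vgs "$(sr_vg_name "$SR_UUID")"
# lvs "$(sr_vg_name "$SR_UUID")"
```

If lvs shows no VHD-... volumes in that group, the LVM metadata rather than the SR database is what needs recovering.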

Login I/O error failed to receive a PDU 3 weeks 3 days ago #1636

The install script will have erased the old LVM metadata, making it impossible to see each of the LVs. Check whether you have an LVM backup file that can be restored. The LVM config backup is stored somewhere in /etc/lvm.
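
A rough sketch of how to look for such a backup, assuming the stock LVM layout (/etc/lvm/backup/<VG> and /etc/lvm/archive/<VG>_*.vg) and the XenServer volume group name VG_XenStorage-<SR UUID>. list_vg_backups is a hypothetical helper; backups may be disabled in lvm.conf on some hosts, so it may find nothing. The restore commands are left commented out on purpose.

```shell
#!/bin/sh
# Hypothetical helper: list LVM metadata backup/archive files for a volume
# group, newest first. Paths assume the default LVM layout.
# $1 = volume group name, $2 = LVM config directory (defaults to /etc/lvm)
list_vg_backups() {
    vg=$1
    dir=${2:-/etc/lvm}
    ls -t "$dir"/backup/"$vg" "$dir"/archive/"$vg"_*.vg 2>/dev/null
}

# SR UUID copied from this thread; the VG naming is an assumption.
VG=VG_XenStorage-8de2894b-4d64-36c3-2018-16c3b2583cc6

list_vg_backups "$VG" || echo "No LVM metadata backups found for $VG"

# If a suitable file exists, LVM itself can list and restore it. Dry-run first:
# vgcfgrestore --list "$VG"
# vgcfgrestore --test -f /etc/lvm/archive/<chosen file>.vg "$VG"
# vgcfgrestore -f /etc/lvm/archive/<chosen file>.vg "$VG"
```

Restoring metadata only brings the LV definitions back; it does not help if the data blocks themselves were overwritten, so it is safer to try this with the SR detached and a copy of the volume available.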