TOPIC:

PLEASE HELP No iSCSI SR After Power Cut 7 years 3 months ago #1150

  • tyh-chris
We have already lost the slave (see my other post), but a power cut just now has taken out the iSCSI storage on the master too!

Please let's ignore the slave for now and focus on getting the iSCSI SR back on the master so that I can at least get some VMs running again.

iscsi-cfg reports the following:
--------------------------------
| iSCSI-HA Status              |
| Wed Jan 11 16:13:48 GMT 2017 |
--------------------------------
--------------------------------------------------------
| iSCSI-HA Status: Running 2496                        |
| Last Updated: Wed Jan 11 16:13:42 GMT 2017           |
| HOST ROLE:              MASTER                       |
| DRBD ROLE:              iscsi1=Primary               |
| DRBD CONNECTION:        iscsi1 in WFConnection state |
| ISCSI TARGET:           Running [expected running]   |
| VIRTUAL IP:             10.10.10.3 is local          |
--------------------------------------------------------
Control + C to exit


---------------
| DRBD Status |
---------------
^C
[root@xenserver-primary ~]#

cat /proc/drbd reports the following:
[root@xenserver-primary ~]# cat /proc/drbd
version: 8.4.3 (api:1/proto:86-101)
srcversion: FB3AC7056350AC64629E395

 1: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r-----
    ns:0 nr:0 dw:0 dr:25176 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:39197276
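
For anyone reading along, here is a small sketch that pulls the three fields that matter out of that status line: connection state (cs), roles (ro) and disk states (ds). In the output above, WFConnection means this node is waiting for its peer, and Primary/Unknown with UpToDate/DUnknown means the local side is Primary with up-to-date data while the peer's state is unknown. The helper name drbd_fields is made up for illustration and is not part of DRBD; it assumes the DRBD 8.4 style /proc/drbd layout shown in this post.

```shell
#!/bin/sh
# Sketch only: drbd_fields is a hypothetical helper, not a DRBD tool.
# Reads /proc/drbd-style text on stdin and prints "cs ro ds" per device line.
drbd_fields() {
    awk '/(^| )cs:/ {
        cs = ""; ro = ""; ds = ""
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^cs:/) cs = substr($i, 4)   # connection state
            if ($i ~ /^ro:/) ro = substr($i, 4)   # local/peer role
            if ($i ~ /^ds:/) ds = substr($i, 4)   # local/peer disk state
        }
        print cs, ro, ds
    }'
}

# Sample line copied from the output in this post.
sample=' 1: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r-----'
printf '%s\n' "$sample" | drbd_fields
```

On a live host the same function could be fed with `drbd_fields < /proc/drbd`.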

ha-cfg status reports the following:
[root@xenserver-primary ~]# ha-cfg status
-------------------------------------------------------------------
| ha-lizard Version:   2.1.2                                      |
| Operating Mode:      Mode [ 2 ] Managing Individual VMs in Pool |
| Host Role:           master                                     |
| Pool UUID:           16110ca2-8ddd-b81d-157f-ae7068593995       |
| Host UUID:           350620d0-7781-4fd2-9d33-397625efa0a5       |
| Master UUID:         350620d0-7781-4fd2-9d33-397625efa0a5       |
| Daemon Status:       [  OK  ]ha-lizard running: 2567            |
| Watchdog Status:     [  OK  ]ha-lizard-watchdog running: 2432   |
| HA Enabled:          true                                       |
-------------------------------------------------------------------
Pool HA Status: ENABLED
ha-lizard is enabled. Disable? <yes or Enter to quit>

service iscsi status reports the following:
[root@xenserver-primary ~]# service iscsi status
Redirecting to /bin/systemctl status  iscsi.service
● iscsi.service - Login and scanning of iSCSI devices
   Loaded: loaded (/usr/lib/systemd/system/iscsi.service; disabled; vendor preset: disabled)
   Active: active (exited) since Wed 2017-01-11 15:21:25 GMT; 55min ago
     Docs: man:iscsid(8)
           man:iscsiadm(8)
  Process: 8060 ExecStart=/sbin/iscsiadm -m node --loginall=automatic (code=exited, status=8)
  Process: 8053 ExecStart=/usr/libexec/iscsi-mark-root-nodes (code=exited, status=0/SUCCESS)
 Main PID: 8060 (code=exited, status=8)
   CGroup: /system.slice/iscsi.service

service drbd status reports the following:
[root@xenserver-primary ~]# service drbd status
drbd driver loaded OK; device status:
version: 8.4.3 (api:1/proto:86-101)
srcversion: FB3AC7056350AC64629E395
m:res     cs            ro               ds                 p  mounted  fstype
1:iscsi1  WFConnection  Primary/Unknown  UpToDate/DUnknown  C

When I try to repair the HA iSCSI storage repository, I get "The storage repository is not available".


PLEASE HELP No iSCSI SR After Power Cut 7 years 3 months ago #1151

  • tyh-chris
I know you guys are only really available in the evening (GMT), so if you could please leave me a load of different things to try the following morning, that would be great. Hopefully this is a simple scenario for you.

PLEASE HELP No iSCSI SR After Power Cut 7 years 3 months ago #1152

  • Salvatore Costantino
Here are 2 things to try:

1) Rule out the firewall:
systemctl stop iptables

2) Start the services by hand to ensure they come up in the right order:
service iscsi-ha-watchdog stop
service iscsi-ha stop
systemctl stop tgtd
service drbd stop

service drbd start
systemctl start tgtd
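
The same sequence can be wrapped in a small script that stops at the first step that fails, so it is obvious which service is the problem. This is only a sketch: DRY_RUN, run_step and restart_storage_stack are illustrative names, not part of iscsi-ha, and the commands are exactly the ones listed above. By default it only prints what it would run.

```shell
#!/bin/sh
# Sketch only: DRY_RUN, run_step and restart_storage_stack are hypothetical names.
# With DRY_RUN=1 (the default) nothing is executed, the steps are only printed.
DRY_RUN=${DRY_RUN:-1}

run_step() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        echo "running: $*"
        # Report the failing command and propagate the failure to the caller.
        "$@" || { echo "FAILED: $*" >&2; return 1; }
    fi
}

restart_storage_stack() {
    # Stop from the top of the stack down, then start from the bottom up.
    run_step service iscsi-ha-watchdog stop &&
    run_step service iscsi-ha stop &&
    run_step systemctl stop tgtd &&
    run_step service drbd stop &&
    run_step service drbd start &&
    run_step systemctl start tgtd
}

restart_storage_stack
```

Setting DRY_RUN=0 would execute the commands for real, so only do that on the host you intend to restart.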

PLEASE HELP No iSCSI SR After Power Cut 7 years 3 months ago #1153

  • tyh-chris
Thank you for this, I will try this out tomorrow.

Once I have run these commands, do I try the "Repair" feature in XenCenter to get the storage back online? Is there anything I should look out for to ensure that it's all going in the right direction?

PLEASE HELP No iSCSI SR After Power Cut 7 years 3 months ago #1154

  • tyh-chris
Here are the results of the commands:


xenserver-primary:
[root@xenserver-primary ~]# systemctl stop iptables
[root@xenserver-primary ~]# service iscsi-ha-watchdog stop
Stopping iscsi-ha-watchdog (via systemctl):                [  OK  ]
[root@xenserver-primary ~]# service iscsi-ha stop
Stopping iscsi-ha (via systemctl):                         [  OK  ]
[root@xenserver-primary ~]# systemctl stop tgtd
[root@xenserver-primary ~]# service drbd stop
Stopping all DRBD resources: .
[root@xenserver-primary ~]# service drbd start
Starting DRBD resources: [
     create res: iscsi1
   prepare disk: iscsi1
    adjust disk: iscsi1
     adjust net: iscsi1
]
..........
***************************************************************
 DRBD's startup script waits for the peer node(s) to appear.
 - If this node was already a degraded cluster before the
   reboot, the timeout is 0 seconds. [degr-wfc-timeout]
 - If the peer was available before the reboot, the timeout
   is 0 seconds. [wfc-timeout]
   (These values are for resource 'iscsi1'; 0 sec -> wait forever)
 To abort waiting enter 'yes' [  66]:
.
[root@xenserver-primary ~]# systemctl start tgtd
Job for tgtd.service failed because the control process exited with error code. See "systemctl status tgtd.service" and "journalctl -xe" for details.
[root@xenserver-primary ~]#

xenserver-secondary:
[root@xenserver-secondary ~]# service iscsi-ha-watchdog stop
Stopping iscsi-ha-watchdog (via systemctl):                [  OK  ]
[root@xenserver-secondary ~]# service iscsi-ha stop
Stopping iscsi-ha (via systemctl):                         [  OK  ]
[root@xenserver-secondary ~]# systemctl stop tgtd
[root@xenserver-secondary ~]# service drbd stop
Stopping all DRBD resources: Resource unknown
.
[root@xenserver-secondary ~]# service drbd start
Starting DRBD resources: [
     create res: iscsi1
   prepare disk: iscsi1
    adjust disk: iscsi1
     adjust net: iscsi1
]
.
[root@xenserver-secondary ~]# systemctl start tgtd
Job for tgtd.service failed because the control process exited with error code. See "systemctl status tgtd.service" and "journalctl -xe" for details.

And the result of systemctl status:
[root@xenserver-primary ~]# systemctl status tgtd.service
● tgtd.service - tgtd iSCSI target daemon
   Loaded: loaded (/usr/lib/systemd/system/tgtd.service; disabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/tgtd.service.d
           └─local.conf
   Active: failed (Result: exit-code) since Thu 2017-01-12 08:54:56 GMT; 3min 14s ago
  Process: 5334 ExecStop=/usr/sbin/tgtadm --op delete --mode system (code=exited, status=0/SUCCESS)
  Process: 5304 ExecStop=/usr/sbin/tgt-admin --update ALL -c /dev/null (code=exited, status=0/SUCCESS)
  Process: 5301 ExecStop=/usr/sbin/tgtadm --op update --mode sys --name State -v offline (code=exited, status=0/SUCCESS)
  Process: 5216 ExecStartPost=/usr/sbin/tgt-admin -e -c $TGTD_CONFIG (code=exited, status=22)
  Process: 5214 ExecStartPost=/usr/sbin/tgtadm --op update --mode sys --name State -v offline (code=exited, status=0/SUCCESS)
  Process: 4391 ExecStartPost=/bin/sleep 5 (code=exited, status=0/SUCCESS)
  Process: 4390 ExecStart=/usr/sbin/tgtd -f $TGTD_OPTS (code=exited, status=0/SUCCESS)
 Main PID: 4390 (code=exited, status=0/SUCCESS)
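
In the status output above, the tgtd daemon itself exited with status 0, while the ExecStartPost step that runs tgt-admin against $TGTD_CONFIG exited with status 22. That suggests the failure is in applying the target configuration rather than in the daemon, although the output does not say what status 22 means. A rough sketch for picking the non-zero steps out of this kind of output follows; failed_steps is a made-up helper name and it assumes the "Process:" line layout shown above.

```shell
#!/bin/sh
# Sketch only: failed_steps is a hypothetical helper, not part of systemd or tgt.
# Reads "systemctl status" text on stdin and prints the Process: entries
# whose exit status is anything other than 0/SUCCESS.
failed_steps() {
    awk '/Process:/ && /status=/ && !/status=0\/SUCCESS/ {
        sub(/^ *Process: *[0-9]+ */, "")   # drop the "Process: <pid>" prefix
        print
    }'
}

# Two sample lines copied from the output in this post.
sample='  Process: 5216 ExecStartPost=/usr/sbin/tgt-admin -e -c $TGTD_CONFIG (code=exited, status=22)
  Process: 4390 ExecStart=/usr/sbin/tgtd -f $TGTD_OPTS (code=exited, status=0/SUCCESS)'
printf '%s\n' "$sample" | failed_steps
```

On a live host the input would come from `systemctl status tgtd.service | failed_steps`.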

I tried to "Repair" the storage repository after starting drbd and got the same thing as before.

PLEASE HELP No iSCSI SR After Power Cut 7 years 3 months ago #1157

  • Salvatore Costantino
It looks like the root cause is that the iSCSI target is failing to start on your master.