TOPIC:

CRITICAL ERROR - DRBD Resource: iscsi1 in [StandAlone] state - expected 3 months 4 weeks ago #2479

  • Management
  • Topic Author
  • Posts: 5
This message was blocked by the administrator.

Last edit: by Management.

CRITICAL ERROR - DRBD Resource: iscsi1 in [StandAlone] state - expected 3 months 4 weeks ago #2480

  • Salvatore Costantino
  • Posts: 696
It appears as though you may have 2 issues to address.

1 - Why your VMs continue to start after you have disabled HA for the VM. By default (on a new installation), ALL VMs are treated as having HA enabled if HA is enabled for the pool. This behavior is controlled by the setting GLOBAL_VM_HA, which is set to 1 on a new installation. If you need more control over which VMs have HA enabled, set it to 0. The per-VM true/false setting will then be respected when deciding whether a VM gets started automatically. From the CLI (on either host), issue the following command:
ha-cfg set global_vm_ha 0

2 - Regarding your DRBD error: DRBD is not replicating. Either your replication link is broken or DRBD has entered split brain. If you are sure which host has the latest data, and have confirmed that your replication network is OK, you can run the following script on both hosts to resolve the DRBD error:
/etc/iscsi-ha/scripts/drbd-sb-tool
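
Before running the tool, it can help to confirm what DRBD itself reports for the resource. The following is a minimal sketch, assuming the standard drbdadm utility is installed on the host and that the resource is named iscsi1 as in the error message. The classify_cstate helper and its hint texts are illustrative only and are not part of iscsi-ha.

```shell
#!/bin/sh
# Map a DRBD connection state to a short hint (hypothetical helper).
classify_cstate() {
    case "$1" in
        Connected)               echo "ok: replication link is up" ;;
        StandAlone)              echo "not replicating: check the link, then suspect split brain" ;;
        WFConnection|Connecting) echo "waiting for peer: check the replication network" ;;
        SyncSource|SyncTarget)   echo "resync in progress" ;;
        *)                       echo "unrecognized state: $1" ;;
    esac
}

RESOURCE="${RESOURCE:-iscsi1}"   # resource name taken from the error message

# Only query DRBD when drbdadm is present on this host.
if command -v drbdadm >/dev/null 2>&1; then
    state="$(drbdadm cstate "$RESOURCE" 2>/dev/null)" || state=""
    echo "$RESOURCE: $state ($(classify_cstate "$state"))"
fi
```

Running this on both hosts shows whether each side considers itself disconnected. It only reads state and changes nothing.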

CRITICAL ERROR - DRBD Resource: iscsi1 in [StandAlone] state - expected 3 months 3 weeks ago #2481

  • Management
  • Topic Author
  • Posts: 5
This message was blocked by the administrator.

Last edit: by Management.

CRITICAL ERROR - DRBD Resource: iscsi1 in [StandAlone] state - expected 3 months 2 weeks ago #2483

  • Salvatore Costantino
  • Posts: 696
Hi Marcos,
Regarding Item 1: ha-cfg set global_vm_ha 0 has no effect on replication at all. It simply controls which VMs will be auto-started (with or without an HA event). Please note that a 2-node hyper-converged cluster is achieved by running 2 of our packages, both of which are installed by our automatic installer. HA-Lizard is the HA component. It replaces the XenServer/XCP native HA because it supports HA in 2-node pools. ISCSI-ha is the replication and storage management component. All "ha-cfg" commands are used solely to manage HA-Lizard. ISCSI-ha has its own CLI tool, "iscsi-cfg".

- Setting global_vm_ha to 1 (the default shipped setting) treats all VMs as having HA enabled whenever HA is enabled for the pool. So, if HA is disabled for the pool, there is no effect. If HA is enabled, all VMs will be started whenever they are not running (with or without an HA event).

- Setting global_vm_ha to 0, as I suggested, allows you to explicitly set which VMs should have HA and which should not. This is done either through XenCenter/XCPcenter by exposing one custom DB parameter (please see the HA-Lizard admin guide for how to do this) or from the CLI with "ha-cfg set-vm-ha <vm-name> <true|false>". set-vm-ha is only respected when global_vm_ha is set to 0 and lets you control HA individually per VM. You can also view the current setting for each VM with "ha-cfg get-vm-ha".
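
The per-VM workflow described above could be sketched as follows. The VM names are placeholders, and the run wrapper with its DRY_RUN switch is a hypothetical helper that only prints the commands by default. The ha-cfg commands themselves are the ones quoted in this post. ha-cfg may prompt for confirmation, so treat this as an outline rather than a tested script.

```shell
#!/bin/sh
# Print each command by default; set DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Set HA for one VM. Only respected when global_vm_ha is 0.
set_vm_ha() {
    run ha-cfg set-vm-ha "$1" "$2"
}

run ha-cfg set global_vm_ha 0     # stop treating every VM as HA-enabled
set_vm_ha "critical-vm" true      # placeholder VM name
set_vm_ha "test-vm" false         # placeholder VM name
run ha-cfg get-vm-ha              # review the per-VM settings
```

Leaving DRY_RUN at its default makes it possible to review the exact command sequence before anything is changed on the pool.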

Regarding Item 2: I still think your replication is broken, given the warning we are putting out. This explanation may clarify things a bit. The storage is exposed only on the master host, so a VM running on the slave is actually running from the storage on the master. As a result, all VMs will function as normal even if the slave's replication has broken (as in your case). If you were to have an HA event while operating in this state, your slave's storage would not be up to date and could not take over. You really need to address the warning, which is telling you that replication is not currently working. This also means your test is flawed, since it only writes to and reads from the storage on the master.

To fix this, first check your replication link to ensure that both hosts can communicate across it. If that is OK, you are very likely experiencing a DRBD split brain, a safety mechanism built into DRBD that stops replication in certain crash situations. If so, you can recover from the warning by running /etc/iscsi-ha/scripts/drbd-sb-tool. Please be very careful when running this tool, as you will need to positively identify which node has the most up-to-date data. This should be your master, so long as you are not running ISCSI-ha in manual mode.
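
To see which connection state, role and disk state each host currently reports before deciding which node to keep, one option is to read the DRBD status line on both hosts. This is a minimal sketch, assuming DRBD 8.x, which exposes its status in /proc/drbd. The drbd_field helper is hypothetical and is not part of iscsi-ha.

```shell
#!/bin/sh
# Extract one field (cs, ro or ds) from a /proc/drbd status line.
# $1 = field name, $2 = status line
drbd_field() {
    printf '%s\n' "$2" | sed -n "s/.*[[:space:]]$1:\([^[:space:]]*\).*/\1/p"
}

# Report every resource line found on this host, if DRBD 8.x is present.
if [ -r /proc/drbd ]; then
    grep ' cs:' /proc/drbd | while IFS= read -r line; do
        echo "connection:         $(drbd_field cs "$line")"
        echo "roles (local/peer): $(drbd_field ro "$line")"
        echo "disks (local/peer): $(drbd_field ds "$line")"
    done
fi
```

For a line in the DRBD 8.x format such as " 0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown", the helper yields StandAlone, Primary/Unknown and UpToDate/DUnknown respectively. The output does not decide which node holds the newest data. It only makes the current roles visible so they can be compared with what you expect the master to be.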

Last edit: by Salvatore Costantino.

CRITICAL ERROR - DRBD Resource: iscsi1 in [StandAlone] state - expected 3 months 2 weeks ago #2487

  • Management
  • Topic Author
  • Posts: 5
This message was blocked by the administrator.

Last edit: by Management.

CRITICAL ERROR - DRBD Resource: iscsi1 in [StandAlone] state - expected 3 months 4 days ago #2491

  • Salvatore Costantino
  • Posts: 696
Storage system status is better viewed with:

iscsi-cfg status


If you are running the latest versions of ha-lizard and iscsi-ha, you can get a full status report for all services with:
ha-cfg cluster-status