XenServer 6.5 9 years 2 weeks ago #392

  • Salvatore Costantino
Regarding the performance/sync query:
The "cat /proc/drbd" output posted in your note shows the progress of the initial sync of the disks. When a DRBD resource is connected for the first time, the entire contents of the primary device must be copied to the secondary. This initial sync occurs only once, when the resource first becomes live. The time it takes depends on your replication network speed and the size of the block device. It is common for this to take several hours, especially for block devices in the 1TB+ range. So, what you saw is perfectly normal.
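
For a rough feel for why this takes hours, here is a minimal sketch of the arithmetic. The function name and the sustained rate are assumptions for illustration only. Real throughput depends on the disks, the replication link, and the configured syncer rate.

```python
def estimate_initial_sync_hours(device_gib: float, rate_mib_per_s: float) -> float:
    """Estimate the hours needed to copy a whole block device at a steady rate."""
    if device_gib < 0 or rate_mib_per_s <= 0:
        raise ValueError("size must be >= 0 and rate must be > 0")
    total_mib = device_gib * 1024  # GiB -> MiB
    seconds = total_mib / rate_mib_per_s
    return seconds / 3600  # seconds -> hours


if __name__ == "__main__":
    # Assumed example: a 1 TiB device over a link sustaining about 100 MiB/s.
    hours = estimate_initial_sync_hours(1024, 100)
    print(f"about {hours:.1f} hours")
```

At slower sustained rates the estimate grows proportionally, which is why multi-terabyte devices can take most of a day.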

While initial sync is happening, it is possible to write to the resource, as you did by creating a VM. This works transparently with the only side effect being somewhat reduced performance due to the ongoing initial sync being performed in the background.

You can verify this by waiting for the sync to complete and then creating a new VM. While the VM is being created (OS being written) you can watch DRBD status in real time with "watch cat /proc/drbd". The output should confirm that replication is happening in real time with negligible delay between the primary and secondary hosts.

Regarding the issue with the uppercase characters in the hostname:
I would like to better understand if this was an input error when running the script or a bug in our SW. Can you elaborate? From your post I am gathering that the hostnames had some uppercase characters and you entered the hostnames as lower case when prompted by the installer. Please clarify whether this is indeed the case.

Regarding the last issue with the slave being disconnected from the iscsi SR:
This is not normal. Both the master and slave hosts should connect to the SR on the floating IP address. Given that DRBD is working, it is likely NOT a firewall issue. Please check your /etc/lvm/lvm.conf file to ensure that the filter is properly written. If you are unsure, you can simply post the file here.


XenServer 6.5 9 years 2 weeks ago #395

Hi,

I am going to bed now. I will give a better explanation of the problem with "drbd" tomorrow. It is almost certainly a DRBD/Xen problem and not related to your script.

A quick perf check under Win 7 32-bit gave around 99 MB/s for sequential write.

cheers,

Robin


XenServer 6.5 9 years 2 weeks ago #397

Hi,

Here's more detail about the problems I encountered:

After the installation script seemed to finish successfully, I checked the status of all components and saw that the iSCSI interface wouldn't come up.

Searching further, I found an error message from tgtd in daemon.log:
tgtd: backed_file_open(92) Could not open /dev/drbd1, Read-only file system 

So I figured there was a problem with DRBD. Checking with "drbdadmin log", I found that there was a problem with the drbd.conf file. The error was something like:
DRBD resources: WARN: no normal resources defined for this host

So I checked drbd.conf, which looked like this:
global { usage-count no; }
common { syncer { rate 120M; } }
resource iscsi1 {
    protocol C;
    net {
        after-sb-0pri discard-zero-changes;
        after-sb-1pri consensus;
        cram-hmac-alg sha1;
        shared-secret PUTyourSECREThere;
    }
    on host-xen-1 {
        device /dev/drbd1;
        disk /dev/sda3;
        address 10.10.10.1:7789;
        meta-disk internal;
    }
    on host-xen-2 {
        device /dev/drbd1;
        disk /dev/sda3;
        address 10.10.10.2:7789;
        meta-disk internal;
    }
}

On XenServer you have two host names. Since host names, like domain names, are by definition NOT case sensitive, I named my servers "Host-Xen-1" and "Host-Xen-2". The DRBD manual says the server name must correspond to the output of "uname -n" and must be pingable under this name, which was the case. Still, it didn't work.

I also tried changing the server names in drbd.conf to "Host-Xen-1" and "Host-Xen-2", without success.

The only thing that worked was to change the XenServer label and XenServer service-name to lower case, and to change the server names back to lower case in drbd.conf.

The corresponding commands are:
xe host-param-set name-label=host-xen-1 uuid=YOURHOSTUUID
xe host-set-hostname-live host-name=host-xen-1 host-uuid=YOURHOSTUUID

Now the server config was recognized by DRBD. After this I shut down all HA-Lizard services and manually redid all the steps for setting up the DRBD subsystem. After this (and a reboot) the system worked!

BTW: The DRBD commands in the HA-Lizard manual are partially outdated.

After restarting the servers this morning, because all the VMs kept restarting immediately after I shut them down (even with force), I saw the following sync going on:
 1: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
    ns:3986188 nr:0 dw:20316 dr:4494836 al:16 bm:309 lo:1 pe:1 ua:0 ap:0 ep:1 wo:f oos:1082368
	[==============>.....] sync'ed: 78.7% (1056/4948)M
	finish: 0:00:45 speed: 24,000 (46,872) K/sec

Is this speed correct?

cheers,

Robin


XenServer 6.5 9 years 2 weeks ago #398

// Deleted. Had been an RTFM Case

I did a fresh install with all patches, then started the script, used lower-case names for the servers, and everything worked smoothly!

Great work, guys! I suggest adding some checks on the input, such as rejecting empty input and doing basic syntax checks on IP addresses. You could also echo something like "DRBD mirroring is initializing. This may take a couple of hours."
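
To illustrate the kind of input checks meant here, a minimal sketch follows. It is written in Python purely as an illustration and is not the installer's code. The function names are hypothetical, and the address check relies on the standard library ipaddress module.

```python
import ipaddress


def validate_nonempty(value: str) -> str:
    """Reject empty or whitespace-only input and return the stripped value."""
    cleaned = value.strip()
    if not cleaned:
        raise ValueError("input must not be empty")
    return cleaned


def validate_ipv4(value: str) -> str:
    """Return the address in dotted form, or raise ValueError if it is malformed."""
    cleaned = validate_nonempty(value)
    # IPv4Address raises AddressValueError (a ValueError subclass) on bad syntax.
    return str(ipaddress.IPv4Address(cleaned))


if __name__ == "__main__":
    for candidate in ["10.10.10.1", "10.10.10.256", ""]:
        try:
            print(repr(candidate), "->", validate_ipv4(candidate))
        except ValueError as err:
            print(repr(candidate), "rejected:", err)
```

A prompt loop could simply re-ask until both checks pass.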

Otherwise excellent!

I get the following warning when executing "ha-cfg get-vm-ha":
ha-cfg get-vm-ha
Retreiving data.. please wait
Error: Key XenCenter.CustomFields.ha-lizard-enabled not found in map
Error: Key XenCenter.CustomFields.ha-lizard-enabled not found in map

cheers,

Robin


XenServer 6.5 9 years 1 week ago #399

  • Salvatore Costantino
Regarding the iSCSI target: by design, the iSCSI target will be OFF (TGT not running) on the slave host under normal operating conditions. TGT should only be running on the master. If you attempt to start it, it is normal to see an error about not being able to mount a read-only device. The reason is that the master iSCSI target has already mounted it in read/write mode. Also, even if you do manage to get it started on the slave, our SW will detect it and force-stop it to ensure that the slave never mounts the storage. The only exception is when operating in manual-mode. In this case the target can be freely moved between hosts to allow for maintenance operations.

I did some checking, and host names specified in drbd.conf are case sensitive, which explains your initial issue.
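
A quick way to spot this kind of mismatch is to compare the node name with the "on <name>" sections of drbd.conf using an exact, case-sensitive comparison. The sketch below is only an illustration: the helper names are hypothetical, the parsing is a simple regular expression rather than a real drbd.conf parser, and the sample text is a shortened version of the config posted earlier in this thread.

```python
import re
from typing import List, Optional

SAMPLE_CONF = """
resource iscsi1 {
    on host-xen-1 {
        device /dev/drbd1;
    }
    on host-xen-2 {
        device /dev/drbd1;
    }
}
"""


def drbd_host_sections(conf_text: str) -> List[str]:
    """Return the host names used in 'on <name> {' sections of a drbd.conf text."""
    return re.findall(r"^\s*on\s+(\S+)\s*\{", conf_text, flags=re.MULTILINE)


def find_case_mismatch(conf_text: str, node_name: str) -> Optional[str]:
    """Return a section name that matches node_name only when case is ignored."""
    names = drbd_host_sections(conf_text)
    if node_name in names:
        return None  # an exact, case-sensitive match exists
    for name in names:
        if name.lower() == node_name.lower():
            return name
    return None


if __name__ == "__main__":
    # On a real host, node_name would be the output of "uname -n".
    mismatch = find_case_mismatch(SAMPLE_CONF, "Host-Xen-1")
    if mismatch is not None:
        print("drbd.conf uses", mismatch, "but the node name differs in case")
```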

Regarding the speed of your sync: I am not sure whether the information in /proc/drbd is measured in bits or bytes, so from the data in your screenshot it is syncing at either 46 Mbps or 368 Mbps. The latter seems more likely.
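
The two readings can be worked out from the status line itself. This is a minimal sketch with hypothetical helper names. It assumes K means 1000 units and simply shows both interpretations (kilobits vs. kilobytes per second) without claiming which one DRBD uses. The figure used is the parenthesized value from the posted output.

```python
import re
from typing import Tuple


def parse_sync_speeds(status_text: str) -> Tuple[int, int]:
    """Extract the two 'speed:' figures (first, parenthesized) in K/sec."""
    match = re.search(r"speed:\s*([\d,]+)\s*\(([\d,]+)\)\s*K/sec", status_text)
    if match is None:
        raise ValueError("no 'speed: N (M) K/sec' field found")
    first, second = (int(group.replace(",", "")) for group in match.groups())
    return first, second


def k_per_sec_to_mbit(k_per_sec: int, unit_is_bytes: bool) -> float:
    """Convert K/sec to Mbit/s, treating K as kilobytes or kilobits."""
    bits_per_unit = 8 if unit_is_bytes else 1
    return k_per_sec * bits_per_unit / 1000


if __name__ == "__main__":
    line = "sync'ed: 78.7% (1056/4948)M finish: 0:00:45 speed: 24,000 (46,872) K/sec"
    _, parenthesized = parse_sync_speeds(line)
    print("if kilobits: ", k_per_sec_to_mbit(parenthesized, False), "Mbit/s")
    print("if kilobytes:", k_per_sec_to_mbit(parenthesized, True), "Mbit/s")
```

The small difference from the 368 Mbps figure above comes from rounding 46,872 down to 46 before multiplying by 8.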

On the issue with the VMs starting automatically after shutdown, this is HA-Lizard doing its job. It is a little different than XenServer HA and will not allow any VM to be in the off state while configured for HA. This is helpful also in the case of accidental shutdown. The default settings we ship basically treat ALL VMs in a pool as having HA on. If you prefer to control which VMs have HA enabled you will need to make a couple of small changes to the config.

First: "ha-cfg set global_vm_ha 0"
Then you will need to set HA to "true" for all the VMs that should have it:
"ha-cfg set-vm-ha <vm_name_here> true"

and "false" for the ones that don't:
"ha-cfg set-vm-ha <vm_name_here> false"

Per-VM HA can also be controlled via XenCenter. Details on how to create a custom field mapped to ha-lizard are in the ha-lizard manual.
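
If there are many VMs, the ha-cfg command lines quoted above can be generated from a simple mapping. This is only a sketch: the function name and the VM names are hypothetical, it only builds the command strings and does not run them, and whether ha-cfg accepts shell-quoted names containing spaces is an assumption.

```python
import shlex
from typing import Dict, List


def build_vm_ha_commands(vm_ha: Dict[str, bool]) -> List[str]:
    """Build the ha-cfg command lines for per-VM HA from a name -> enabled map."""
    # Turn off the global default first so the per-VM settings take effect.
    commands = ["ha-cfg set global_vm_ha 0"]
    for vm_name, enabled in vm_ha.items():
        state = "true" if enabled else "false"
        commands.append(f"ha-cfg set-vm-ha {shlex.quote(vm_name)} {state}")
    return commands


if __name__ == "__main__":
    # Hypothetical VM names, for illustration only.
    for command in build_vm_ha_commands({"web01": True, "test vm": False}):
        print(command)
```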

Lastly, your post from yesterday mentioned that XenServer reported the slave as being disconnected from the iSCSI SR. Has this been resolved?


XenServer 6.5 9 years 1 week ago #400

  • Salvatore Costantino
Thanks for the update!

The error you are seeing, "Error: Key XenCenter.CustomFields.ha-lizard-enabled", is harmless. It looks like we were redirecting stderr to the screen during development and forgot to redirect it back to the log file when preparing release 1.8.5. We will fix it in the next release.

In the meantime, if you set vm-ha for each machine to "true" or "false" the error will go away. You are seeing it only when the value is set to null for a VM.

Thanks for catching this.
