VMware Cloud Community
NaeemKhan1
Contributor

VRanger Slow Backup

Hi All,

We have a vRanger setup that backs up our on-prem VMware environment. We have 5 ESXi hosts with around 70 VMs.

I noticed the backups are extremely slow (11 MB/s throughput tops!). I would have thought it would be much quicker than this.

We have 2 Synology NAS boxes that have an iSCSI connection to the server that has vRanger installed. I've used CIFS to get them connected. The connection seems solid if I copy from the NAS to another location (90 MB/s average).

I contacted Quest and they did some tests with me (copying large files to the virtual appliance from the NAS boxes), and I only get 6 to 7 MB/s.

They've turned round and said it's something wrong with our VMware environment.

I would really appreciate it if anyone can shed some light on this.

Thanks in advance.
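
As a rough sanity check on the figures in this post, the sketch below converts the quoted rates into line rates. It assumes the figures are megabytes per second (the post does not say whether bytes or bits are meant), and the two link speeds are simply common Ethernet rates used for comparison, not values confirmed for this environment.

```python
# Assumption: the rates quoted in the post are megabytes per second.
OBSERVED_MB_PER_S = {
    "vRanger backup (peak)": 11,
    "NAS file copy (average)": 90,
    "copy to virtual appliance (upper figure)": 7,
}

# Common Ethernet link speeds in Mbit/s, for comparison only.
LINK_SPEEDS_MBIT = {"100 Mbit/s": 100, "1 Gbit/s": 1000}


def mbytes_to_mbits(mb_per_s: float) -> float:
    """Convert megabytes per second to megabits per second."""
    return mb_per_s * 8


def link_utilisation(mb_per_s: float, link_mbit: float) -> float:
    """Return the share of a link (in percent) that a given rate would use."""
    return 100 * mbytes_to_mbits(mb_per_s) / link_mbit


if __name__ == "__main__":
    for label, rate in OBSERVED_MB_PER_S.items():
        line = f"{label}: {rate} MB/s = {mbytes_to_mbits(rate):.0f} Mbit/s"
        for name, speed in LINK_SPEEDS_MBIT.items():
            line += f", {link_utilisation(rate, speed):.0f}% of {name}"
        print(line)
```

Under that assumption, a rate that sits just below the capacity of a 100 Mbit/s link may be worth comparing against the negotiated speed of every NIC and switch port in the backup path.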

a_p_
Leadership

Without more details one can only guess.
What may be worth checking is whether there's an MTU size mismatch on the backup network. Another frequent cause of speed issues is the "DelayedAck" configuration.

André

NaeemKhan1
Contributor

Thank you for getting back to me. 

What are the implications if I disable DelayedAck?

 

Many Thanks

a_p_
Leadership

This question cannot be answered that easily, because the setting depends on the storage vendor's recommendation.

Disabling DelayedAck for a storage system that supports it may put additional load on it.
Leaving it enabled for a storage system that does not support it usually results in high latency.

So please check with your storage vendors what they recommend. Note that if you have your ESXi hosts connected to multiple storage systems with different DelayedAck support, you can configure this setting per target (see e.g. https://kb.vmware.com/s/article/1002598).

André

NaeemKhan1
Contributor

Hi,

Thank you for the responses. I've checked DelayedAck and it is selected.

I have now contacted Dell regarding this. We have an SC Series SAN with all the datastores on it. I will wait for their response.

From what I've read, it's a case of putting each ESXi host into Maintenance Mode and then deselecting this option. I take it this can be done one host at a time, assuming Dell come back and advise that this option needs deselecting?

 

Many Thanks

a_p_
Leadership

In case you are interested, these are the settings that I use for Dell SC storage with the iSCSI software adapter:

Settings per host:

Conditions for round robin path changes, all datastores on host (section 6.9.1.1)
esxcli storage nmp satp rule add -s VMW_SATP_ALUA -V COMPELNT -P VMW_PSP_RR -o disable_action_OnRetryErrors -e "Dell EMC SC Series Claim Rule" -O "policy=iops;iops=3"
esxcli storage core claimrule load

Software iSCSI Queue Depth (section 4.2.2)
esxcli system module parameters set -m iscsi_vmk -p iscsivmk_LunQDepth=255

Software iSCSI login timeout (section 4.2.2)
esxcli iscsi adapter param set -A=vmhba## -k=LoginTimeout -v=5

Disable Delayed ACK (section 3)
esxcli iscsi adapter param set -A vmhba## -k DelayedAck -v false
Command from earlier versions of the document: vmkiscsi-tool -W -a delayed_ack=0 -j vmhba##

Settings per cluster:

HA Cluster Settings (section 4.3)
esxcli system settings kernel set -s terminateVMonPDL -v TRUE

HA Cluster Settings (section 4.3)
esxcli system settings advanced set -o "/Disk/AutoremoveOnPDL" -i 1

Advanced options HA Cluster setting (section 4.3)
das.maskCleanShutdownEnabled = True

Reference: https://downloads.dell.com/manuals/common/sc-series-vmware-vsphere-best-practices_en-us.pdf

André

NaeemKhan1
Contributor

Hi Andre,

 

Thank you for your response. This is very useful information. Dell eventually got back to me and advised that disabling DelayedAck is part of their best practice. I am in the process of getting a consultant to assist with this. I will mark this as the best answer.

 

Thank you once again.

 

Naeem

NaeemKhan1
Contributor

Hi André,

I have managed to apply these settings to all hosts, and confirmed that Round Robin is active and that delayed ACK is disabled.

I also upgraded vRanger to the latest version, 7.8.3.

I attempted a full backup this weekend and it's still showing the same speed. I've checked all the NICs and they seem to be connected correctly.

I'm really puzzled as to why this isn't speeding up.

IRIX201110141
Champion

We have been a Dell shop for ages and are a former Vizioncore partner (so plenty of experience with vRanger over a very long period of time). In the past 13 years I have never seen a storage vendor that doesn't advise disabling DelayedAck.

If you disable it, just know that it will not harm your environment, because most likely it only affects new LUNs/devices, even if you perform the necessary ESXi reboot. The reason is that the device-specific options are stored in the iSCSI database of your ESXi host and are probably not updated. So check the settings after the reboot.

Can you post your vRanger log, or verify which backup mode you are currently using? I'm talking about (SSL)NBD, HotAdd or SAN mode.

Regards,
Joerg

NaeemKhan1
Contributor

Thanks Joerg. 

I can confirm I disabled DelayedAck as advised by Dell, rebooted the servers, and then checked via SSH; all the iSCSI connections were showing 0.

Below is a log of one of our VMs being backed up. This is a full backup which took over 7 hours.

[2021-10-01 23:30:28.890]: vRanger Backup & Replication - v7.8.3.0

[2021-10-01 23:30:28.890]: Selected Options: Backup powered on machines only. | Check destination for free space. | Update notes with the latest backup results. | Enable Active Block Mapping™ (ABM). | Enable Change Block Tracking (CBT). | Selected SpaceSavingTech: Incremental.

[2021-10-01 23:30:28.906]: Task for virtual machine sccm-srv was queued.

[2021-10-02 00:54:52.477]: SourceVm:sccm-srv | Uuid:421ffd2e-4a92-ea97-0d31-567789843285 | VC:vcenter-srv.domain.local, Host:esxi2.domain.local [ESXi 6.0.0]

[2021-10-02 00:54:52.477]: Beginning backup task for Backup Plan 2 NAS C-sccm-srv

[2021-10-02 00:54:52.493]: Starting task validation.

[2021-10-02 00:54:52.493]: Connection to vcenter-srv.domain.local was properly validated.

[2021-10-02 00:54:52.493]: esxi2.domain.local is properly licensed.

[2021-10-02 00:54:52.493]: Validating virtual appliance VA2-vRangerVA completed.

[2021-10-02 00:54:52.509]: Test connection to repository C-Vranger-Backup-Volume starting...

[2021-10-02 00:54:53.305]: Test connection to repository C-Vranger-Backup-Volume successful!

[2021-10-02 00:54:53.305]: Ending task validation... success!

[2021-10-02 00:54:53.305]: Beginning initialization of backup information.

[2021-10-02 00:54:53.352]: Retrieving the tasks parent information.

[2021-10-02 00:54:53.384]: Retrieving save points for any full backups associated with this job.

[2021-10-02 00:54:53.540]: Full backup days policy is enforced.

[2021-10-02 00:54:53.540]: Finished initialization of backup information successfully.

[2021-10-02 00:54:53.540]: Initialization was sucessful. Backup type to run: Full

[2021-10-02 00:54:57.587]: Retrieving the VM BIOS configuration completed.

[2021-10-02 00:54:59.524]: Checking free space on C-Vranger-Backup-Volume completed.

[2021-10-02 00:55:05.727]: Creating a snapshot for vRanger completed.

[2021-10-02 00:55:16.133]: Loading virtual machine 'sccm-srv' information completed.

[2021-10-02 00:55:19.305]: Local machine is a VMware virtual machine.

[2021-10-02 00:55:19.383]: Backup task will attempt to use VA-based VDDK HotAdd.

[2021-10-02 00:55:37.914]: Using filter type(s) change block and active map for disk: vix:1:r:vcenter-srv.domain.local\:443:0:[VM-SAN-EMC-18] sccm-srv/sccm-srv.vmdk

[2021-10-02 08:29:43.350]: Backing up disk 'vix:1:r:vcenter-srv.domain.local\:443:0:[VM-SAN-EMC-18] sccm-srv/sccm-srv.vmdk:moref=vm-73142:snapshot-85256:4' completed.

[2021-10-02 08:31:51.926]: Removing snapshot for vRanger completed.

[2021-10-02 08:31:53.551]: Committing savepoint data to C-Vranger-Backup-Volume completed.

[2021-10-02 08:31:58.598]: Verifying the content of the repository completed.

[2021-10-02 08:31:58.614]: Checking and enforcing retention policy completed.

[2021-10-02 08:31:59.411]: Updating notes for sccm-srv completed.

[2021-10-02 08:31:59.411]: Updating VM notes completed.

[2021-10-02 08:31:59.504]: Setting VM event completed.

Thanks again for looking at this. 

It is much appreciated.  
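
To see where the time goes in a log like the one above, the elapsed time between entries can be computed from the timestamps. The sketch below uses four timestamps copied from that log; treating the "Using filter type(s)" entry as the start of the disk transfer is an assumption, since the log does not mark the first read explicitly.

```python
from datetime import datetime

# Timestamp layout used by the vRanger log entries above.
FMT = "%Y-%m-%d %H:%M:%S.%f"


def elapsed_seconds(start: str, end: str) -> float:
    """Return the number of seconds between two log timestamps."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds()


def as_hms(seconds: float) -> str:
    """Format a duration in seconds as H:MM:SS (fractions dropped)."""
    whole = int(seconds)
    return f"{whole // 3600}:{(whole % 3600) // 60:02d}:{whole % 60:02d}"


# Timestamps copied from the log above.
QUEUED = "2021-10-01 23:30:28.906"      # task queued
STARTED = "2021-10-02 00:54:52.477"     # backup task begins
DISK_START = "2021-10-02 00:55:37.914"  # assumed start of the disk transfer
DISK_END = "2021-10-02 08:29:43.350"    # disk backup completed

if __name__ == "__main__":
    print("waiting in queue:", as_hms(elapsed_seconds(QUEUED, STARTED)))
    print("disk transfer:   ", as_hms(elapsed_seconds(DISK_START, DISK_END)))
```

Nearly all of the run is spent in the single disk transfer step, so snapshot creation, validation and retention handling do not account for the slow backup.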

swaheed1239
Enthusiast

Hello!

There are a lot of questions/factors that need to be looked into for this kind of issue:

- Have you configured MTU 9000 on all the components between your NAS storage and your ESXi hosts, such as the physical switch and the virtual standard/distributed switch? The right way to do it is to make sure every hop between your NAS storage and your ESXi hosts is configured with MTU 9000, i.e. jumbo frames.

- Have you updated the drivers/firmware of the iSCSI adapter?

- I'd suggest using the iSCSI initiator to mount the iSCSI LUN directly on your backup server and using that as the target for your backups.

- You can also check and tweak queue depths from inside the guest OS.

- Which transport mode are you using for backups: NBD, HotAdd or SAN? If NBD, what is your network speed, 1G or 10G?

Please give us more information so we can analyze the issue efficiently.

 

Thanks 🙂
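
A common way to verify the jumbo-frame point in the first bullet above is a ping with the don't-fragment flag set and a payload sized to fill the MTU exactly; on ESXi this is typically done with `vmkping -d -s <size> <target>`. The sketch below only works out the payload size and assumes IPv4 without header options; the function name is illustrative.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo header


def max_ping_payload(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


if __name__ == "__main__":
    for mtu in (1500, 9000):
        print(f"MTU {mtu}: largest unfragmented ping payload is {max_ping_payload(mtu)} bytes")
```

If a ping with the 9000-byte figure fails while the 1500-byte figure succeeds, at least one hop in the path is probably not passing jumbo frames.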

NaeemKhan1
Contributor

Hi swaheed,  

 

Thank you for your suggestions. 

 

I checked our network and jumbo frames are active throughout our VMware environment, meaning the configuration between our datastores and our NAS has the MTU set to 9000. The vRanger upgrade has made a slight improvement; however, I would have expected far more.

We have a 1G connection.

We are using virtual appliances with the HotAdd method to back up our VMs.

Is there anything I can do in terms of diagnosis to test the speed between the NAS, virtual appliances and hosts?

Any suggestions would be highly appreciated.

I'm not clued up on the whole VMware environment and am still learning. 😀

Thanks 

Naeem
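
One simple way to test a single leg of the path, in the spirit of the file-copy tests mentioned earlier in the thread, is to time a large sequential write to the backup target. The sketch below is only an illustration: the function name is hypothetical, and the target directory defaults to the local temp folder, so it would need to be pointed at wherever the repository share is mounted.

```python
import os
import tempfile
import time


def measure_write_throughput(target_dir: str, size_mb: int = 64, chunk_mb: int = 4) -> float:
    """Write about size_mb of random data into target_dir and return MB/s."""
    chunk = os.urandom(chunk_mb * 1024 * 1024)  # random data, so it does not compress
    path = os.path.join(target_dir, "throughput_test.bin")
    written_mb = 0
    start = time.perf_counter()
    try:
        with open(path, "wb") as handle:
            while written_mb < size_mb:
                handle.write(chunk)
                written_mb += chunk_mb
            handle.flush()
            os.fsync(handle.fileno())  # push the data out of the OS cache before stopping the clock
        elapsed = time.perf_counter() - start
    finally:
        if os.path.exists(path):
            os.remove(path)  # clean up the test file
    return written_mb / elapsed


if __name__ == "__main__":
    # Assumption: replace this with the path where the backup repository is mounted.
    target = tempfile.gettempdir()
    print(f"{measure_write_throughput(target):.1f} MB/s written to {target}")
```

Running the same test from the vRanger server and from another machine on the same network segment may help show whether the slowdown follows the NAS, the network or the virtual appliance.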

swaheed1239
Enthusiast

Hello,

What kind of network adapter is the virtual backup appliance using? Is it VMXNET3 or E1000? Make sure it is VMXNET3, as that is the enhanced vNIC recommended by VMware.

You mentioned in your post that you are using CIFS. How are you using iSCSI and CIFS at the same time, given that iSCSI is a block storage protocol and CIFS is a file storage protocol?

Thanks!

swaheed1239
Enthusiast

Furthermore, while your backup job is running you can use the esxtop utility to check throughput/transfer rates and look for packet drops or latency: on the iSCSI adapter if you are using iSCSI, and on the vNIC of the backup appliance if you are using CIFS.

Check the esxtop documentation on VMware Docs and you will be able to find your way around it for analysis/diagnosis.

Thanks!
