VMware Cloud Community
str1ker
Contributor

iSCSI-Connection issues when zeroing (shrinking) VM-Disks

Hi,

I'm having the following issue:

We have a new installation with a DataCore SSV storage server in one room and an ESXi host in another.

Both are connected via D-Link DGS-3100 switches, with separate VLANs for iSCSI, management and VM traffic. The rooms are interconnected via two separate fibre-optic links (one for iSCSI and one for all other traffic).

All servers are HP hardware and thus use Broadcom NICs (NC382i on the VMware side and NC375i on the storage side).

Right now I only use a single iSCSI connection and a single VM on the ESXi host, to exclude as many potential sources of error as possible.

The VM has only one disk served from the storage (via a separate datastore containing only this one disk); its system disk is on a local datastore on the ESXi host.

Now I just start a reclaim (Shrink Disks) on the VM (Windows Server 2008 R2) via the VMware Tools, and after some time (varying from 1-2 minutes to about 10-15) the iSCSI connection breaks and reconnects. Once this starts, it often happens several more times, at intervals of 1-2 minutes.

I tried to isolate the log of one of these interruptions -> see attachment.

I have already tried a lot of things, such as disabling RSS/TOE on the storage server, upgrading the bnx2 driver on the VMware host, and disabling an iSCSI multipathing configuration via multiple subnets. So far nothing has helped. Now I want to isolate individual components to find the cause.

While preparing to do this (I have another host in the room with the storage, which I can use to create an iSCSI target and test that against the VMware host), I noticed that so far I can't reproduce the error with any other workload.

I tried using "sdelete" and did not see any interruptions. Same with Iometer, which I ran with a sequential write load (which should be pretty similar to what the VMware Tools do when shrinking). I also tried the HP CreateData tool, again without being able to trigger the error.

So I'm a bit confused here.

The fact is that even during an iSCSI connection loss, no vmkpings to the storage are lost or delayed (average 0.3 ms; maximum 0.6 ms).

Does the preparation for shrinking in the VMware Tools put some illegal workload on the software iSCSI stack (I highly doubt that)?

I have to make sure I don't have a general problem here before this goes into production.

DataCore-Support hasn't been of much help so far.

Does anyone know a good emulation of the VMware Tools shrink command?

Basically it just seems to create 2 GB files filled with zeros on the disk. But as the transfer rate fluctuates more than when I use Iometer or sdelete, it must be doing something different.
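To make clear what kind of emulation I mean, here is a minimal Python sketch. It assumes the shrink preparation really does nothing more than write zero-filled files of a fixed size until the volume is nearly full, which is only my observation from the outside and not confirmed behaviour of the VMware Tools. The names `zero_fill` and `cleanup`, the file name pattern and all default values are made up for illustration.

```python
import os
import shutil
import time


def zero_fill(directory, file_size=2 * 1024**3, chunk_size=1024 * 1024,
              max_files=None, reserve=0):
    """Write zero-filled files into `directory` until free space runs out.

    Assumption: this only mimics what the shrink preparation appears to do
    (fixed-size files full of zeros); it is not the actual Tools algorithm.
    Returns a list of (path, bytes_written, seconds) tuples.
    """
    chunk = bytes(chunk_size)  # a buffer of zero bytes
    created = []
    index = 0
    while max_files is None or index < max_files:
        free = shutil.disk_usage(directory).free
        if free - reserve < chunk_size:
            break  # volume is (nearly) full, stop here
        target = min(file_size, free - reserve)
        path = os.path.join(directory, f"zerofill_{index:04d}.tmp")
        written = 0
        start = time.monotonic()
        with open(path, "wb") as fh:
            while written < target:
                n = min(chunk_size, target - written)
                fh.write(chunk[:n])
                written += n
            fh.flush()
            os.fsync(fh.fileno())  # push the data out to the datastore
        created.append((path, written, time.monotonic() - start))
        index += 1
    return created


def cleanup(created):
    """Delete the files written by zero_fill()."""
    for path, _written, _seconds in created:
        os.remove(path)
```

Inside the VM this would be called as something like `zero_fill("E:\\")` for the iSCSI-backed disk (the drive letter is only an example), followed by `cleanup(...)` on the returned list. The per-file timings in the result should show whether the transfer rate fluctuates the way it does during a real shrink.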
