Hi,
I'm having the following issue:
We have a new installation with a DataCore SSV storage server in one room and an ESXi host in another.
Both are connected via D-Link DGS-3100 switches, with separate VLANs for iSCSI, management and VM traffic. The rooms are interconnected via two separate fibre-optic links (one for iSCSI and one for all other traffic).
All servers use HP hardware and thus Broadcom NICs (NC382i on the VMware side and NC375i on the storage side).
Right now I only use a single iSCSI connection and a single VM on the ESXi host, to exclude as many potential error sources as possible.
The VM has only one disk served from the storage (via a separate datastore containing only this one disk); its system disk is on a local datastore on the ESXi host.
Now I just start a reclaim (shrink disks) on the VM (Windows Server 2008 R2) via the VMware Tools, and after some time (anywhere from 1-2 minutes to about 10-15) the iSCSI connection breaks and reconnects. Once this starts, it often happens several more times, 1-2 minutes apart.
I tried to isolate the log of one of these interruptions -> see attachment.
I have already tried a lot of things, such as disabling RSS/ToE on the storage server, upgrading the bnx2 driver on the VMware host, and disabling an iSCSI multipathing configuration that used multiple subnets. So far nothing has helped. Now I want to isolate single components to find the cause.
While preparing to do this (I have another host in the room with the storage, which I can use to create an iSCSI target and test against the VMware host), I noticed that so far I can't reproduce the error with any other workload.
I tried using "sdelete" and did not see any interruptions. Same with Iometer, which I tried with a sequential write load (which should be pretty similar to what the VMware Tools do when shrinking). I also tried the HP CreateData tool and could not reproduce the error with it either.
So I'm a bit confused here.
The fact is that even during an iSCSI connection loss, no vmkpings to the storage are lost or delayed (avg 0.3 ms; max 0.6 ms).
Does the preparation for shrinking in the VMware Tools put some illegal workload on the software iSCSI stack (I highly doubt that)?
I have to make sure I don't have a general problem here before this goes into production.
DataCore support hasn't been of much help so far.
Does anyone know a good emulation of the VMware Tools shrink command?
Basically it just seems to create 2 GB files filled with zeros on the disk. But as the transfer rate fluctuates more than when I use Iometer or sdelete, it has to be doing something different.
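
As a rough starting point for such an emulation, here is a minimal Python sketch. It assumes the shrink preparation really does nothing more than write zero-filled 2 GB files until the disk is full, which is only what I observed from outside and not confirmed behaviour of the VMware Tools. The function names, file names, chunk size and the fsync after each file are my own choices for illustration, so it may well produce a different I/O pattern than the real shrink.

```python
import errno
import os


def write_zero_files(target_dir, file_size=2 * 1024**3, chunk_size=1024**2,
                     max_files=None):
    """Write zero-filled files into target_dir.

    Stops when max_files files exist or the disk reports "no space left".
    Returns the list of file paths that were created (the last one may be
    incomplete if the disk filled up).
    """
    chunk = bytes(chunk_size)  # a buffer of chunk_size zero bytes
    paths = []
    try:
        while max_files is None or len(paths) < max_files:
            # Hypothetical file naming, not what the VMware Tools use.
            path = os.path.join(target_dir, "zero_%04d.tmp" % len(paths))
            paths.append(path)
            with open(path, "wb") as f:
                remaining = file_size
                while remaining > 0:
                    n = min(chunk_size, remaining)
                    f.write(chunk[:n])
                    remaining -= n
                # Assumption: force the data out to the disk after each file.
                f.flush()
                os.fsync(f.fileno())
    except OSError as exc:
        # A full disk is the expected end condition; anything else is an error.
        if exc.errno != errno.ENOSPC:
            raise
    return paths


def remove_files(paths):
    """Delete the files created by write_zero_files."""
    for path in paths:
        if os.path.exists(path):
            os.remove(path)
```

Inside the VM this could be run as, for example, `paths = write_zero_files("E:\\", max_files=5)` followed by `remove_files(paths)`, where `E:\` is only a placeholder for the disk served from the storage. Leaving `max_files` at `None` fills the disk completely, so it should only be pointed at the test disk.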