Hi there,
In preparation for our next planned SRM failover, we are testing the attach - mount - unmount - detach procedure.
We found that detaching SCSI devices from our hosts takes ~30 sec per device per host. In a cluster with 11 hosts and 40 devices to detach, this takes ~3.5 hours 😞
Is this behaviour expected?
We didn't find the root cause for the process taking so long. What we did see is that disabling PP/VE and VAAI speeds up the process, but only down to the times mentioned above. With PP/VE and VAAI enabled, the process takes up to 4 min per device per host.
We trigger the process via PowerCLI.
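For reference, our per-host detach via PowerCLI looks roughly like the sketch below. The cluster name and naa. identifier are hypothetical placeholders, and we assume the vSphere 5.x `DetachScsiLun` API on the host's storage system:

```powershell
# Hypothetical device ID and cluster name - substitute your own
$deviceNaa = "naa.60060160xxxxxxxxxxxxxxxxxxxxxxxx"

foreach ($vmhost in Get-Cluster "ProdCluster" | Get-VMHost) {
    # Resolve the device on this host and get its UUID
    $lun = Get-ScsiLun -VmHost $vmhost -CanonicalName $deviceNaa
    $storSys = Get-View $vmhost.ExtensionData.ConfigManager.StorageSystem
    # DetachScsiLun is a synchronous call; this is where the
    # ~30 sec per device per host is spent
    $storSys.DetachScsiLun($lun.ExtensionData.Uuid)
}
```

The calls run serially per host and per device, which is why the total time scales with hosts × devices.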
ESXi 5.0 U1, latest patches
EMC VNX 5700 FLARE 5.31.000.5.716
Thanks for any idea on how to speed up the process, or how to find the cause of it taking so long.
Best regards,
daniel
Anyone saw this behaviour too?
Hi Daniel, I ran into issues with VAAI once, and ATS was the problem. When you say you disabled VAAI, what exactly did you do?
Are those LUNs native VMFS 5, VMFS 3 upgraded to 5, or still VMFS 3?
Anything in the vmkernel logs?
Raphaël.
If this were iSCSI I could point you to a possible solution; unfortunately it's not.
With iSCSI storage there is a timeout setting that was exposed with ESXi 5.0 patch ESXi500-201112001. This setting lets you control the login timeout of iSCSI connections. What has been seen is that with a lot of volumes the ESXi host won't finish connecting to all of them in time, and you have to let it slow down and time out. In your case it would be the opposite.
This timeout can be changed via the CLI as follows: esxcli iscsi adapter param set -A <vmhbaX> -k LoginTimeout -v 60
This would set the timeout to 60 seconds for iSCSI connection timeouts.
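To check what the adapter is currently using before changing anything, you can dump its parameters (sketch; vmhba33 is a placeholder for your software iSCSI adapter name):

```
# List the current parameters, including LoginTimeout,
# for the given iSCSI adapter
esxcli iscsi adapter param get -A vmhba33
```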
More information about this command can be found here: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=200933...
Unfortunately that does not apply in your case, since you are using FC. But hopefully it helps others, or helps the techs as they look for possible solutions.
Sorry that your SRM is going slowly.
Cheers,
Tony
Followed the procedure in KB 1033665.
I assume there might be something in the logs, but I couldn't find it.
We're talking about freshly created VMFS5 - not upgraded from VMFS3.
Did you also disable the ATS-only setting when disabling VAAI?
I disabled all three primitives following KB 1033665
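For anyone following along: on ESXi 5.x, disabling the three VAAI primitives per that KB comes down to three advanced settings (sketch; set the value back to 1 to re-enable):

```
# Full copy (XCOPY)
esxcli system settings advanced set -o /DataMover/HardwareAcceleratedMove -i 0
# Block zeroing (WRITE SAME)
esxcli system settings advanced set -o /DataMover/HardwareAcceleratedInit -i 0
# Hardware-assisted locking (ATS)
esxcli system settings advanced set -o /VMFS3/HardwareAcceleratedLocking -i 0
```

Note this is separate from the per-volume ATS-only flag on native VMFS5 datastores.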
We are having the same problem. Did you happen to find a solution?
Open an SR with VMware Support.
A hotpatch is available, and the fix will be GA with vSphere 5.0 U2 at the end of the year.
The problem should already be fixed in vSphere 5.1.
Thanks Daniel, we have to remove about 20 LUNs from 60 servers, and it is taking about 3-7 minutes per detach. Is this the time you were experiencing, or longer?
Do you know whether the hotfix requires a reboot?
Hmm, we need to detach ~50 LUNs from 15 hosts.
We saw detach times ranging from 30 seconds up to 12 minutes in different configurations.
Do you have EMC PowerPath / VE in place?
Just curious has VMware support come back with anything?
I am running into a similar issue: we have 100 SRM LUNs that need to be detached, and it takes anywhere from 30 seconds to 10 minutes per detach. SRM cleanup eventually fails after 5 hours, and I see the same results when I attempt an SRM force cleanup.
One odd thing I observed: when you manually detach a LUN, it sits at "The device does not contain a diagnostic partition" for 2-5 minutes before the detach completes.
Currently the hotfix has been tested only in the VMware Support labs. We are finishing the configuration of our test environment and look forward to testing the hotfix on our side at the end of October.
Did you open an SR on this?
Main,
I installed the patch that was for the SRM SCSI issues, but detaches still take quite a bit of time. So no, it has not been resolved, and unfortunately mass detaches against a lot of hosts are still taking some time.