VMware Cloud Community
pfuhli
Enthusiast

detach SCSI LUN slow

Hi there,

In preparation for our next planned SRM failover, we are testing the attach, mount, unmount, detach procedure.

We found that detaching SCSI devices from our hosts takes ~30 sec per device per host. In a cluster with 11 hosts and 40 devices to be detached, this will take ~3.5 hours 😞

Is this behaviour expected?

We haven't found the root cause of why the process takes so long. What we did see is that disabling PP/VE and VAAI speeds up the process, but only to the times I mentioned above. With PP/VE and VAAI enabled, a detach takes up to 4 min per device per host.

We trigger the process via PowerCLI.
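
As a rough sanity check on those figures, the total is simply hosts × devices × time per detach. The minimal Python sketch below assumes every detach runs strictly one after another (an assumption; the post does not say whether hosts are processed in parallel), and the helper name is hypothetical.

```python
def total_detach_hours(hosts: int, devices: int, seconds_per_detach: float) -> float:
    """Wall-clock hours if every detach on every host runs sequentially."""
    return hosts * devices * seconds_per_detach / 3600


# Figures from the post: 11 hosts, 40 devices.
fast = total_detach_hours(11, 40, 30)    # PP/VE and VAAI disabled, ~30 s each
slow = total_detach_hours(11, 40, 240)   # PP/VE and VAAI enabled, up to 4 min each

print(f"{fast:.1f} h")   # 3.7 h
print(f"{slow:.1f} h")   # 29.3 h
```

So the ~3.5 hours quoted above is in the right range, and with PP/VE and VAAI enabled the same run could take more than a day in the worst case.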

ESXi 5.0 U1, latest patches

EMC VNX 5700 FLARE 5.31.000.5.716

Thanks for any ideas on how to speed up the process or how to find the cause of it taking so long.

Best regards,

daniel

0 Kudos
14 Replies
pfuhli
Enthusiast

Has anyone else seen this behaviour?

0 Kudos
RS_1
Enthusiast

Hi Daniel, I had some issues with VAAI once, and ATS was the problem. When you say you disabled VAAI, what exactly did you do?

Are those LUNs native VMFS 5, VMFS 3 upgraded to 5, or still VMFS 3?

Anything in the vmkernel logs?

Raphaël.

0 Kudos
wondernerd
Enthusiast

If this were iSCSI I could point you to a possible solution; unfortunately it's not.

With iSCSI storage there is a timeout setting that was exposed with the ESXi 5.0 patch ESXi500-201112001. This setting lets you control the login timeout of the iSCSI connections. What has been seen is that with a lot of volumes, ESXi won't finish connecting to them in time, so you need to give it more time before it times out. In your case it would be the opposite.

This timeout can be changed from the CLI as follows: esxcli iscsi adapter param set -A <vmhbaX> -k LoginTimeout -v 60

This sets the iSCSI login timeout to 60 seconds.

More information about this command can be found here: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=200933...

Unfortunately that does not apply in your case since you are using FC. But hopefully it helps others, or helps the techs as they look for possible solutions.

Sorry that your SRM is going slowly.

Cheers,

Tony

Standard junk here... This post is offered as is where is and may not apply to your environment. Views are mine & not my employer. http://www.wondernerd.net
0 Kudos
pfuhli
Enthusiast

I followed the procedure in KB 1033665.

I assume there might be something in the logs, but I couldn't find it.

0 Kudos
pfuhli
Enthusiast

We are talking about freshly created VMFS5 volumes, not ones upgraded from VMFS3.

0 Kudos
RS_1
Enthusiast

Did you disable the ATS-only setting when disabling VAAI?

0 Kudos
pfuhli
Enthusiast

I disabled all three primitives following KB 1033665

0 Kudos
markdjones82
Expert

We are having the same problem. Did you happen to find a solution?

http://www.twitter.com/markdjones82 | http://nutzandbolts.wordpress.com
0 Kudos
pfuhli
Enthusiast

Open an SR with VMware Support.

A hotpatch is available, and the fix will be GA with vSphere 5.0 U2 at the end of the year.

The problem should already be fixed in vSphere 5.1.

0 Kudos
markdjones82
Expert

Thanks Daniel. We have to remove about 20 LUNs from 60 servers, and it is taking about 3-7 minutes per detach. Is this the time you were experiencing, or was it longer?

Do you know whether the hotfix requires a reboot?

0 Kudos
pfuhli
Enthusiast

Hmm, we need to detach ~50 LUNs from 15 hosts.

We saw detach times ranging from 30 seconds to 12 minutes, depending on the configuration.

Do you have EMC PowerPath/VE in place?

0 Kudos
mainMN
Contributor

Just curious, has VMware support come back with anything?

I am running into a similar issue: we have 100 SRM LUNs that need to be detached, and each detach takes anywhere from 30 seconds to 10 minutes to complete. SRM cleanup eventually fails after 5 hours, and I see the same results when I attempt an SRM force cleanup.

One odd thing I observed is that when you manually detach a LUN, it sits at "The device does not contain a diagnostic partition" for 2-5 minutes before it completes the detach.

0 Kudos
uklvirtual
Contributor

So far the hotfix has been tested only in the VMware Support labs. We are finishing the configuration of our test environment and look forward to testing the hotfix on our side at the end of October.

Did you open an SR on this?

0 Kudos
markdjones82
Expert

Main,

I installed the patch for the SRM SCSI issues, but detaches still take quite a bit of time. So no, it has not been resolved, and unfortunately mass detaches against a lot of hosts still take a long time.

0 Kudos