VMware Cloud Community
JShroll
Contributor

Rescan for Datastores causes host management failure

All -


I had a very unusual experience just recently.

I was in the final stages of bringing a new EMC VNX Series Fibre Channel SAN into a 5-host vSphere cluster. All the production virtual machines had been moved over to the new storage LUNs I had created, and I had begun reconfiguring new RAID vDisks on the older HP P2000 iSCSI SAN.


Shortly after deleting the older datastores from vCenter, I carved out a new vDisk in the HP SAN management page and then ran Rescan for Datastores... in vCenter.

The HBA rescan then took an abnormally long time: the Recent Tasks pane showed the task as 'In Progress' for all 5 hosts for at least 45 minutes. Then each host showed a warning:

Virtual machine creation may fail because agent is unable to retrieve VM creation options from the host

I looked up this warning and went to restart the mgmt-vmware service on each host from the console, but nothing happened after I logged in: each host just froze at login. I couldn't SSH into the hosts either, because SSH was disabled and I couldn't re-enable it. Any vCenter task would just hang.

Connecting to the hosts directly was possible, but they were just as unresponsive to other tasks, such as shutting down VMs.

In the end, the production virtual machines were still running, but we had to shut down the VMs on each host one by one and then cold-restart each host. Three hours later the hosts seem normal again, but I don't have a clue what happened.


When I go to Add Storage... on a host, I can now see the vDisk I was originally scanning for, but I'm hesitant to add it because I don't want the rescan hang to happen again once the new storage is added.

Any speculation?

a_p_
Leadership

Which vCenter Server/ESX(i) version do you use? Did you unmount the iSCSI LUNs before you unpresented/deleted them on the storage system?

André

PS: Discussion moved from VMware vSphere Hypervisor to VMware vSphere™ Storage

JShroll
Contributor

They are using vCenter 5.0.0 with ESXi 5.0.0 (build 721882) hosts.

Yes, after removing the last of the virtual machines from the datastore on the old HP SAN, I performed either an Unmount or a Delete from vCenter. Now I can't remember which. Would it matter? Either way, it was gone from the active datastore list.

JShroll
Contributor

After looking further at this KB article:

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=200460...

I didn't perform the Detach step after unmounting the VMFS datastore. I don't think I've ever done this, simply because I didn't know I had to. Just deleting the VMFS datastore has always worked for me.

chriswahl
Virtuoso

You've hit one of the more annoying parts of working with vSphere storage. The process to properly remove a datastore from ESXi is much less complicated in version 5 than it was in 4, but you still need to follow the entire procedure before issuing a rescan. If you looked through the vmkernel.log file, you would probably see a lot of errors and warnings from the host trying to scan the path to a "dead" LUN after you reconfigured your old storage array.

Also, in vSphere 5, detaching a LUN goes through a GUI "wizard" that checks whether the LUN is completely ready to be removed. If all the checks come up green, it allows the action to proceed.
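
For anyone who prefers the command line, the removal order described above can be sketched as a small Python helper. It only builds the host-side esxcli commands and does not run them. The function name, datastore label and device identifier are placeholders made up for illustration, and the array-side step is left as a comment because it depends on the storage system. Treat it as a rough outline under those assumptions, not an official procedure.

```python
import shlex


def removal_plan(datastore_label, device_id):
    """Return the host-side commands, in order, for retiring one LUN.

    datastore_label and device_id are placeholders supplied by the caller.
    """
    return [
        # 1. Unmount the VMFS datastore by its label.
        ["esxcli", "storage", "filesystem", "unmount", "-l", datastore_label],
        # 2. Detach the backing device so the host stops using its paths.
        ["esxcli", "storage", "core", "device", "set", "--state=off", "-d", device_id],
        # 3. (On the array, outside this sketch) unpresent or delete the LUN.
        # 4. Only now rescan the storage adapters.
        ["esxcli", "storage", "core", "adapter", "rescan", "--all"],
    ]


if __name__ == "__main__":
    # Hypothetical label and device ID, for illustration only.
    for command in removal_plan("old_p2000_datastore", "naa.EXAMPLE"):
        print(shlex.join(command))
```

Printing the commands rather than executing them keeps the order easy to review before anything touches a host.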

VCDX #104 (DCV, NV) ஃ WahlNetwork.com ஃ @ChrisWahl ஃ Author, Networking for VMware Administrators
JShroll
Contributor

Thanks Chris,

I'll certainly be more careful...

Now that the hosts have been restarted, I think it's safe to rescan the available LUNs again. I have checked the available device paths and I only see the LUNs that I'm supposed to. Agree?

chriswahl
Virtuoso

Click on a Host > Configuration > Storage Adapters > click the HBA. As long as you don't see any greyed-out, italicized devices on your HBA, you're fine. Everything listed should be in a "mounted" state or not visible at all. You can also right-click the HBA and rescan just that adapter instead of issuing a Rescan All command, should you want to limit the rescan to one host (to regain comfort with the process).
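
The same check can be sketched in Python for a hand-collected device list. The record layout (a "name" and a "state" per device), the state strings and the adapter name are assumptions made up for this example and are not output from any VMware tool. The idea is simply to refuse a rescan while anything is listed in a state other than "mounted", and otherwise to build a rescan command for a single adapter.

```python
def stale_devices(devices):
    """Return names of devices whose state is anything other than 'mounted'."""
    return [d["name"] for d in devices if d.get("state") != "mounted"]


def rescan_command(adapter, devices):
    """Build a single-adapter rescan command, or refuse if stale devices remain."""
    stale = stale_devices(devices)
    if stale:
        raise RuntimeError("clean up before rescanning: " + ", ".join(stale))
    # Rescan only the named adapter rather than every adapter on the host.
    return ["esxcli", "storage", "core", "adapter", "rescan", "--adapter", adapter]


if __name__ == "__main__":
    # Hypothetical device list and adapter name, for illustration only.
    seen = [{"name": "naa.EXAMPLE1", "state": "mounted"}]
    print(" ".join(rescan_command("vmhba33", seen)))
```

Starting with one adapter on one host keeps the blast radius small if something is still wrong.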

JShroll
Contributor

That's a good idea, I'll do that....

Thanks Chris!
