9 Replies Last post: Oct 20, 2008 8:03 AM by BUGCHK

ESX 3.0.2 host hung after re-scan LUNs

Oct 15, 2008 11:48 AM

Beijinger (Novice, 15 posts since Sep 11, 2008)

Hi there,

We have five ESX 3.0.2 hosts. One day I rescanned all HBAs on one of them because I wanted to add two new LUNs to the datastore, but the host hung. After a reboot, the host looks fine again. How can I find the root cause of the hang? I read the ESX server log but could not find any clues. Are there any other log files or core dump files, and how do I get them? Thanks.

Re: ESX 3.0.2 host hung after re-scan LUNs Oct 15, 2008 11:50 AM
ZippyDaMCT (Master, 1,372 posts since Jun 3, 2005)
I was told this is a bug. When you select Rescan, uncheck one of the boxes and run it, then rescan using the other option. This was addressed in a patch, but the patch number escapes me.
Re: ESX 3.0.2 host hung after re-scan LUNs Oct 15, 2008 11:56 AM
in response to: ZippyDaMCT
Beijinger (Novice, 15 posts since Sep 11, 2008)
Could you tell me the patch number or patch description? Thanks.
Re: ESX 3.0.2 host hung after re-scan LUNs Oct 15, 2008 12:05 PM
Cl3gh0rn (Enthusiast, 54 posts since Jan 6, 2007)

Hi, I haven't checked, but the VI Client logs on the host you rescanned might give you something: "/var/log/vmware/hostd.*"

I have taken to scanning each HBA individually, as I have found this to be a common problem, or at least one that results in timeouts when rescanning all HBAs.

I have had this before when the VMFS3 volume was corrupted, but you will usually see that on the COS after startup has been successful.
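
For anyone following up on the log suggestion, here is a minimal sketch of one way to sift those files, assuming the hostd logs have been copied to a machine with Python 3. Only the "/var/log/vmware/hostd.*" pattern comes from the post; the function name `find_log_lines` and the default keywords are illustrative guesses, not anything VMware documents, and compressed rotated logs would need unpacking first.

```python
import glob
import os


def find_log_lines(pattern, keywords=("rescan", "timeout")):
    """Return (path, line_number, text) for lines containing any keyword.

    `pattern` is a glob such as "/var/log/vmware/hostd.*" (from the post).
    The keywords are assumptions; adjust them to what the logs really say.
    """
    lowered = [k.lower() for k in keywords]
    hits = []
    for path in sorted(glob.glob(pattern)):
        if not os.path.isfile(path):
            continue
        # Decode leniently, since log files can contain stray bytes.
        with open(path, "r", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                text = line.rstrip("\n")
                if any(k in text.lower() for k in lowered):
                    hits.append((path, number, text))
    return hits


if __name__ == "__main__":
    for path, number, text in find_log_lines("/var/log/vmware/hostd.*"):
        print(f"{path}:{number}: {text}")
```

Narrowing the hits to the minutes just before the hang is left to the reader, since the timestamp format in these logs is not shown in this thread.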

Re: ESX 3.0.2 host hung after re-scan LUNs Oct 15, 2008 12:22 PM
in response to: Cl3gh0rn
ZippyDaMCT (Master, 1,372 posts since Jun 3, 2005)
I can't remember. I will ask our resident brain tomorrow, as I know he had this loads of times. Poor old bold g*t.
Re: ESX 3.0.2 host hung after re-scan LUNs Oct 15, 2008 1:13 PM
in response to: ZippyDaMCT
Cl3gh0rn (Enthusiast, 54 posts since Jan 6, 2007)
Was that bold or bald? After having this issue, maybe he has pulled all his hair out! :0
Re: ESX 3.0.2 host hung after re-scan LUNs Oct 15, 2008 1:57 PM
in response to: Cl3gh0rn
jftwp (Hot Shot, 240 posts since Oct 27, 2005)

Scan the hosts by right-clicking each HBA (select Rescan) in each host. I do, even after applying a patch that seems to have fixed the issue. I just find it to be a 'sure thing' when I scan each host's HBAs directly. Sometimes I have to do that TWICE per HBA. You can also resort to command-line scanning, but many folks prefer the GUI, and of course with VC you don't have to connect to each and every ESX host, as you would with your favorite SSH client for the command-line stuff.

Anyway, in the end, I prefer the right-click-each-HBA and scan approach. It works. I'm going to 3.5 soon and will be scanning galore(!) in my dev environment to see if it's 'safe' to go back to the Rescan dialog box (with those checkboxes). Until then, I'm set in my safe ways.
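
For those who do prefer the command line, the one-HBA-at-a-time approach can be sketched as a small command generator. This is only an illustration under stated assumptions: `esxcfg-rescan` is the ESX 3.x service console rescan command, but the adapter names `vmhba1` and `vmhba2`, the function name `build_rescan_commands`, and the default of two passes per HBA are hypothetical choices for the example. The script only builds the command lines (for pasting into an SSH session on the host); it does not run anything.

```python
import re


def build_rescan_commands(hbas, passes=2):
    """Return esxcfg-rescan command lines, one HBA at a time.

    Each HBA is rescanned `passes` times before moving on to the next,
    mirroring the "scan each HBA, sometimes twice" habit described above.
    """
    if passes < 1:
        raise ValueError("passes must be at least 1")
    commands = []
    for hba in hbas:
        # Accept only names like vmhba0, vmhba1, ... to avoid typos.
        if not re.fullmatch(r"vmhba\d+", hba):
            raise ValueError(f"unexpected adapter name: {hba!r}")
        for _ in range(passes):
            commands.append(f"esxcfg-rescan {hba}")
    return commands


if __name__ == "__main__":
    # Hypothetical adapter names; check the real ones on your host.
    for command in build_rescan_commands(["vmhba1", "vmhba2"]):
        print(command)
```

Run as is, this prints `esxcfg-rescan vmhba1` twice and then `esxcfg-rescan vmhba2` twice.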

Re: ESX 3.0.2 host hung after re-scan LUNs Oct 17, 2008 1:21 AM
in response to: jftwp
Ph.Seifert (Enthusiast, 31 posts since Sep 7, 2007)
Can I do the (re)scan of the HBA by putting the host in maintenance mode and then rebooting it, so that the new LUN is found after the reboot? Then I would migrate the VMs (on the existing LUN, not the new one) to the freshly rebooted host and do the same with the other host. After that, can all hosts use the new LUN? Is this a possible solution?
Re: ESX 3.0.2 host hung after re-scan LUNs Oct 17, 2008 3:31 PM
in response to: Ph.Seifert
jftwp (Hot Shot, 240 posts since Oct 27, 2005)

That seems like an awful lot of work(around) just to get your HBAs rescanned. I have some suggested reading for you. You are NOT alone in your problem, but there are solutions in this thread:

http://communities.vmware.com/thread/67309

In particular (and because I don't want you to have to wade through the NUMEROUS pages of that thread, where a bunch of us posted our experiences/woes/frustrations with this issue), you'll be interested in this: http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&externalId=10229

In the end, I do believe the patch made it into 3.0.2 but I had put the patch onto my 3.0.1 hosts (and subsequently upgraded to 3.0.2) and have not had the problem since.

But--ha--out of sheer paranoia I STILL rescan my HBAs one at a time (right-click the HBA, select 'Rescan'). Half the time, I have to do this a SECOND time for the new LUN(s) to be recognized.

Re: ESX 3.0.2 host hung after re-scan LUNs Oct 20, 2008 8:03 AM
in response to: jftwp
BUGCHK (Master, 953 posts since Nov 7, 2005)

"Sometimes I have to do that TWICE per HBA"

That is required for QLogic adapters, because the rescan is an asynchronous operation with this driver.

I always unselect the scanning for VMFS volumes and have never seen a hang that way. Funny thing is - it finds and mounts VMFS volumes anyway...
