VMware Cloud Community
AntonioProietti
Contributor

stuck at vmw_vaaip_cx loaded successfully after upgrade to 4.1

Over the past few days I have been upgrading a pair of ESXi 4.0 servers (a cluster managed by vCenter) to 4.1.0 (build 260247).

The servers are two Dell PowerEdge R610s attached to a Dell AX4-5F FC SAN (a rebranded EMC array).

The upgrade of the first node went fine, and the upgrade of the second node seemed to go the same way.

After a reboot, though, the ESXi server hangs at the "vmw_vaaip_cx loaded successfully" message.

So I removed this node from the cluster and did a clean installation of ESXi 4.1. After the reboot the behavior was the same: the server hangs at the same message. Pressing ALT + F12 shows some warning messages about a LUN: Could not open device 'naa.600...

This looks like the problem described in this other community post: http://communities.vmware.com/thread/304397

If I unplug the fiber cables to the SAN from the server, it completes startup in a few seconds. If I then reconnect them and run a rescan, everything works fine.

I am puzzled because the servers are identical, and I cannot keep a server in production that I can't restart.

Any ideas? I would like to avoid reinstalling version 4.0, which worked well for months.

3 Replies
ransalman
Contributor

Do you have MSCS RDMs? If so, try changing Scsi.CRTimeoutDuringBoot to 1.

Read VMware KB 1016106 http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=101610...
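
For anyone who prefers the command line to the vSphere Client advanced settings dialog, the parameter change suggested above might be scripted roughly as below. This is only a sketch: it assumes the usual convention that an advanced option shown as Scsi.CRTimeoutDuringBoot is addressed by esxcfg-advcfg as /Scsi/CRTimeoutDuringBoot, and the helper names (advcfg_path, build_set_command, build_get_command) are made up for illustration. It only prints the commands; check them against KB 1016106 before running anything on a host.

```python
def advcfg_path(option_name):
    # Assumption: "Scsi.CRTimeoutDuringBoot" in the vSphere Client corresponds
    # to the esxcfg-advcfg path "/Scsi/CRTimeoutDuringBoot".
    return "/" + option_name.replace(".", "/")


def build_set_command(option_name, value):
    # esxcfg-advcfg -s <value> <path> sets an advanced option.
    return ["esxcfg-advcfg", "-s", str(value), advcfg_path(option_name)]


def build_get_command(option_name):
    # esxcfg-advcfg -g <path> reads the current value back.
    return ["esxcfg-advcfg", "-g", advcfg_path(option_name)]


if __name__ == "__main__":
    option = "Scsi.CRTimeoutDuringBoot"
    # Print the commands instead of executing them, so they can be reviewed
    # and then typed on the host console or in the remote CLI.
    print(" ".join(build_get_command(option)))
    print(" ".join(build_set_command(option, 1)))
    print(" ".join(build_get_command(option)))
```

Reading the value before and after the change is just a way to confirm that the setting actually took effect before the next reboot test.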

AntonioProietti
Contributor

Thank you very much for your reply.
Yes, the host with the problem is in fact running a guest that is part of a Windows 2003 Microsoft Cluster (MSCS), with many RDM LUNs. I had not read about the patch until your reply. I'm going to update my ESXi hosts to 4.1 Update 2. This update should include the patch mentioned in VMware KB 1016106, but before upgrading I'll try the suggested workaround to confirm that it is effective.
Thanks again for your feedback.

AntonioProietti
Contributor

OK, installing 4.1 Update 2, which also includes the specific patch, did not resolve the issue.

Changing the Scsi.CRTimeoutDuringBoot parameter to 1 has definitively solved the problem.

Despite what KB 1016106 states (that this issue is resolved by the VMware ESX/ESXi 4.1 patch released 2011-07-28, and that changing the advanced parameter is a workaround), this parameter change seems to be mandatory to solve the issue. Thanks again, Ransalman, for your useful hint.
