dekib
Contributor

Recurring problems after upgrading to vSphere 5.5

I have two HP servers, each of which was running vSphere 5.0.

To upgrade them I did the following:

1. I installed vCenter Server on a physical machine.

2. Moved all the VMs from the first ESX to the second ESX

3. Installed vSphere 5.5 on first server

4. Moved all the VMs to first ESX

5. Installed vSphere 5.5 on the second server

6. Put back VMs on second ESX

For 2 days all was working well!

Then suddenly all the VMs became inaccessible on server 2. vCenter showed a message that there was a problem accessing one of the LUNs, and the VMs located on that LUN were marked as inaccessible, but in fact I can't reach any of the VMs on that host.
The console won't open even for the VMs that are supposedly not affected. The only way to solve this is a reboot. After the reboot, the problematic LUN is not listed in the storage view of my host configuration, and it takes many tries to add it back.

And then the next day the same thing happened to server 1.

After the reboot, and after I re-added the missing storage, I thought that all was well and that was the end of the story.

It happened again and again... one day server 1, next day server 2 and so forth (3 times so far)

I don't have any vMotion or other advanced configuration applied.

I don't even know where to start troubleshooting this issue (Log files, where are they? never used the command prompt on vSphere)... today I can't even add the missing Disk/LUN. I go through all the steps as before, I select the storage the system finds but it does not add it!

Any ideas?

Deki

3 Replies
NealeC
Hot Shot

Hi Dekib,

You're right to go for the logs first.

To find where the logs are on 5.5, the following KB should help:

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=203207...

You obviously want to be looking around the time you see the disconnects.

How are you connecting to your LUNs? Fibre channel? iSCSI? NFS?

/var/log/esxupdate.log: ESXi patch and update installation logs.

This log might show any issues that occurred during your upgrade (if you did an upgrade rather than fresh install of 5.5)

The next two key logs for diagnosis would be

  • /var/log/vmkernel.log: Core VMkernel logs, including device discovery, storage and networking device and driver events, and virtual machine startup.

  • /var/log/vmkwarning.log: A summary of Warning and Alert log messages excerpted from the VMkernel logs.

These will hopefully show you errors when the LUN "drops off".
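As a rough illustration of "looking around the time you see the disconnects", here is a minimal Python sketch that keeps only the log lines within a few minutes of an incident. It assumes you have copied vmkernel.log or vmkwarning.log off the host, and that each entry begins with a UTC timestamp such as 2013-11-05T09:55:00.500Z (check this against your own files). The function names and the 10-minute default are made up for the example; they are not part of any VMware tool.

```python
from datetime import datetime, timedelta


def parse_timestamp(line):
    """Return the leading timestamp of a log line, or None if there is none.

    Assumption: entries start with a token like 2013-11-05T09:55:00.500Z.
    """
    token = line.split(" ", 1)[0]
    try:
        return datetime.strptime(token, "%Y-%m-%dT%H:%M:%S.%fZ")
    except ValueError:
        return None


def lines_near(lines, incident, minutes=10):
    """Yield lines stamped within +/- `minutes` of `incident`.

    Lines without a parseable timestamp (e.g. continuation lines) are skipped.
    """
    window = timedelta(minutes=minutes)
    for line in lines:
        ts = parse_timestamp(line)
        if ts is not None and abs(ts - incident) <= window:
            yield line.rstrip("\n")


def scan_file(path, incident, minutes=10):
    """Apply lines_near to a log file copied from the host (hypothetical helper)."""
    with open(path, errors="replace") as handle:
        return list(lines_near(handle, incident, minutes))
```

For example, scan_file("vmkernel.log", datetime(2013, 11, 5, 10, 0, 0)) would return the entries from 09:50 to 10:10 on that day. If the time you noted for the disconnect is local time, convert it to UTC first, assuming the log timestamps are in UTC.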

Did you use the HP 5.5 ISO to upgrade your servers? That image includes HP-specific drivers for NICs, HBAs, etc., and this may be where the instability was introduced.

Regards

Chris

-------------- If you found this or any other answer useful please consider the use of the Helpful or Correct buttons to award points. Chris Neale VCIX6-NV;vExpert2014-17;VCP6-NV;VCP5-DCV;VCP4;VCA-NV;VCA-DCV;VTSP2015;VTSP5;VTSP4 http://www.chrisneale.org http://www.twitter.com/mrcneale
dekib
Contributor

Hi Chris,

I found out how to download all the log files from vCenter.

Now I'm waiting for one of them to fail again ;)

I did a clean install. The first time I used the HP image. The second time, to rule out the HP image as the cause, I used the generic image and also rearranged the RAID configuration, but it didn't help.

My LUNs are all made from local disk.

Now I'm making some changes to the networking: in my previous configuration the VMkernel port was on the same NIC as the Virtual Machine Port Group, so I created another standard switch containing only the VMkernel port.

Thanks for your input.

Regards,

Deki

dekib
Contributor

I just had another event.

All the VMs looked OK, only the host had an ISSUE.

Of course I was not able to open the console from vCenter: Unable to connect to the MKS: Could not connect to pipe \\.\pipe\vmware-authdpipe within retry period

Remote Desktop (TeamViewer) is not working either. Basically all the VMs are offline!

I tried to export log files via vCenter but the process failed. After reboot it seems that some of the log files get wiped.

Here are some screenshots of vCenter and of ssh session with host.
