We have a very small installation running ESXi-5.1.0-799733-standard with a vSphere Essentials license on a single physical machine. We run 10 very different servers that support our software development needs.
We hadn't had any problems with the hypervisor for perhaps over a year. This morning, however, we first noticed that some of the VMs had stopped responding, and about half an hour later the hypervisor itself (accessed through vSphere Client) became unresponsive, including the hardware console.
We power-cycled the hypervisor hardware and everything came up fine with no error conditions or messages. All the VMs are working fine now.
What happened? How do I find out what happened? How would I copy the system logs to another system where I can study them?
The "Events" tab in the vSphere Client displays only a limited span of the most recent events. Is there a way to enlarge this log? Is there a file behind this display that retains more history?
Thanks for any advice,
Bob
Check the host's vmkernel.log, which is located in /var/log.
Feel free to share some of the log entries if you want.
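On the question of getting the logs onto another machine where they can be studied, here is a minimal sketch. It assumes SSH has been enabled on the host and that a workstation with scp and grep is available. The hostname "esxi-host" and the function names are placeholders chosen for illustration, not anything from this thread.

```shell
# Run these on a workstation, not on the ESXi host itself.
# Assumes SSH is enabled on the host; function names are illustrative.

# Copy vmkernel.log from the host into a local directory (default: current dir).
fetch_vmkernel_log() {
    host="$1"
    dest="${2:-.}"
    scp "root@${host}:/var/log/vmkernel.log" "${dest}/"
}

# Print matching lines (with line numbers) for a case-insensitive keyword.
search_log() {
    keyword="$1"
    logfile="$2"
    grep -n -i -e "$keyword" "$logfile"
}
```

For example, "fetch_vmkernel_log esxi-host ./logs" followed by "search_log warning ./logs/vmkernel.log". If the host has no persistent scratch location configured, log entries from before the power cycle may not have survived the reboot.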
Try pressing [ALT]+[F12] on the console of the host.
If there are a bunch of messages in reverse text (black text on a white background), then you are likely looking at a storage issue. Post some of the messages and it should be possible to diagnose the issue more precisely.
I've attached vmkernel.log; perhaps that indicates something.
I just increased the number of buffers in the buffer cache, since it appears that there are sometimes no buffers available.
[ALT]+[F12] on the console does list a bunch of messages in reverse text--I have no idea what they mean or how to show them to you.
You need to identify the datastore associated with this naa ID (naa.5000c5004f6cf510) and make sure it is properly mounted. Check under Configuration --> Storage Adapters --> Devices. If it is there but greyed out, you should check with your storage admin to identify the LUN and make sure it is still being made available to the WWNs of that particular host. If it is not, but should be, have the admin add it back to the proper storage group (or the equivalent, depending on your array type). If it has been destroyed on the array, or you simply don't need it, you need to remove it so that the host stops looking for it and trying to connect to it. More than likely, this is what is eating up your buffer cache.
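To see how often that device shows up in a saved copy of the log, and to look the device up on the host, something like the following could be used. This is only a sketch: the function names are made up here, and the two esxcli calls are meant to be run in the ESXi shell rather than on a workstation.

```shell
# Count log lines that mention a device ID. -F does a fixed-string match,
# so the dot in "naa." is not treated as a regex wildcard.
count_device_lines() {
    device_id="$1"
    logfile="$2"
    grep -c -F -e "$device_id" "$logfile"
}

# On the ESXi host: show the details the host holds for one device.
show_device() {
    esxcli storage core device list -d "$1"
}

# On the ESXi host: list VMFS datastores with the device backing each extent.
list_datastore_extents() {
    esxcli storage vmfs extent list
}
```

For example, "count_device_lines naa.5000c5004f6cf510 vmkernel.log" gives a rough idea of how noisy the device is, and the extent list shows which datastore, if any, sits on it.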
rjf7r, any update on what you found with the datastore with the naaID naa.5000c5004f6cf510 that was constantly flagging in the vmkernel logs?
Interesting. Since it is a local disk I would suspect a possible driver issue. What kind of driver and driver version are you using?
Just the drivers that came with ESXi 5.1. How would I determine the driver identity and version?
The driver in use would vary with the HBA being used. Some use Emulex, some use QLogic, etc, and within those families are multiple driver types.
Check this article to help determine what you have, and then we can troubleshoot more effectively.
Our storage adapters appear (according to "esxcli storage core adapter list") to be identified as
8086:1d02 15d9:0637 and 8086:1d6b 15d9:0637.
Using the VMware Compatibility Guide I/O Device Search, I don't get an exact match.
It appears that our VIB dates back to system installation, 2013-04-23.
Not sure if this means anything.
Bob
What is the output of the commands "esxcfg-scsidevs -a" and "vmkload_mod -s HBADriver | grep Version"? (Replace HBADriver with the driver name reported by the first command.)
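A rough sketch of running those two commands together in the ESXi shell. It assumes the driver name is the second whitespace-separated column of the "esxcfg-scsidevs -a" output, which is worth checking against what your host actually prints. The function names are illustrative only.

```shell
# Read adapter lines on stdin and print the unique driver names.
# Assumption: the driver name is the second column of each line.
driver_names() {
    awk 'NF >= 2 { print $2 }' | sort -u
}

# On the ESXi host: print the version line(s) reported for each driver in use.
show_driver_versions() {
    esxcfg-scsidevs -a | driver_names | while read -r drv; do
        echo "== ${drv}"
        vmkload_mod -s "$drv" | grep -i version
    done
}
```

Running "show_driver_versions" on the host should then list each storage driver followed by its version string, which can be compared against the compatibility guide.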
Did you install ESXi with a vendor-supplied ISO (from Dell or HP) or with an ISO from VMware?
We used the VMware iso, since SuperMicro does not have a vendor-specific ISO.
(Since this was two years ago, is there any way to check from the running system? I know that we did download the HP flavor since we were doing some testing on a separate piece of HP hardware before assembling our current system.)
You can see the image you used in the ESXi Status page of the vSphere C# Client.
So I guess it's:
ESXi-5.1.0-799733-standard
Does that mean it was installed from a VMware-distributed iso?
OK, so that is a general availability release. Have you since done any updates? What is the current build version of your ESX server?
How do I find the build version? I assume that this is something different from the "Image Profile".
We've done Update 3, "VMware-VMvisor-Installer-5.1.0.update03-2323236.x86_64".
Bob
In the VI Client, it's shown at the top of the page when a host is selected. Via the CLI, you can find it by running the command "vmware -v".
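As a small illustration, the build number can be pulled out of the "vmware -v" output. This sketch assumes the output contains a "build-NNNNNN" token; extract_build is a name chosen here, not a VMware tool.

```shell
# Read text on stdin and print the digits following "build-".
extract_build() {
    sed -n 's/.*build-\([0-9][0-9]*\).*/\1/p'
}

# On the ESXi host:
#   vmware -v | extract_build
#   esxcli system version get      # version, build and update level
#   esxcli software profile get    # name of the installed image profile
```

The image profile name can lag behind the running build, so comparing the two esxcli outputs is a quick way to see whether the host has been patched since installation.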