Has anyone come across an issue where hosts (two in my case) that were previously fully patched and 100% compliant suddenly report missing patches (30 in my case, some from a year ago) in the Update Manager tab, yet show the correct (i.e. current) build number?
~ # esxcli system version get
Product: VMware ESXi
The only changes prior to this issue were the upgrade of the Dell VC plugin and the deployment of the Dell OpenManage Offline Bundle and VIB for ESXi 5.0 (version 7.1) via VUM. OpenManage version 7.0 was already installed on these hosts, and this upgrade worked successfully on 14 other hosts.
The Dell extension also shows as missing in VUM, so I logged onto the DCUI and ran "esxcli software vib list" to see if OpenManage had been removed but not upgraded. To my surprise, this is all that gets returned. I would expect to see a lot more, and do on other hosts:
~ # esxcli software vib list
Name Version Vendor Acceptance Level Install Date
----------- ----------------- ------ ---------------- ------------
tools-light 5.0.0-1.22.821926 VMware VMwareCertified 2012-10-02
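To pin down exactly which VIBs have vanished, one approach is to save the "esxcli software vib list" output from a known-good host and from the broken host, then compare the two. This is just a hedged sketch (missing_vibs is my own helper name, and it can run from any machine that has copies of the two listings):

```shell
# Sketch: compare two saved "esxcli software vib list" outputs and print
# the VIB names present in the first listing but missing from the second.
missing_vibs() {
  # Skip the two header lines, keep column 1 (the VIB name), sort, compare.
  awk 'NR>2 {print $1}' "$1" | sort > /tmp/vibs_a.$$
  awk 'NR>2 {print $1}' "$2" | sort > /tmp/vibs_b.$$
  comm -23 /tmp/vibs_a.$$ /tmp/vibs_b.$$
  rm -f /tmp/vibs_a.$$ /tmp/vibs_b.$$
}

# Usage: missing_vibs healthy-host.txt broken-host.txt
```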
If I try to remediate the host, it fails with the following error:
The host returns esxupdate error code:15.
The package manager transaction is not successful. Check the Update Manager log files and esxupdate log files for more details.
I can see errors like this in the esxupdate.log:
2012-10-25T12:46:05Z esxupdate: Metadata.pyc: INFO: Unrecognized file vendor-index.xml in Metadata file
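To pull suspect lines like the one above out of the log in one go, something like this works (scan_esxupdate_log is my own helper name, and the keyword list is just my guess at useful terms; adjust to taste):

```shell
# Sketch: filter a log file for lines that tend to indicate trouble,
# showing only the most recent matches.
scan_esxupdate_log() {
  grep -iE "error|warning|failed|unrecognized" "$1" | tail -n "${2:-20}"
}

# On the host: scan_esxupdate_log /var/log/esxupdate.log
```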
I'm reluctant to use these two hosts at the moment as they are part of a production five-host cluster and I'm not sure they are stable enough to be depended on. They are currently sitting in maintenance mode. I could rebuild them, but I wanted to resolve this without a rebuild so that we understand what went wrong.
I do have an open support call with VMware, but thought I would throw it out to the community just in case anyone else has seen this and knows how to resolve it without rebuilding the host.
Thanks (as always),
Unfortunately the hosts need to be rebuilt (as per my support ticket).
I was asked to verify whether the imgdb.tgz could be corrupt (steps below):
1) Connect to the ESXi host via an SSH session
2) Change directory to /vmfs/volumes:
cd /vmfs/volumes
3) Search for the imgdb.tgz file:
find * | grep imgdb.tgz
Note: This command normally results in two matches.
4) Run this command on each match:
ls -l match_result
For example:
ls -l 0ca01e7f-cc1ea1af-bda0-1fe646c5ceea/imgdb.tgz
-rwx------ 1 root root 26393 Jul 20 19:28 0ca01e7f-cc1ea1af-bda0-1fe646c5ceea/imgdb.tgz
The default size for the imgdb.tgz file is approximately 26 KB. If one of the files is only a couple of bytes, it indicates that the file is corrupt.
If the file is corrupt, then unfortunately the only option is to rebuild the host.
In my case:
/vmfs/volumes # ls -l 102e94e3-ca3fac95-338e-6112021b7785/imgdb.tgz
-rwx------ 1 root root 197 Oct 23 13:01 102e94e3-ca3fac95-338e-6112021b7785/imgdb.tgz
/vmfs/volumes # ls -l 67de8dee-fd1e8cf5-20a9-0d2688903ce2/imgdb.tgz
-rwx------ 1 root root 30008 Oct 2 13:04 67de8dee-fd1e8cf5-20a9-0d2688903ce2/imgdb.tgz
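The whole check can be rolled into a single pass over /vmfs/volumes. A hedged sketch (check_imgdb is my own helper name, and the 10,000-byte threshold is my rough cut-off based on the ~26 KB healthy size, not a VMware figure):

```shell
# Sketch: scan a volume root for imgdb.tgz files and flag any that look
# too small to be intact. A healthy imgdb.tgz is roughly 26 KB; one of
# only a few hundred bytes (like my 197-byte file) suggests corruption.
check_imgdb() {
  root=${1:-/vmfs/volumes}
  find "$root" -name imgdb.tgz 2>/dev/null | while read -r f; do
    size=$(wc -c < "$f" | tr -d ' ')
    if [ "$size" -lt 10000 ]; then
      echo "SUSPECT: $f ($size bytes)"
    else
      echo "OK: $f ($size bytes)"
    fi
  done
}

# On the host: check_imgdb /vmfs/volumes
```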
I guess I will be rebuilding these hosts today ... happy Friday!