VMware Cloud Community
GalNeb
Enthusiast

FDM fails cluster election after 7.0.3.18644231 update

After installing the update to 7.0.3, my cluster will no longer configure HA. It just keeps retrying the cluster election over and over again. I see that a couple of days after that rollup patch, a new FDM patch was released. It will not install via LCM, so I tried installing it manually at the command line and got the error: "Expected 1 component, found 2"
 
After reviewing /var/run/log/esxupdate.log (note: I do not have an fdmupdate.log), I found that the VMware_bootbank_vmware-fdm_7.0.3-18751801.vib file appears to have 2 components in it. The update log shows that it adds the VMware Tools 11.3.5 component first, then fails when it finds another component (the vmware-fdm component) in the same .vib file.
My supposition is that VMware was trying to release both the Tools fix and the FDM fix at the same time, since the major ESXi rollup 7.0.3.18644231 included only the 11.3.0 tools and not the 11.3.5 tools. But instead of releasing 2 VIBs, they put both in the same VIB file. Will someone at VMware please fix this?
Old enough to know better, young enough to try anyway
GalNeb
Enthusiast

The esxupdate.log section referred to above:

 

2021-10-25T17:02:30Z esxupdate: 2336641: vmware.runcommand: INFO: runcommand called with: args = '['/sbin/esxcfg-advcfg', '-q', '-g', '/UserVars/EsximageNetTimeout']', outfile = 'None', returnoutput = 'True', timeout = '0.0'.
2021-10-25T17:02:30Z esxupdate: 2336641: vmware.runcommand: INFO: runcommand called with: args = '['/sbin/esxcfg-advcfg', '-q', '-g', '/UserVars/EsximageNetRetries']', outfile = 'None', returnoutput = 'True', timeout = '0.0'.
2021-10-25T17:02:30Z esxupdate: 2336641: vmware.runcommand: INFO: runcommand called with: args = '['/sbin/esxcfg-advcfg', '-q', '-g', '/UserVars/EsximageNetRateLimit']', outfile = 'None', returnoutput = 'True', timeout = '0.0'.
2021-10-25T17:02:30Z esxupdate: 2336641: root: INFO: Command = vib.list
2021-10-25T17:02:30Z esxupdate: 2336641: root: INFO: Options = {'depot': None, 'viburl': None, 'nameid': None, 'profile': None, 'baseimageversion': None, 'addon': None, 'softwarespec': None, 'level': None, 'updateonly': False, 'noliveinstall': False, 'nomaintmode': False, 'force': False, 'dryrun': False, 'oktoremove': False, 'proxy': None, 'nosigcheck': False, 'pending': None, 'rebooting': False, 'downgrade': None, 'nohwwarning': False}
2021-10-25T17:02:32Z esxupdate: 2336641: HostImage: INFO: Installers initiated are {'live': <vmware.esximage.Installer.LiveImageInstaller.LiveImageInstaller object at 0x344bea9190>, 'boot': <vmware.esximage.Installer.BootBankInstaller.BootBankInstaller object at 0x344bea9370>, 'locker': <vmware.esximage.Installer.LockerInstaller.LockerInstaller object at 0x344c135250>}
2021-10-25T17:02:32Z esxupdate: 2336641: imageprofile: INFO: Adding VIB VMware_locker_tools-light_11.3.5.18557794-18558696 to ImageProfile (Updated) ESXi-7.0.0-15843807-standard
2021-10-25T17:02:32Z esxupdate: 2336641: imageprofile: DEBUG: Adding Component TOOLS-18558696 to ImageProfile (Updated) ESXi-7.0.0-15843807-standard
2021-10-25T17:02:32Z esxupdate: 2336641: root: DEBUG: Finished execution of command = vib.list
2021-10-25T17:02:32Z esxupdate: 2336641: root: DEBUG: Completed esxcli output, going to exit esxcli-software
2021-10-25T17:02:59Z esxupdate: 2336671: vmware.runcommand: INFO: runcommand called with: args = '['/sbin/esxcfg-advcfg', '-q', '-g', '/UserVars/EsximageNetTimeout']', outfile = 'None', returnoutput = 'True', timeout = '0.0'.
2021-10-25T17:02:59Z esxupdate: 2336671: vmware.runcommand: INFO: runcommand called with: args = '['/sbin/esxcfg-advcfg', '-q', '-g', '/UserVars/EsximageNetRetries']', outfile = 'None', returnoutput = 'True', timeout = '0.0'.
2021-10-25T17:02:59Z esxupdate: 2336671: vmware.runcommand: INFO: runcommand called with: args = '['/sbin/esxcfg-advcfg', '-q', '-g', '/UserVars/EsximageNetRateLimit']', outfile = 'None', returnoutput = 'True', timeout = '0.0'.
2021-10-25T17:02:59Z esxupdate: 2336671: root: INFO: Command = vib.install
2021-10-25T17:02:59Z esxupdate: 2336671: root: INFO: Options = {'depot': None, 'viburl': ['/vmfs/volumes/HA/VMware_bootbank_vmware-fdm_7.0.3-18649296.vib'], 'nameid': None, 'profile': None, 'baseimageversion': None, 'addon': None, 'softwarespec': None, 'level': None, 'updateonly': False, 'noliveinstall': False, 'nomaintmode': False, 'force': False, 'dryrun': False, 'oktoremove': False, 'proxy': None, 'nosigcheck': False, 'pending': None, 'rebooting': False, 'downgrade': None, 'nohwwarning': False}
2021-10-25T17:03:01Z esxupdate: 2336671: HostImage: INFO: Installers initiated are {'live': <vmware.esximage.Installer.LiveImageInstaller.LiveImageInstaller object at 0xd5bc7001c0>, 'boot': <vmware.esximage.Installer.BootBankInstaller.BootBankInstaller object at 0xd5bc98c2b0>, 'locker': <vmware.esximage.Installer.LockerInstaller.LockerInstaller object at 0xd5bc9e4580>}
2021-10-25T17:03:01Z esxupdate: 2336671: imageprofile: INFO: Adding VIB VMware_locker_tools-light_11.3.5.18557794-18558696 to ImageProfile (Updated) ESXi-7.0.0-15843807-standard
2021-10-25T17:03:01Z esxupdate: 2336671: imageprofile: DEBUG: Adding Component TOOLS-18558696 to ImageProfile (Updated) ESXi-7.0.0-15843807-standard
2021-10-25T17:03:01Z esxupdate: 2336671: HostImage: DEBUG: Deferring initiating installers
2021-10-25T17:03:03Z esxupdate: 2336671: HostImage: INFO: Installers initiated are {'live': <vmware.esximage.Installer.LiveImageInstaller.LiveImageInstaller object at 0xd5bc928b80>, 'boot': <vmware.esximage.Installer.BootBankInstaller.BootBankInstaller object at 0xd5bcc8cee0>, 'locker': <vmware.esximage.Installer.LockerInstaller.LockerInstaller object at 0xd5bcf9c400>}
2021-10-25T17:03:03Z esxupdate: 2336671: imageprofile: INFO: Adding VIB VMware_locker_tools-light_11.3.5.18557794-18558696 to ImageProfile (Updated) ESXi-7.0.0-15843807-standard
2021-10-25T17:03:03Z esxupdate: 2336671: imageprofile: DEBUG: Adding Component TOOLS-18558696 to ImageProfile (Updated) ESXi-7.0.0-15843807-standard
2021-10-25T17:03:03Z esxupdate: 2336671: Transaction: DEBUG: Metadata is provided, skip download
2021-10-25T17:03:03Z esxupdate: 2336671: Transaction: INFO: Skipping installed VIBs
2021-10-25T17:03:03Z esxupdate: 2336671: Transaction: INFO: Final list of VIBs being installed: VMware_bootbank_vmware-fdm_7.0.3-18649296
2021-10-25T17:03:03Z esxupdate: 2336671: imageprofile: INFO: Adding VIB VMware_bootbank_vmware-fdm_7.0.3-18649296 to ImageProfile (Updated) ESXi-7.0.0-15843807-standard
2021-10-25T17:03:03Z esxupdate: 2336671: root: ERROR: Traceback (most recent call last):
2021-10-25T17:03:03Z esxupdate: 2336671: root: ERROR: File "/usr/lib/vmware/esxcli-software", line 773, in <module>
2021-10-25T17:03:03Z esxupdate: 2336671: root: ERROR: main()
2021-10-25T17:03:03Z esxupdate: 2336671: root: ERROR: File "/usr/lib/vmware/esxcli-software", line 764, in main
2021-10-25T17:03:03Z esxupdate: 2336671: root: ERROR: ret = CMDTABLE[command](options)
2021-10-25T17:03:03Z esxupdate: 2336671: root: ERROR: File "/usr/lib/vmware/esxcli-software", line 601, in VibInstallCmd
2021-10-25T17:03:03Z esxupdate: 2336671: root: ERROR: res = t.InstallVibsFromSources(viburls, [], nameids,
2021-10-25T17:03:03Z esxupdate: 2336671: root: ERROR: File "/lib64/python3.8/site-packages/vmware/esximage/Transaction.py", line 965, in InstallVibsFromSources
2021-10-25T17:03:03Z esxupdate: 2336671: root: ERROR: inst, removed, exitstate = self._installVibs(curprofile,
2021-10-25T17:03:03Z esxupdate: 2336671: root: ERROR: File "/lib64/python3.8/site-packages/vmware/esximage/Transaction.py", line 1207, in _installVibs
2021-10-25T17:03:03Z esxupdate: 2336671: root: ERROR: hasConfigDowngrade = checkFdmConfigDowngrade(curProfile, newProfile)
2021-10-25T17:03:03Z esxupdate: 2336671: root: ERROR: File "/lib64/python3.8/site-packages/vmware/esximage/Transaction.py", line 1122, in checkFdmConfigDowngrade
2021-10-25T17:03:03Z esxupdate: 2336671: root: ERROR: compDowngrades = curProfile.GetCompsDowngradeInfo(newProfile)
2021-10-25T17:03:03Z esxupdate: 2336671: root: ERROR: File "/lib64/python3.8/site-packages/vmware/esximage/ImageProfile.py", line 2416, in GetCompsDowngradeInfo
2021-10-25T17:03:03Z esxupdate: 2336671: root: ERROR: curComp = self.components.GetComponent(name)
2021-10-25T17:03:03Z esxupdate: 2336671: root: ERROR: File "/lib64/python3.8/site-packages/vmware/esximage/Bulletin.py", line 1276, in GetComponent
2021-10-25T17:03:03Z esxupdate: 2336671: root: ERROR: raise ValueError('Expected 1 component, found %u'
2021-10-25T17:03:03Z esxupdate: 2336671: root: ERROR: ValueError: Expected 1 component, found 2
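For anyone hitting the same thing, the useful parts of a log like the one above can be pulled out with a couple of small shell helpers. This is only a sketch: the function names are made up for illustration, and it assumes the log uses the same "root: ERROR:" and "Final list of VIBs being installed:" line formats shown above and that grep, sed and tail are available in the shell you run it from.

```shell
# Hypothetical helpers for reading an esxupdate log (names are illustrative).
# Each takes the log path as its first argument.

# Print the last "root: ERROR:" line with the timestamp/PID prefix removed,
# which is normally the exception that ended the traceback.
last_esxupdate_error() {
    grep 'root: ERROR:' "$1" | tail -n 1 | sed 's/^.*root: ERROR: //'
}

# Print the VIB list that each install transaction resolved to.
final_vib_list() {
    grep 'Final list of VIBs being installed:' "$1" | sed 's/^.*being installed: //'
}

# Example usage on a host (path taken from the post above):
#   last_esxupdate_error /var/run/log/esxupdate.log
#   final_vib_list /var/run/log/esxupdate.log
```

Run against the excerpt above, the first helper would report the ValueError line and the second the vmware-fdm VIB name, which makes it quicker to compare attempts with different VIB versions.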

GalNeb
Enthusiast

The above log is from when I tried to re-install the older FDM patch, which gave the same error as the latest one. Something really funny is going on here. After I removed the older version of FDM to try to manually install the new one, LCM no longer recognizes that anything is missing at all. So I am currently stuck with no FDM component installed on one of my hosts.

MGB22
Contributor

We had a similar issue that was related to having 2 i40en VIBs installed, one from Intel and one from VMware. This is what we did to resolve the HA issue:

Enable SSH on the Host

1. Check for the current High Availability VIB:
esxcli software vib list | grep fdm

This should return something like: vmware-fdm 7.0.3-18649296 VMware VMwareCertified 2021-10-06
If nothing is returned, check the /tmp directory:
ls /tmp

2. Update 3a includes VMware_bootbank_vmware-fdm_7.0.3-18751801.vib; we found it in the /tmp directory after the install failed.
Use WinSCP to copy it to your desktop. If it doesn't exist, download it.

3. Check for the i40en VIB and make sure there aren't 2 versions installed:
esxcli software vib list | grep i40en

4. If Intel i40en and VMware i40enu both exist, remove the Intel vib.
esxcli software vib remove -n "i40en"

5. Reboot the host.

6. After the system reboots, use WinSCP to copy VMware_bootbank_vmware-fdm_7.0.3-18751801.vib to the /tmp directory, then install it via esxcli:
esxcli software vib install -v /tmp/VMware_bootbank_vmware-fdm_7.0.3-18751801.vib -f

The installation doesn't require a reboot.

7. Now check baseline compliance against the Non-Critical Host Patches and remediate.
8. After the reboot, re-run Check Compliance against all non-critical, host security, and critical patches, and then remediate again.
This should then give you a compliant image.
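The check in steps 3 and 4 can also be scripted. This is a rough sketch, not something we ran as part of the fix: the function names are made up, and it assumes the VIB name is the first column of the `esxcli software vib list` output (as in the sample line in step 1) and that awk is available in the host shell.

```shell
# Hypothetical helpers (names are illustrative). Both read the output of
# `esxcli software vib list` on stdin.

# Print the name of every listed VIB whose name starts with "i40en"
# (this matches both i40en and i40enu).
list_i40en_vibs() {
    awk '$1 ~ /^i40en/ { print $1 }'
}

# Exit status 0 when more than one i40en* VIB is listed, 1 otherwise.
has_duplicate_i40en() {
    awk '$1 ~ /^i40en/ { n++ } END { exit !(n > 1) }'
}

# Example usage on a host:
#   esxcli software vib list | list_i40en_vibs
#   esxcli software vib list | has_duplicate_i40en && echo "two i40en VIBs installed"
```

The script only reports what is installed. Removing a driver VIB is left as a manual step on purpose, since it requires a reboot and you should confirm which one you want to keep.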

GalNeb
Enthusiast

Thank you for the nice, detailed instructions that I could cut and paste easily. Of the many posts I found on the i40en issue, yours was the easiest to read and use. This fix did get me started. Fixing the i40en/u issue still left a number of strange patches to install or reinstall. In fact, LCM actually told me that portions of 7.0.2 were missing and needed to be reinstalled. In all, I did 3 rounds of patching with Lifecycle Manager before everything came up clean again.

 

On the other hand, 4 of the patches it said were missing never did clear.  I found this article that just blew me away:

https://kb.vmware.com/s/article/85701

This KB actually told me to create a duplicate non-critical patches baseline so that I could exclude the offending 4 patches, as VMware has a circular reference of VIBs that each think they should be replacing the other.

MGB22
Contributor

Glad it helped; this was my first post. When we did our update, there wasn't much out there about how to fix the busted HA. It looks like they released some updated documents on the issue last week, where they mentioned it was the enu driver, not the en driver. Either way, the HA VIB didn't like having 2, so you may want to keep the steps for the next update ...

KB Article:  https://kb.vmware.com/s/article/85982

Release Notes: https://docs.vmware.com/en/VMware-vSphere/7.0/rn/vsphere-esxi-70u3a-release-notes.html

On a side note, we had excluded the nvme VIBs from the initial Update 3 install, which I forgot to mention in the post, so they didn't show up when we remediated for the HA issue. Probably should have mentioned that earlier. 🙂
