After I upgraded to ESXi 7.0 Update 3l, I ran into issues with guests losing network connectivity. All the settings were still correct, and I also checked the physical switch the host was connected to. I had upgraded from ESXi 7.0 Update 3k using update in vCenter. Rebooting didn't help, so I ended up rolling back, and it's working again. Just curious whether anyone else has seen the same issue and, if so, what the resolution was. Thanks in advance.
I'm running only Intel NICs, so if I'm correct, I'm not even using the ntg3 driver.
I received an update on my case to try this: https://kb.vmware.com/s/article/88875
I will try it on Wednesday. If anyone who is experiencing the issue tries it before then, let me know how it goes, please.
I had the same issue; my port groups' teaming policy reverted to Route Based on Originating Virtual Port instead of Route Based on IP Hash.
I ran the following in PowerCLI to fix the issue:
# Change vcenter.domain.net to the FQDN of your vCenter Server.
Connect-VIServer -Server vcenter.domain.net
# Change hostname.domain.net to the FQDN of the host that you are updating.
$vmhost = 'hostname.domain.net'
# Change vSwitch0 to the vSwitch that is experiencing the issue.
$portgroups = Get-VirtualPortGroup -VirtualSwitch vSwitch0 -VMHost $vmhost |
    Select-Object -ExpandProperty Name | Sort-Object
# Loop through each port group on the vSwitch.
foreach ($portgroup in $portgroups) {
    # Output the current port group name.
    $portgroup
    Write-Output ""
    # Get the teaming policy and place it in the $policy1 variable.
    $policy1 = Get-VirtualPortGroup -VirtualSwitch vSwitch0 -VMHost $vmhost -Name $portgroup | Get-NicTeamingPolicy
    # List the current policy, mainly for review.
    Write-Output "Load Balancing Policy: "
    $policy1.LoadBalancingPolicy
    # Set the policy to Route Based on IP Hash.
    $policy1 | Set-NicTeamingPolicy -LoadBalancingPolicy LoadBalanceIP -WhatIf # Remove the -WhatIf when you are ready to run it.
    Write-Output ""
    Write-Output ""
}
This issue (as well as the ntg3 driver issue) is fixed in ESXi 7.0 Update 3m, build 21686933, released TODAY, 2023-05-03.
From:
https://docs.vmware.com/en/VMware-vSphere/7.0/rn/vsphere-esxi-70u3m-release-notes.html
PR 3164897: After an upgrade to ESXi 7.0 Update 3l, some ESXi hosts and virtual machines connected to virtual switches might lose network connectivity
After an upgrade to ESXi 7.0 Update 3l, some ESXi hosts, their VMs, and other VMkernel ports, such as ports used by vSAN and vSphere Replication, which are connected to virtual switches, might lose connectivity due to an unexpected change in the NIC teaming policy. For example, the teaming policy on a portgroup might change to Route Based on Originating Virtual Port from Route Based on IP Hash. As a result, such a portgroup might lose network connectivity and some ESXi hosts and their VMs become inaccessible.
AND
PR 3182870: After upgrading the ntg3 driver to version 4.1.9.0-4vmw, Broadcom NICs with fiber physical connectivity might lose network connectivity
Changes in the ntg3 driver version 4.1.9.0-4vmw might cause link issues at the fiber physical layer, and connectivity on some NICs, such as Broadcom 1Gb adapters, fails to come up.
This issue is resolved in this release. ESXi 7.0 Update 3m provides ntg3 driver version 4.1.9.0-5vmw. The fix also adds a module parameter, fifoElastic, which you can enable in case of jumbo frame drops in certain Dell switches. To enable the parameter, use the following command:
esxcli system module parameters set -p 'fifoElastic=1' -m ntg3
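If you do enable the parameter, you can confirm it was stored by listing the ntg3 module's parameters. This is standard esxcli usage on the host (note that driver module parameter changes generally require a host reboot to take effect):

```shell
# List all ntg3 module parameters and their current values;
# fifoElastic should show 1 after the change above.
esxcli system module parameters list -m ntg3
```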
If you are already experiencing this Teaming and Failover load-balancing problem after applying 7.0 U3l, I do not believe that applying the new 7.0 U3m release will revert a now-incorrect teaming policy back to what it was before U3l - you will likely need to do that manually. But honestly, I have not yet applied U3m, so I cannot say for sure.
Thank you for the post. I put one of my hosts back on build 3l a second time so that I could work with VMware on a solution. The experience the second time was different from the first. During the first upgrade, VMs were unreachable on the upgraded host. After the second upgrade, my dedicated vMotion NIC couldn't communicate with the other host (we moved the vMotion service to a working NIC, and VMs were usable on the upgraded host). After hours of troubleshooting with VMware network support, I was told to work with my internal network team, as everything appeared correct on the VMware side of things. I was calling it a day when I saw your post. Today I reverted my upgraded host, upgraded it to 3m, and everything worked as expected. I migrated all of my VMs to the upgraded host, updated the other host to 3m, and rebalanced the load. All good now.
We installed the 7.0 Update 3l update on a Nutanix platform yesterday and ran into an issue too. Yes, the teaming settings on some port groups on a few hosts were overridden, but for us the new setting was the same as the old one, so no issue there. What happened instead is that a few port groups on some hosts had their VLAN ID removed, and some port groups even had their VLAN ID replaced by a new one. We put the correct VLAN IDs back and everything is OK now.
I opened a case. They're redirecting me to the KB, but I think we encountered another bug.
Hope this helped!
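For anyone who hits the VLAN-ID variant of this, the same PowerCLI approach as the teaming-policy script earlier in the thread can put the IDs back. This is only a sketch: the port group names and VLAN IDs in $vlanMap are hypothetical placeholders for your own values, and -WhatIf keeps it a dry run until you remove it:

```powershell
# Map of port group name -> correct VLAN ID (example values only - use your own).
$vlanMap = @{ 'PG-App' = 100; 'PG-DB' = 200 }
# Change hostname.domain.net to the FQDN of the affected host.
$vmhost = 'hostname.domain.net'
foreach ($name in $vlanMap.Keys) {
    # Look up the port group on the host and set its VLAN ID back.
    Get-VirtualPortGroup -VMHost $vmhost -Name $name |
        Set-VirtualPortGroup -VLanId $vlanMap[$name] -WhatIf # Remove -WhatIf when ready.
}
```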
Confirming PatrickDLong's post: U3m appears to fix the teaming issues introduced in U3l on an HP ProLiant DL380 G9.
I also upgraded to ESXi 7.0 Update 3m yesterday with no problems.
I'm facing the same problem with the same hardware as yours!
So if I understand correctly, the only solution for now is to revert to the previous ESXi version?
Now it's working after patching the server.