We are having very similar issues in our environment, but only since upgrading from 6.0 to 6.5. We've upgraded our Emulex drivers to 11.2.1269 and network drivers to 11.2.1149, but VMs continue to drop communication with VMs outside of the host on the same port-group (they can still communicate with VMs on the same port-group on the same host). Our VC firmware is at 4.45, but it seems from the discussion that the VC isn't the problem. Additionally, the affected VMs cannot talk to any other port-group on the same or a different host. It's not until we vMotion the VM or disable/enable the VM NIC that the RARP brings the VM back online with the upstream switches and gateway. We may have one or several VMs go down within a short time, or we may go a couple of days without an issue. We don't see any pattern in what triggers this.
BL460C Gen9 blades
FlexFabric 20Gb 2-port 650FLB Adapter 11.2.1269 / 11.2.1149
Virtual Connect firmware 4.45
ESXi 6.5 w/ distributed switches
We have done a lot to try and stabilize the situation, including:
Initially upgraded our Emulex drivers from 10.5 to 11.2.1269
Recreated the port-groups for the original migrated DVS
Recreated the DVS from scratch along with all of the port-groups.
Rebooted the upstream switches
Changed the port-group load balancing method to 'route based on originating virtual port' from 'NIC load'
Created static MAC address entries on 2 VMs to test communication between each other (failed)
Created interface IP(s) on the upstream switch(es) on the failed VM subnet to test connectivity to the VM (failed)
Removed MAC address entry in the address table on the upstream switch
Upstream switches do not show any issues with flapping during a failure event
VMware logs, Log Insight, and vROps give no visibility into the issue, as no events are logged during these failures
We had VMs fail on both sides of the chassis/VC
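Since nothing is logged during these failures, one way to look for a pattern is to timestamp each outage from a machine outside the host. Below is a minimal sketch in Python, not something taken from this environment. It assumes the guests answer ICMP and that the monitor runs on Linux (the `ping -c 1 -W 1` flags are Linux-specific); `StateTracker` and `monitor` are illustrative names.

```python
import subprocess
import time
from datetime import datetime


def is_reachable(host):
    """Send one ICMP echo using the system ping (Linux flags assumed)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "1", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


class StateTracker:
    """Remember the last known state per host and report only changes."""

    def __init__(self):
        self._last = {}

    def update(self, host, up):
        """Return "UP" or "DOWN" if the host changed state, else None."""
        previous = self._last.get(host)
        self._last[host] = up
        if previous is None or previous == up:
            return None
        return "UP" if up else "DOWN"


def monitor(hosts, interval=5.0):
    """Poll forever and print one line per state change."""
    tracker = StateTracker()
    while True:
        for host in hosts:
            change = tracker.update(host, is_reachable(host))
            if change:
                stamp = datetime.now().isoformat(timespec="seconds")
                print(stamp, host, change)
        time.sleep(interval)
```

Calling `monitor(["vm-a.example.local", "vm-b.example.local"])` (hypothetical names) prints a DOWN line when a VM stops answering and an UP line when it recovers, which can then be lined up against vMotion, backup, or DRS activity.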
Any update to your own situation would be appreciated.
Did anyone get an answer to this issue? Does anyone have an HPE case number I can reference with my local HPE support team?
Have you tried this?
To reduce burst traffic drops, increase the vmxnet3 buffer settings in Windows:
- Click Start > Control Panel > Device Manager.
- Right-click vmxnet3 and click Properties.
- Click the Advanced tab.
- Click Small Rx Buffers and increase the value. The default value is 512 and the maximum is 8192.
- Click Rx Ring #1 Size and increase the value. The default value is 1024 and the maximum is 4096.
This applies to vmxnet3 only,
and most of the time it resolves the issue.
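If the change has to be made on many guests, the same two settings can be scripted. This is a rough sketch in Python that only builds the PowerShell `Set-NetAdapterAdvancedProperty` commands. It assumes a Windows version that ships that cmdlet, and the adapter name `Ethernet0` is a placeholder; the display names and limits are the ones quoted in the steps above.

```python
# Limits quoted in the steps above: display name -> (default, maximum).
VMXNET3_LIMITS = {
    "Small Rx Buffers": (512, 8192),
    "Rx Ring #1 Size": (1024, 4096),
}


def powershell_command(adapter, setting, value):
    """Build a Set-NetAdapterAdvancedProperty command after a range check.

    The advice is to increase the value, so anything below the default
    or above the maximum is rejected here.
    """
    default, maximum = VMXNET3_LIMITS[setting]
    if not default <= value <= maximum:
        raise ValueError(f"{setting}: {value} is outside {default}..{maximum}")
    return (
        f'Set-NetAdapterAdvancedProperty -Name "{adapter}" '
        f'-DisplayName "{setting}" -DisplayValue "{value}"'
    )


# Print the commands that set both values to their maximum.
# "Ethernet0" is an assumed adapter name; check yours with Get-NetAdapter.
for name, (_, maximum) in VMXNET3_LIMITS.items():
    print(powershell_command("Ethernet0", name, maximum))
```

The printed commands would be run in an elevated PowerShell session inside the guest. Changing these properties typically resets the adapter briefly.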
Did anyone get an answer from HPE or VMware?
We had a similar issue using the FlexFabric 650M.
The issue went away after rebooting the host a few times, or after taking the vmnic down and up with the esxcli command.
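For anyone who wants to try the down/up workaround, here is a small Python sketch that wraps the `esxcli network nic down` and `esxcli network nic up` commands. It assumes it runs somewhere `esxcli` is on the PATH (for example the ESXi shell); `vmnic0` and the 5-second pause are placeholders, and `dry_run` defaults to only printing the commands.

```python
import subprocess
import time


def bounce_commands(vmnic):
    """Return the esxcli invocations that take an uplink down and back up."""
    return [
        ["esxcli", "network", "nic", "down", "-n", vmnic],
        ["esxcli", "network", "nic", "up", "-n", vmnic],
    ]


def bounce(vmnic, pause=5.0, dry_run=True):
    """Print the down/up pair, or run it when dry_run is False."""
    down, up = bounce_commands(vmnic)
    if dry_run:
        print(" ".join(down))
        print(" ".join(up))
        return
    subprocess.run(down, check=True)
    time.sleep(pause)  # give the link time to drop before re-enabling it
    subprocess.run(up, check=True)


bounce("vmnic0")  # dry run: prints the two commands for an assumed uplink name
```

Bouncing one uplink at a time keeps the other path available for the VMs on that host.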
The issue happened with the E1000 adapter.
The guest's MAC address entry on the Flex-10 did not move from the old port to the new port when I did a vMotion.
I think the Flex-10 is not receiving the RARP (or whatever packet is used) to update its MAC address table...
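For reference when checking captures, this is roughly what such a RARP frame looks like on the wire. The Python sketch below builds one from the RFC 903 layout. The field choices (broadcast destination, VM MAC as source, sender, and target, zeroed IP fields) reflect my understanding of what ESXi sends when "Notify switches" is enabled, not a capture from this thread, and the MAC address is made up.

```python
import struct

RARP_ETHERTYPE = 0x8035
RARP_REQUEST = 3  # "request reverse" opcode from RFC 903


def mac_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def build_rarp(vm_mac):
    """Build a broadcast RARP request announcing vm_mac (assumed layout)."""
    mac = mac_bytes(vm_mac)
    # Ethernet header: broadcast destination, VM MAC as source, RARP ethertype.
    ethernet = b"\xff" * 6 + mac + struct.pack("!H", RARP_ETHERTYPE)
    # Hardware type 1 (Ethernet), protocol 0x0800 (IPv4), lengths 6 and 4.
    body = struct.pack("!HHBBH", 1, 0x0800, 6, 4, RARP_REQUEST)
    body += mac + b"\x00" * 4  # sender hardware / protocol address
    body += mac + b"\x00" * 4  # target hardware / protocol address
    return ethernet + body


frame = build_rarp("00:50:56:aa:bb:cc")  # made-up VM MAC
print(len(frame), frame.hex())
```

In a capture on the uplink or the upstream switch, a filter such as `ether proto 0x8035` should show whether these frames actually leave the host after a vMotion.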
We are also experiencing the same issue...
You have to update the firmware.
Remove the NIC from the profile and add a new one, then configure the ESXi host with the new NIC. That should fix the issue.
RAJESH RADHAKRISHNAN
VCA-DCV/WM/Cloud, VCP5-DCV/DT/Cloud, VCP6-DCV, EMCISA, EMCSA, MCTS, MCPS, BCFA
Could you provide vmkernel.log, hostd.log, and the VM's vmware.log file?
1- Are all virtual machines isolated from the network or just one?
2- When one virtual machine is isolated from network, can you ping it from a different VM from the same VLAN and see if it's reachable?
3- I noticed that you mentioned replacing the VC module. Did you try rolling back that change?
4- Are you running VM snapshot based backups?
Our g9 servers are still stable on CNA firmware: 126.96.36.199 and driver: 188.8.131.52
But recently I got several new g10 servers with "HP FlexFabric 20Gb 2-port 650FLB Adapter".
I use VMware-ESXi-6.5.0-Update1-7388607-HPE-650.U184.108.40.206.23-Feb2018.iso for the ESXi installation. There is no new software or firmware for g10 servers in the 2017.10.1 SPP, so there is nothing to update here.
I will try the g10 with firmware version 11.4.1231.6 (from Drivers & Software, HPE Support Center)
and driver version 11.4.1205.0 (this version comes with the HPE ESXi ISO).