Further to the above, VMware recommended disabling the igbn driver and using the igb driver instead.
We made these changes on one host, and it then lost connectivity almost every night: the management network went down, and each time we had to restart it manually to restore connectivity to vCenter.
When we raised this with VMware, they still had no ETA for a fix. They and Dell have advised that a test driver is now available, but it is not yet released for production.
Due to the long delays on this issue, we have decided to roll back to the old igbn driver version 4.1, as it seems to be more stable than version 1.4.7.
We also have a PSOD issue with the I350 and VMware-VMvisor-Installer-6.7.0.update02-13006603.x86_64-DellEMC_Customized-A00.iso
Are you running the cards with the igbn driver? If so, which version?
Can you please post a screenshot of the PSOD? It might be helpful.
Please note that VMware have come back with the below.
PSOD on boot multiple hosts
The Intel igbn driver, in combination with VMware's internal network handling, has resulted in this scenario.
The code-level fix has been released in 6.5 U3 and is slated for 6.7 U3, which is pending release shortly.
vm7user - maybe this is impacting you too. Have you heard back from VMware at all regarding the PSODs?
I recently migrated two servers from 5.5 -> 6.5 -> 6.7. The two are near identical, except that one has an Intel I350 adapter. The server with the I350 adapter became very unstable, and any serious sustained network traffic would crash the management interface.
This was troublesome as vCenter was also on this server (it's been moved to the more stable server) and everything became unresponsive. Restarting the management network via console did not help.
One thing that would always trigger it is a full backup (not an incremental backup), as it transfers a few TB off the host. It would always crash about 4-5 hours into the job, necessitating a host restart because the management network would not reset.
I think part of the issue in our case is that the onboard networking requires the igb driver while the I350 wanted the igbn driver. I've blacklisted the igbn driver, and ESXi is now using the igb driver for everything.
I'm currently monitoring the backup but it's now in the verifying stage so I don't believe it will crash the management interface any more.
Hopefully 6.7u3 will be released soon - I will try unblocking the igbn driver then to see if the problem is resolved.
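For anyone wanting to confirm which driver each NIC ended up on after blacklisting igbn, here is a rough sketch in Python. It assumes you have saved the text output of `esxcli network nic list` and that the first three columns are the NIC name, the PCI device and the driver. The function name and the sample text are made up for illustration and are not taken from any host in this thread.

```python
def drivers_by_nic(nic_list_output):
    """Map each vmnic name to the driver shown in the third column.

    Assumes the column order Name, PCI Device, Driver in the saved output.
    """
    drivers = {}
    for line in nic_list_output.splitlines():
        fields = line.split()
        # Data rows start with the NIC name; header and separator rows do not.
        if len(fields) >= 3 and fields[0].startswith("vmnic"):
            drivers[fields[0]] = fields[2]
    return drivers


# Hypothetical sample output, for illustration only.
SAMPLE = """\
Name    PCI Device    Driver  Admin Status  Link Status  Speed  Duplex  MAC Address         MTU  Description
------  ------------  ------  ------------  -----------  -----  ------  -----------------  ----  -----------
vmnic0  0000:01:00.0  igb     Up            Up            1000  Full    00:00:00:00:00:01  1500  Intel Corporation I350 Gigabit Network Connection
vmnic1  0000:04:00.0  igbn    Up            Down             0  Half    00:00:00:00:00:02  1500  Intel Corporation I350 Gigabit Network Connection
"""

if __name__ == "__main__":
    for nic, driver in sorted(drivers_by_nic(SAMPLE).items()):
        print(nic, driver)
```

If any vmnic still reports igbn after the change, the blacklist has probably not taken effect for that port.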
Intel have released a new igbn driver; please see the link below. Read through the release notes and see if you can test it on one of your ESXi hosts.
You will also need to upgrade the firmware on the Intel cards to ensure interoperability with this driver version.
Give this a crack and hopefully it fixes your issue. Otherwise, wait for vSphere 6.7 Update 3, as VMware is changing the way networking is handled.
Intel i350 Driver igbn 1.4.10 https://my.vmware.com/web/vmware/details?downloadGroup=DT-ESXI67-INTEL-IGBN-1410&productId=742
Keep us posted with results if you do try the above driver.
We have seen the same PSOD about 3-4 times since April on a Dell PowerEdge R740xd. Just applied the update and will keep you posted.
Different HW than OP, but I am also seeing something similar on 6.7u2. Would like to be kept abreast of any developments/fixes.
I posted full logs and additional detail of my error on Reddit: https://www.reddit.com/r/vmware/comments/dan3k3/pf_exception_14_in_world_2481950vmnic00tx_ip/
The system has not PSOD'ed since we applied the patch, for 20 days straight now. But in the past we had already seen uptimes of up to 40 days before it crashed. So it's looking good so far, but I cannot say with 100% certainty that it's fixed for good.