VMware Cloud Community
wolf
Enthusiast

E1000 virtual nic on Win 2003 R2 causing PSOD on ESXi 5.1 (PF Exception 14)

Hello,

We migrated a VM running Windows Server 2003 R2 64-bit from Hyper-V to ESXi 5.1 (build 1117900) using VMware Converter for the V2V migration, which configured an Intel E1000 as the virtual NIC (with the Microsoft driver inside the VM).

The migration went smoothly and the machine ran fine for more than a month, but only under light load.


As soon as the machine came under some load, with a few network transfers, we started getting PSODs like the following on the ESX node:

[Screenshot: psod.png]

After some reading in forums, where this issue seemed more related to vmxnet3 than to E1000 (in fact, many posts suggested switching to E1000), we did the following based on what we found:

1) Upgraded ESXi 5.1 to the latest build (1157734; this was done without any specific indication that it would fix the issue)

2) Installed the Intel drivers (Intel PROSet) instead of the Microsoft ones

3) Disabled TCP offloading in the Intel driver settings

I will let you know whether the node runs more stably, but in the meantime I would welcome any suggestions.

Unfortunately, no ESX kernel dump was captured, so there is no point in opening a support call with VMware at present.

26 Replies
cryohead
Contributor

Disabling RSS resolved the issue for me (no crash even after a week), but it hurt performance: network traffic was no longer multi-threaded but bound to a single thread, causing random network lag spikes. The true solution for me was to move to VMXNET3.

bertus02
Contributor

I was seeing the identical issue once or twice a day, with one of our hosts hitting a PSOD and bringing down all the guests. I initially thought it was a host issue, but the crashes kept occurring even after a vMotion to a different host. I tracked down the cause and found it was indeed the E1000 NIC on a Server 2012 R2 guest. I changed the virtual NIC to vmxnet3 and reconfigured the static IP, and we haven't had an issue in three days. In my 5+ years of working with ESX/ESXi I have never seen a guest crash an entire host! I guess there is a first for everything.
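
For anyone auditing many guests before switching adapters, here is a minimal Python sketch that scans the text of a .vmx file for E1000-family NICs. It assumes the adapter type is recorded in the usual `ethernetN.virtualDev` entry; the function name `find_e1000_adapters` and the sample contents are made up for illustration, and an adapter with no `virtualDev` line is not reported.

```python
import re

# Adapter types from the E1000 family, as they appear in a .vmx file.
E1000_FAMILY = {"e1000", "e1000e"}

# Matches lines such as: ethernet0.virtualDev = "e1000"
_VIRTUAL_DEV = re.compile(
    r'^\s*(ethernet\d+)\.virtualDev\s*=\s*"([^"]*)"\s*$',
    re.IGNORECASE,
)


def find_e1000_adapters(vmx_text):
    """Return (adapter, type) pairs for NICs that use an E1000-family device."""
    hits = []
    for line in vmx_text.splitlines():
        match = _VIRTUAL_DEV.match(line)
        if match and match.group(2).lower() in E1000_FAMILY:
            hits.append((match.group(1), match.group(2)))
    return hits


# Hypothetical .vmx excerpt, for illustration only.
sample_vmx = '''
displayName = "example-guest"
ethernet0.present = "TRUE"
ethernet0.virtualDev = "e1000"
ethernet1.present = "TRUE"
ethernet1.virtualDev = "vmxnet3"
'''

print(find_e1000_adapters(sample_vmx))
```

Any adapter it reports is a candidate for the vmxnet3 change described above. Since the guest sees a new device after the change, static IP settings have to be re-entered, as noted in the post.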

admin
Immortal

Hi All,

The fix for this PSOD problem has been released for vSphere 5.1 as part of ESXi 5.1 Update 2.

Thanks,

Avinash

bhwong7
Contributor

We just purchased 5.5, where this bug has yet to be fixed. Should we downgrade to 5.1 to avoid it, since we have a huge number of VMs on E1000 that need to be running 24x7?

YANIX
Contributor

Hi,

You can stay on 5.5 and simply install the latest patches. The issue has been fixed in Update 1.
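
To confirm whether a given host already has a fixed build, the version line printed by `vmware -v` on the host (for example `VMware ESXi 5.1.0 build-1157734`) can be compared against the first fixed build for that release. The Python sketch below only illustrates that comparison: the helper names are hypothetical, it assumes build numbers increase with each patch inside a release line, and the fixed build numbers are deliberately left for the caller to look up in VMware's published build listing.

```python
import re

# Matches the version line printed by `vmware -v`, e.g.
# "VMware ESXi 5.1.0 build-1157734"
_VERSION_LINE = re.compile(r"VMware ESXi (\d+\.\d+)\.\d+ build-(\d+)")


def parse_esxi_version(line):
    """Return (release, build) from a `vmware -v` line, e.g. ("5.1", 1157734)."""
    match = _VERSION_LINE.search(line)
    if match is None:
        raise ValueError("unrecognized version line: %r" % line)
    return match.group(1), int(match.group(2))


def is_patched(line, fixed_builds):
    """True if the host's build is at or above the fixed build for its release.

    fixed_builds maps a release such as "5.1" to the first build that
    contains the fix. The caller must supply these numbers; none are
    hard-coded here.
    """
    release, build = parse_esxi_version(line)
    if release not in fixed_builds:
        raise KeyError("no fixed build known for release %s" % release)
    return build >= fixed_builds[release]


# The two builds mentioned earlier in this thread:
print(parse_esxi_version("VMware ESXi 5.1.0 build-1117900"))
print(parse_esxi_version("VMware ESXi 5.1.0 build-1157734"))
```

Calling `is_patched(line, {"5.1": ..., "5.5": ...})` with the build numbers of 5.1 Update 2 and 5.5 Update 1 filled in would then flag hosts that still need patching.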

Jas0nBourne
Contributor

Just wanted to say this thread was greatly informative, since I just experienced this in one of my clusters. I realized it started at random not too long ago, as I began introducing more and more 2012 R2 VMs (each of which had an E1000-family network adapter).

I'm slowly moving to VMXNET3 and hoping it resolves the problem. I'm glad there's an update patch out now too for 5.1 & 5.5.

crescendas
Enthusiast

Do avoid 5.5 U1 if you are on NFS; see the VMware KB article "Intermittent NFS APDs on ESXi 5.5 U1".
