VMware Cloud Community
wolf
Enthusiast

E1000 virtual nic on Win 2003 R2 causing PSOD on ESXi 5.1 (PF Exception 14)

Hello,

We migrated a VM running Windows 2003 R2 64-bit from Hyper-V to ESXi 5.1 (build 1117900) using VMware Converter for the V2V migration, which configured an Intel E1000 as the virtual NIC (with the Microsoft driver inside the VM).

The migration went smoothly and the machine ran fine for more than a month, but only under very light load.


As soon as the machine came under some load, with a few network transfers, we started getting PSODs like the following on the ESX node:

[Screenshot: psod.png]

After some reading in forums, where this issue seemed more related to vmxnet3 than to E1000 (in fact, many suggest switching to E1000), we did the following based on what we found:

1) Upgraded ESXi 5.1 to the latest build (1157734; this was done without any specific indication that it would sort the issue out)

2) Installed the Intel drivers instead of the Microsoft ones (Intel PROSet)

3) Disabled TCP offloading in the Intel drivers.

I will let you know whether the node runs more stably, but in the meantime I am asking for suggestions, if any.

Unfortunately there was no ESX kernel dump, so it is pointless to open a call with VMware at present.

Accepted Solutions
wolf
Enthusiast

I got an official reply from VMware (case 13387941310).

Workaround: switch to vmxnet3 (after install) or disable RSS in the driver

A patch is due in Q1 2014 (ESX 5.1 U2)
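
For anyone who wants to script the first workaround, here is a minimal sketch. It assumes the adapter type is recorded in the VM's .vmx file as an `ethernetN.virtualDev` entry, that the VM is powered off, and that VMware Tools (which provides the vmxnet3 driver) is already installed in the guest. The function name `switch_to_vmxnet3` is hypothetical, and the code only rewrites text; it does not talk to a host.

```python
import re

# Matches the start of lines such as: ethernet0.virtualDev = "e1000"
# (assumption: this is how the adapter type appears in the .vmx file)
_NIC_LINE = re.compile(
    r'^([ \t]*ethernet\d+\.virtualDev[ \t]*=[ \t]*)"e1000e?"',
    re.IGNORECASE | re.MULTILINE,
)

def switch_to_vmxnet3(vmx_text: str) -> str:
    """Return vmx_text with every e1000/e1000e adapter set to vmxnet3."""
    return _NIC_LINE.sub(r'\1"vmxnet3"', vmx_text)

# Illustrative input only, not taken from a real VM.
sample = (
    'displayName = "win2003"\n'
    'ethernet0.virtualDev = "e1000"\n'
    'ethernet0.present = "TRUE"\n'
)
print(switch_to_vmxnet3(sample))
```

Keep a backup of the original .vmx before writing the result back. Windows will see the vmxnet3 adapter as a new device, so any static IP settings have to be reapplied inside the guest.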


26 Replies
SatyS
Hot Shot

This is caused by a page fault.

Have a look at the following KB:

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=102018...

If you find this useful, please mark the answer as correct/helpful.

Regards,
SatyS
http://myvirtuallearning.wordpress.com/

wolf
Enthusiast

Sure, it is a page fault and a bug (a VM should not crash the hypervisor), but that KB is general, and fully debugging this requires a dump (which is not available).

Here specifically the purple screen of death mentions the E1000 NIC and PollRxRing, so I was wondering whether any setting could "mitigate" the issue until a fix is available.

We are setting up a kernel dump target, but until that is available....

admin
Immortal

Hello,

You can actually generate the core dump manually, as per the KB "VMware KB: Manually regenerating core dump files in VMware ESXi/ESX",

and then open a call with VMware. After analysing the dump, they may be able to confirm whether it is a bug and whether a fix is available.

As for the PSOD itself, it seems to be caused by the E1000 NICs. Try changing the adapter type to VMXNET3;

hopefully that should resolve the situation.

Hope this helps.

Thanks,

Avinash

zXi_Gamer
Virtuoso

Hmm.. that's the second time I am seeing this PSOD in 2 days. I think VMware has messed something up in one of the patches. Have a look at this thread: "PF Exception PSOD iSCSI HP Blade BL620 G7"

I am also pinging some VMware folks to let them know about this growing issue.

wolf
Enthusiast

So far I can confirm that after 24 hours no more PSODs have occurred (while we had 2 in the previous 24).

This of course does not mean that the latest build (1157734) sorts out the issue compared to build 1117900 (July): the bug may still be there and merely be mitigated by the Intel drivers (instead of Microsoft's) or by disabling TCP offloading in the VM (an option in the Intel drivers).

That said, we are proceeding with dump collection as per Avinash21's suggestion and will report the PSOD to VMware. I will post the case number here as soon as I have it opened.

wolf
Enthusiast

OK, exactly the same issue on Windows 2012 R2 on an updated vSphere node.

So updating does not sort the issue out; it is the Intel driver OR disabling offloading.

Opening another ticket.

zXi_Gamer
Virtuoso

Hi Wolf, one more thread deals with the same PSOD, and there the issue is happening only with Windows 2012.

wolf
Enthusiast

I opened a support request covering both 2003 and 2012 (it is the same issue). I will update here and on the 2012 thread as well.

zXi_Gamer
Virtuoso

Hmmm.. according to this thread, it is a known issue and will be addressed soon:

Re: Pink screened vSphere 5.1 (twice) by installing Notepad++ on Server 2012 R2

wolf
Enthusiast

Yes, support told me to move to vmxnet3, but this is a workaround, exactly like installing the Intel drivers.

When creating machines with default settings:

Windows 2003 machines get E1000 when P2V or V2V
Windows 2008 machines get E1000
Windows 2012, 8, 8.1, 2012 R2 get E1000E

Since the issue is random, it is like having a time bomb in the vSphere node, because when the issue occurs the whole node crashes (not only the VM).

I believe sorting this out should be a top-priority issue.


mauser_
Enthusiast

Hi,

Same problem here.

I have now changed from e1000 to vmxnet3.

But if I forget to do this for a new VM, there will be a freeze again.

What will the final solution from VMware be (wolf?)

We have the problem on ESXi 5.1 and ESXi 5.5.

Thx

wolf
Enthusiast

A patch is expected in Q1 2014.

For new machines, make a master image, install VMware Tools, change the NIC to vmxnet3 and sysprep it; then any new machine will come with vmxnet3.
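
To catch machines that still slip through with the default adapter, a small audit along these lines may help. This is only a sketch: it assumes you can read each VM's .vmx text and that the adapter type appears as `ethernetN.virtualDev`; the helper name `affected_nics` is hypothetical.

```python
import re
from typing import List

# Adapter types named in this thread as triggering the PSOD.
AFFECTED = {"e1000", "e1000e"}

# Captures the adapter label and its type, e.g. ("ethernet0", "e1000e").
_NIC = re.compile(
    r'^[ \t]*(ethernet\d+)\.virtualDev[ \t]*=[ \t]*"([^"]+)"',
    re.MULTILINE,
)

def affected_nics(vmx_text: str) -> List[str]:
    """List adapter labels whose type is e1000 or e1000e."""
    return [nic for nic, dev in _NIC.findall(vmx_text) if dev.lower() in AFFECTED]

# Illustrative input only, not taken from a real VM.
sample = (
    'ethernet0.virtualDev = "e1000e"\n'
    'ethernet1.virtualDev = "vmxnet3"\n'
)
print(affected_nics(sample))  # prints ['ethernet0']
```

A NIC with no `virtualDev` line falls back to a default type that this sketch does not detect, so an empty result is not proof that a VM is safe.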

mauser_
Enthusiast

Hi Wolf,

This is not an option for me.

New machines are deployed using MDT 2013.

The change from e1000 to vmxnet3 is a good workaround. But is this a known issue at VMware? Is there a link available for this issue?

I am having this issue on ESXi 5.1 and 5.5, but in the past I didn't have any problems with ESXi 5.1....

Thx

admin
Immortal

Hello,

Yes, this is a known problem at VMware, and patches for 5.0 and 5.1 are tentatively expected by January 2014.

Thanks,

Avinash

mauser_
Enthusiast

Hello,

Is this related to problems with PXE boot on ESXi?

PXE boot works well on physical hardware, but not on ESXi virtual machines.

Thx

jfh777
Enthusiast

Disabling RSS is mentioned in the previous posts.  Can anyone confirm that this solves the problem?

Regards.

jfh777
Enthusiast

The answer to my own question is yes: disabling Receive Side Scaling (RSS) on Windows Server 2012 guests with the E1000E adapter stops the PSOD issue I was having. I passed my results along to VMware technical support.
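
One way to script this inside a Windows guest is sketched below. It uses the global `netsh int tcp set global rss=disabled` setting, which is an assumption on my part: VMware's reply says "in driver", which may instead mean the adapter's advanced property. The function names are hypothetical, and the command is only executed when `dry_run` is False, from an elevated prompt in the guest.

```python
import subprocess
from typing import List

def rss_disable_command() -> List[str]:
    """Return the netsh command that turns RSS off globally in the guest."""
    return ["netsh", "int", "tcp", "set", "global", "rss=disabled"]

def disable_rss(dry_run: bool = True) -> List[str]:
    """Run the command unless dry_run is set; always return what was built."""
    cmd = rss_disable_command()
    if not dry_run:
        # Needs administrator rights inside the Windows guest.
        subprocess.run(cmd, check=True)
    return cmd

print(" ".join(disable_rss()))  # prints: netsh int tcp set global rss=disabled
```

Defaulting to a dry run keeps the sketch safe to try on any machine; check the current state with `netsh int tcp show global` before and after.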

Sckoobie
Contributor

I am setting up a new domain on an HP ML350 G8 with ESXi 5.5. The domain consists of an SBS 2011 server (Windows Server 2008 R2) and a second server with Windows Server 2012 and SQL 2012.

Everything works fine, but when I install a VM with Windows 8.1 the server stops with a PSOD like all the others above. Windows 8 works fine too. Is there any update that resolves the problem, or do we have to wait?
