Hello,
we migrated a VM running Windows Server 2003 R2 64-bit from Hyper-V to ESXi 5.1 (build 1117900) using VMware Converter for the V2V migration, which configured an Intel E1000 as the NIC (with the Microsoft driver inside the VM).
The migration went smoothly and the machine ran fine for more than a month, but under very little load.
As soon as the machine came under some load, with some network transfers, we started getting a PSOD on the ESX node, as follows:
After some reading in forums, where this issue seemed more related to vmxnet3 than to e1000 (in fact, many suggest going for e1000...), we did the following based on what we found:
1) Upgraded ESXi 5.1 to latest build (1157734, this was done without any specific indication this could sort the issue out)
2) Installed Intel drivers instead of Microsoft ones (Intel Pro Set)
3) Disabled TCP offloading in Intel Drivers.
I will let you know whether the node runs more stably, but in the meantime I am asking for suggestions, if any.
Unfortunately no ESX kernel dump was produced, so it is pointless to open a call with VMware at present.
I got an official reply from VMware, case 13387941310
Workaround: switch to vmxnet3 (after install) or disable RSS in the driver
A patch is due in Q1 2014 (ESX 5.1 U2)
This is caused by a page fault.
Just have a look at the following KB
Regards,
SatyS
If you find this useful, mark the answer as correct/helpful
Sure, it is a page fault and a bug (a VM should not crash the hypervisor), but that KB is general, and a dump is required to debug this fully (which is not available).
Here specifically the purple screen of death mentions the E1000 NIC and PollRxRing... so I was wondering if any setting could "mitigate" the issue until a fix is available.
We are setting up a kernel dump target, but until that is available....
Hello,
you know you can actually generate the core dump manually, as per the VMware KB "Manually regenerating core dump files in VMware ESXi/ESX",
and then open a call with VMware. After analysing the dump, they might confirm whether it is a bug or whether a fix is available.
And to add to the PSOD, it seems to be caused by the E1000 NICs. Try changing the adapter type to VMXNET3
and hopefully that should resolve the situation.
Hope this helps.
Thanks,
Avinash
Hmm.. that's the second time I am seeing this PSOD in 2 days. I think something has been messed up by VMware in one of the patches. Have a look here: PF Exception PSOD iSCSI HP Blade BL620 G7
I am also pinging some VMware folks to let them know about this growing issue.
Up to now I can confirm that after 24 hrs no more PSODs have occurred (while we had 2 in the previous 24).
This of course does not mean that the latest version (1157734) sorts out the issue compared to version 1117900 (July): the bug may still be there and be mitigated by the Intel drivers (instead of Microsoft's) or by disabling TCP offloading in the VM (an option in the Intel drivers).
That said, we are proceeding as per Avinash21's suggestion to collect a dump and report the PSOD to VMware... I will post the case number here as soon as I have it opened.
OK, exactly the same issue on Windows 2012 R2 on the updated vSphere node.
So updating does not sort the issue out; it is the Intel driver OR disabling offloading.
Opening another ticket.
Hi Wolf, one more thread deals with the same PSOD, and there the issue is happening only with Win2012.
I did open a support request concerning both 2003 and 2012 (it is the same issue): I will update here and on the 2012 thread as well.
Hmmm.. according to this thread, it is a known issue and will be addressed soon.
Re: Pink screened vSphere 5.1 (twice) by installing Notepad++ on Server 2012 R2
Yes, support told me to move to vmxnet3... but this is a workaround, exactly like installing the Intel drivers.
When creating machines with default settings:
Windows 2003 machines get E1000 when P2V or V2V
Windows 2008 machines get E1000
Windows 2012, 8, 8.1, 2012 R2 get E1000E
Since the issue is random, it is like having a time bomb in the vSphere node, since when the issue occurs the whole node crashes (not only the VM).
I believe sorting this out should be a top priority.
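Given the default adapter types listed above, it may help to check which existing VMs still carry an E1000 or E1000E adapter. What follows is only a minimal sketch, assuming the VM's .vmx file declares each adapter with an `ethernetN.virtualDev = "..."` line; the function name `find_affected_nics` is hypothetical, and adapters without an explicit `virtualDev` line are not reported.

```python
import re

# Matches lines such as: ethernet0.virtualDev = "e1000"
_NIC_LINE = re.compile(
    r'^\s*(ethernet\d+)\.virtualDev\s*=\s*"([^"]*)"\s*$', re.IGNORECASE
)

# Adapter types named in this thread as triggering the PSOD.
AFFECTED_TYPES = {"e1000", "e1000e"}


def find_affected_nics(vmx_text):
    """Return (adapter, type) pairs whose virtual device is E1000 or E1000E."""
    hits = []
    for line in vmx_text.splitlines():
        match = _NIC_LINE.match(line)
        if match and match.group(2).lower() in AFFECTED_TYPES:
            hits.append((match.group(1).lower(), match.group(2).lower()))
    return hits


if __name__ == "__main__":
    sample = 'ethernet0.virtualDev = "e1000"\nethernet1.virtualDev = "vmxnet3"\n'
    print(find_affected_nics(sample))
```

The sketch only reads text, so it can be run against a copy of the .vmx file without touching the VM.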
Hi,
same problem here.
I have now changed from e1000 to vmxnet3.
But when I forget to do so for a new VM, there will be a freeze again.
What will be the final solution from VMware (wolf?)
The problem occurs here on ESXi 5.1 and ESXi 5.5.
Thx
A patch is expected in Q1 2014
For new machines, make a master image, install VMware Tools, change the NIC to vmxnet3 and sysprep it; then any new machine will come with vmxnet3....
Hi Wolf,
this is not an option for me.
New machines will be deployed using MDT 2013.
The change from e1000 to vmxnet3 is a good workaround. But is this a known issue at VMware? Is there a link available for this issue?
I am having this issue on ESXi 5.1 and 5.5. But in the past I didn't have any problems with ESXi 5.1....
Thx
Hello,
Yes sir, this is a known problem at VMware, and patches for 5.0 and 5.1 are tentatively expected by Jan 2014.
Thanks,
Avinash
Hello,
Is there a relation to the problems with PXE boot on ESXi?
PXE boot works well on physical hardware, but not on ESXi machines.
Thx
Disabling RSS is mentioned in the previous posts. Can anyone confirm that this solves the problem?
Regards.
The answer to my own question is yes - disabling Receive Side Scaling (RSS) on Windows Server 2012 hosts with the E1000E adapter stops the PSOD issue I was having. Passed my results along to VMware technical support.
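For anyone wanting to script the RSS change, here is a rough sketch. It assumes that turning RSS off globally in the guest with `netsh int tcp set global rss=disabled` is an acceptable stand-in for the per-adapter driver option; the thread does not say which of the two was actually used. The function names are hypothetical, and by default the script only prints the command.

```python
import subprocess


def rss_disable_command():
    """Build the netsh invocation that turns off RSS globally in the guest."""
    return ["netsh", "int", "tcp", "set", "global", "rss=disabled"]


def disable_rss(dry_run=True):
    """Print the command, and run it only when dry_run is False."""
    command = rss_disable_command()
    print(" ".join(command))
    if dry_run:
        return None
    # Must be run from an elevated prompt inside the Windows guest.
    return subprocess.run(command, check=True).returncode


if __name__ == "__main__":
    disable_rss(dry_run=True)
```

Calling `disable_rss(dry_run=False)` inside the guest would actually apply the setting; the current state can be checked with `netsh int tcp show global`.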
I am installing a new domain on an HP ML350G8 with ESXi 5.5. The domain is an SBS2011 (Windows Server 2008 R2) plus a second server with Windows Server 2012 w/ SQL 2012.
All works fine, but when I install a VM with Windows 8.1 the server stops with a PSOD like all the others above. Windows 8 works fine too. Is there any update to resolve the problem, or do we have to wait?