yzennezy
Enthusiast

Server Crash/Reboot: vnetflt.sys

Hi,

Has anyone seen or heard of issues using the VMXNET3 driver with Windows Server 2003 R2?

We're running ESXi 5.5 with a mix of Windows and Linux guests. We have migrated everything from VMware Server 2 using the VMware vCenter Converter to bring everything up to VM version 10. VMware Tools have been installed.

Since the migration, the Windows guests have been very unstable:

* Windows 7 guest has 'locked' or 'frozen' once and we had to reboot the host to get it back up and running

* Windows 2003 Server R2 guest has also 'frozen' requiring a reboot of the host

* A different Windows 2003 R2 guest is crashing and rebooting with a TCPIP issue.

All three guests are configured with the VMXNET3 adapter type, and all three have experienced a 'lockup'. Last night both Windows Server 2003 R2 guests crashed and rebooted with STOP error 7f.

I have analysed the crash memory dump for both Windows Server 2003 R2 guests and they both point the finger directly at vnetflt.sys. The guests are configured with Adapter Type VMXNET3 and the underlying hardware on the ESXi host is an Intel 82599EB 10-Gigabit SFI/SFP+. The guests' interfaces come up using the vmxnet3 Ethernet Adapter driver version 1.5.1.0 (vmxnet3n51x86.sys) and report 10Gbps speed. Also the check-box for power management is on ('Allow the computer to turn off this device to save power' option). Perhaps that should be switched off?

Does anyone know of problems with vnetflt.sys? Is it a component of VMXNET3? Should I be using a different driver?


Any clues are much appreciated.

Kind regards,
Tom

Accepted Solutions
chistv
Contributor

There is a memory leak caused by the VMware Tools VMCI driver component VMware vShield Endpoint TDI Manager (vnetflt.sys).

The problem is still there after updating to ESXi 5.5 U1:

vnetflt.sys 5.5.0.0 build-1191373

Solution: uninstall this driver by running the VMware Tools setup and choosing Change.

I detected the leak with poolmon.exe from the Windows Support Tools (press p until Type = Nonp, then press b to sort by Bytes). The tag VNet belongs to the vnetflt.sys driver:

Tag   Type   Allocs           Frees            Diff     Bytes              Per Alloc
VNet  Nonp   486393 ( 875)    347795 ( 625)    138598   5561088 ( 10000)   40
MFEm  Nonp   251716 (   0)    251683 (   0)    33       5245248 (     0)   158946

hope this helps
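
For anyone who wants to check the poolmon figures, here is a minimal Python sketch that parses the two rows exactly as posted and derives the numbers that point at a leak. It assumes the column layout shown in this post, with the value in parentheses being the change since the previous poolmon refresh. The names PoolTag, parse_row, net_growth and bytes_per_outstanding are made up for illustration and are not part of poolmon or VMware Tools.

```python
import re
from typing import NamedTuple


class PoolTag(NamedTuple):
    tag: str
    pool: str
    allocs: int
    allocs_delta: int   # change in Allocs since the last refresh
    frees: int
    frees_delta: int    # change in Frees since the last refresh
    diff: int           # outstanding allocations (Allocs - Frees)
    bytes_used: int
    bytes_delta: int    # change in Bytes since the last refresh
    per_alloc: int


# Assumed row layout: Tag Type Allocs (d) Frees (d) Diff Bytes (d) PerAlloc
ROW = re.compile(
    r"^\s*(\S+)\s+(\S+)"
    r"\s+(\d+)\s*\(\s*(-?\d+)\)"
    r"\s+(\d+)\s*\(\s*(-?\d+)\)"
    r"\s+(\d+)"
    r"\s+(\d+)\s*\(\s*(-?\d+)\)"
    r"\s+(\d+)\s*$"
)


def parse_row(line: str) -> PoolTag:
    """Parse one poolmon data row into a PoolTag."""
    match = ROW.match(line)
    if match is None:
        raise ValueError(f"unrecognised poolmon row: {line!r}")
    tag, pool, *numbers = match.groups()
    return PoolTag(tag, pool, *map(int, numbers))


def net_growth(row: PoolTag) -> int:
    """Outstanding allocations gained during the last refresh interval."""
    return row.allocs_delta - row.frees_delta


def bytes_per_outstanding(row: PoolTag) -> int:
    """Average size of an outstanding allocation (0 if none outstanding)."""
    return row.bytes_used // row.diff if row.diff else 0


VNET_LINE = "VNet Nonp     486393 ( 875)    347795 ( 625)   138598 5561088 ( 10000)     40"
MFEM_LINE = "MFEm Nonp     251716 (   0)    251683 (   0)       33 5245248 (     0) 158946"

for line in (VNET_LINE, MFEM_LINE):
    row = parse_row(line)
    print(
        row.tag,
        "outstanding:", row.allocs - row.frees,
        "growth per refresh:", net_growth(row),
        "bytes per outstanding alloc:", bytes_per_outstanding(row),
    )
```

In this snapshot VNet gains allocations faster than it frees them on every refresh, while MFEm holds a similar number of bytes but is not growing. A tag that keeps growing like that across many refreshes is the pattern to look for.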

22 Replies
Josh26
Virtuoso

The obvious question here is going to be, is your server AND network card on the HCL?

yzennezy
Enthusiast

Hi Josh,

Thanks for your feedback.

I had a look and found this:

VMware Compatibility Guide: I/O Device Search

It lists the following properties for a compatible device:

VID:    8086
DID:    10FB
SVID:    8086
SSID:    0000


I'm not sure how to match these on the host but I've tried the following:

# vmkchdev -l | grep -i 10fb
0000:81:00.0 8086:10fb 8086:0003 vmkernel vmnic2
0000:81:00.1 8086:10fb 8086:0003 vmkernel vmnic3

# esxcfg-nics -l | grep 82599EB
vmnic2    0000:81:00.00 ixgbe       Up   10000Mbps Full   90:e2:ba:3f:cf:30 1500   Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection
vmnic3    0000:81:00.01 ixgbe       Down 0Mbps     Half   90:e2:ba:3f:cf:31 1500   Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection


I'm assuming that I have the following:

VID:    8086
DID:    10FB
SVID:    8086
SSID:    0003

If I've got that right (please confirm), what is the significance of the SSID difference (HCL SSID=0000, my SSID=0003)?
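
The comparison above can be sketched in a few lines of Python. This is only an illustration: it assumes each `vmkchdev -l` line has the fields PCI address, VID:DID, SVID:SSID, owner and device alias, as in the two lines pasted above, and the names PciIds, parse_vmkchdev_line and mismatches are hypothetical helpers, not part of any VMware tool. It reports which IDs differ from the compatibility guide entry but says nothing about whether the difference matters.

```python
from typing import List, NamedTuple, Tuple


class PciIds(NamedTuple):
    vid: str
    did: str
    svid: str
    ssid: str


def parse_vmkchdev_line(line: str) -> Tuple[str, PciIds]:
    """Split one `vmkchdev -l` line into (device alias, PciIds).

    Assumed field order: address, VID:DID, SVID:SSID, owner, alias.
    """
    fields = line.split()
    vid, did = fields[1].lower().split(":")
    svid, ssid = fields[2].lower().split(":")
    return fields[4], PciIds(vid, did, svid, ssid)


def mismatches(found: PciIds, listed: PciIds) -> List[str]:
    """Names of the ID fields that differ between host and guide entry."""
    return [
        name
        for name in PciIds._fields
        if getattr(found, name) != getattr(listed, name)
    ]


# Values from the compatibility guide entry quoted above, lower-cased.
HCL_ENTRY = PciIds(vid="8086", did="10fb", svid="8086", ssid="0000")

HOST_LINES = [
    "0000:81:00.0 8086:10fb 8086:0003 vmkernel vmnic2",
    "0000:81:00.1 8086:10fb 8086:0003 vmkernel vmnic3",
]

for host_line in HOST_LINES:
    alias, ids = parse_vmkchdev_line(host_line)
    print(alias, ids, "differs in:", mismatches(ids, HCL_ENTRY))
```

For both ports the only field reported as different is the SSID.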

Regards,

Tom

0 Kudos

Have you checked the supported GOS list at Intel® Server Adapters - Guest Operating System Support for SR-IOV in VMware vSphere* 5.1 ...

john23
Commander
To get the VID, DID and related IDs from your system, run this command: lspci -vvvvvv

It shows the IDs, which you can then match against the compatibility guide.

-A

john23
Commander


The esxcfg-info command also provides the VID and DID information.

yzennezy
Enthusiast

Hi,

Thanks for all your feedback. Sorry for not responding for so long, I have been unwell.

I changed the guest's configured network adapter type to 'Flexible' and installed the device as 'VMware Accelerated AMD PCNet Adapter', 'Driver Version 2.2.0.0'. The adapter shows up in the guest at 1Gbps speed. Since this change, it has been more stable but I still get the occasional 'lock-up' or crash (stop 7f).

Apart from the Sub-Device Id (mine is 0x0003), everything else matches the IDs listed for the supported card. Is this purely a guest compatibility issue?

Kind regards,

Tom

esxcfg-info shows the following (it's dual port so there are two entries)

            \==+PCI Device :
               |----Segment.........................................0x0000
               |----Bus.............................................0x81
               |----Slot............................................0x00
               |----Function........................................0x00
               |----Runtime Owner...................................vmkernel
               |----Has Configured Owner............................false
               |----Configured Owner................................
               |----Vendor Id.......................................0x8086
               |----Device Id.......................................0x10fb
               |----Sub-Vendor Id...................................0x8086
               |----Sub-Device Id...................................0x0003
               |----Vendor Name.....................................Intel Corporation
               |----Device Name.....................................82599EB 10-Gigabit SFI/SFP+ Network Connection
               |----Device Class....................................512
               |----Device Class Name...............................Ethernet controller
               |----PIC Line........................................11
               |----Old IRQ.........................................11
               |----Vector..........................................56
               |----PCI Pin.........................................0
               |----Spawned Bus.....................................0
               |----Flags...........................................513
               \==+BAR Info :
                  \==+BAR0 :
                     |----Type......................................0x00000003
                     |----Address...................................0x00000000fb880000
                     |----Size......................................524288
                     |----Flags.....................................0x0000000c
                  \==+BAR1 :
                     |----Type......................................0x00000004
                     |----Address...................................0
                     |----Size......................................0
                     |----Flags.....................................0
                  \==+BAR2 :
                     |----Type......................................0x00000001
                     |----Address...................................0x000000000000f020
                     |----Size......................................32
                     |----Flags.....................................0x00000001
                  \==+BAR3 :
                     |----Type......................................0
                     |----Address...................................0
                     |----Size......................................0
                     |----Flags.....................................0
                  \==+BAR4 :
                     |----Type......................................0x00000003
                     |----Address...................................0x00000000fb904000
                     |----Size......................................16384
                     |----Flags.....................................0x0000000c
                  \==+BAR5 :
                     |----Type......................................0x00000004
                     |----Address...................................0
                     |----Size......................................0
                     |----Flags.....................................0
               |----Module Id.......................................4120
               |----Chassis.........................................0
               |----Physical Slot...................................5
               |----VmKernel Device Name............................vmnic2
               |----Slot Description................................Chassis slot 5.00
               |----Passthru Capable................................true
               |----Parent Device...................................PCI 0:128:1:0
               |----Dependent Device................................PCI 0:129:0:0
               |----Reset Method....................................1
               |----FPT Shareable...................................true
            \==+PCI Device :
               |----Segment.........................................0x0000
               |----Bus.............................................0x81
               |----Slot............................................0x00
               |----Function........................................0x01
               |----Runtime Owner...................................vmkernel
               |----Has Configured Owner............................false
               |----Configured Owner................................
               |----Vendor Id.......................................0x8086
               |----Device Id.......................................0x10fb
               |----Sub-Vendor Id...................................0x8086
               |----Sub-Device Id...................................0x0003
               |----Vendor Name.....................................Intel Corporation
               |----Device Name.....................................82599EB 10-Gigabit SFI/SFP+ Network Connection
               |----Device Class....................................512
               |----Device Class Name...............................Ethernet controller
               |----PIC Line........................................10
               |----Old IRQ.........................................10
               |----Vector..........................................57
               |----PCI Pin.........................................0
               |----Spawned Bus.....................................0
               |----Flags...........................................513
               \==+BAR Info :
                  \==+BAR0 :
                     |----Type......................................0x00000003
                     |----Address...................................0x00000000fb800000
                     |----Size......................................524288
                     |----Flags.....................................0x0000000c
                  \==+BAR1 :
                     |----Type......................................0x00000004
                     |----Address...................................0
                     |----Size......................................0
                     |----Flags.....................................0
                  \==+BAR2 :
                     |----Type......................................0x00000001
                     |----Address...................................0x000000000000f000
                     |----Size......................................32
                     |----Flags.....................................0x00000001
                  \==+BAR3 :
                     |----Type......................................0
                     |----Address...................................0
                     |----Size......................................0
                     |----Flags.....................................0
                  \==+BAR4 :
                     |----Type......................................0x00000003
                     |----Address...................................0x00000000fb900000
                     |----Size......................................16384
                     |----Flags.....................................0x0000000c
                  \==+BAR5 :
                     |----Type......................................0x00000004
                     |----Address...................................0
                     |----Size......................................0
                     |----Flags.....................................0
               |----Module Id.......................................4120
               |----Chassis.........................................0
               |----Physical Slot...................................5
               |----VmKernel Device Name............................vmnic3
               |----Slot Description................................Chassis slot 5.01
               |----Passthru Capable................................true
               |----Parent Device...................................PCI 0:128:1:0
               |----Dependent Device................................PCI 0:129:0:1
               |----Reset Method....................................1
               |----FPT Shareable...................................true

sofasurfer
Contributor

Hi guys,

We're also experiencing a memory leak with this driver on Windows Server 2012, though we're running build 1397926. Does anyone know if there is a fix for this other than uninstalling the driver?

venugs
Contributor

Hi,

We had a similar issue; it is definitely related to the VMCI driver. A temporary workaround to stop the guest OS from crashing repeatedly is to boot into Safe Mode with Networking and disable the VMCI driver in Device Manager.

I am not sure about the adverse impacts of disabling it, but the guest OS recovered, and we have migrated it to a different host running an earlier version, 5.1, where things were fine. (Not to mention that we had a critical application that needs 2 E1000 and 2 VMXNET3 network adapters.)

hope it helps someone, somewhere.

Regards,

Venu

Bleeder
Hot Shot

Well, you could uninstall the 5.5 tools and install the latest 5.1 tools if you really need the vShield drivers.  Otherwise, there's no fix for the 5.5 tools yet.

For reference: http://kb.vmware.com/kb/2077302

Bleeder
Hot Shot

Here's another VMware KB article regarding this problem:

http://kb.vmware.com/kb/2081616

DrNickT
Contributor

I think I may be having this same issue. I have some 2003 VMs; when they first boot up they say that vnetflt will not start. They run for a while and then go completely unresponsive. There are no errors in the event log; they just need to be power cycled to get back online.

I'm running Cisco UCS B200 M3 Blades, ESXi 5.1 1612806

Wh33ly
Hot Shot

On Windows Server 2008 we had the same problems on a few machines where the full VMware Tools install was done, which includes the vShield drivers.

We use ESXi version 5.1

I noticed that after uninstalling VMware Tools, the VMware vShield driver is still present and not completely removed.

I removed it manually:

Open Device Manager -> View -> Show Hidden Devices

An item named Non-Plug and Play Drivers now appears. When you expand it, you find the vNetFilter driver, which is the one that causes the problem. You can uninstall it manually here (right-click, Uninstall) to be sure it is no longer installed.

My workflow

- Note IP settings (as vmxnet driver will also be removed which causes network disconnection)

- Remove VMware tools

- Remove vNetFilter in Device manager

- Reboot

- Install VMware Tools automatic or typical

- Reboot

- Reconfigure NIC

- Reboot (optional) because some application services require an active connection when starting etc.

I haven't found a similar workaround for Windows 2012.

So, to wrap it up:

Option 1) Remove the leaky driver manually or through the VMware Tools Change option.

VMware KB: Removing modules for VMware Tools during unattended install or upgrade

Option 2) Leave the leaky driver in place and update VMware Tools to a version in which the driver is patched. This is what Bleeder mentioned.

For more information, see VMware ESXi 5.1, Patch Release ESXi510-201404001 (2070666), which fixes this leak.

Option 3) Option 1, then install "typical" VMware Tools without the vShield drivers.

Currently there is no resolution for ESXi 5.5.

Because we don't use anything from vShield I preferred option 3 to be sure no non-essential components are installed which can cause problems in the future.

If you use vShield drivers, it's best to upgrade as soon as possible.

yzennezy
Enthusiast

Thanks Christv,

Solution: Uninstall this driver via vmware tools setup change

Uninstalling the VMware Tools VMCI driver component VMware vShield Endpoint TDI Manager fixed the problem.

Regards,

Tom

Cyberfed27
Hot Shot

Just happened to us on a freshly built 2012 R2 server.

We use Trend Micro for agent-less AV.

The DMP file points to the vmware tools driver as the culprit.

We are going to try uninstalling/reinstalling the VMware tools.

We have never had any issues ever until now.

Running ESXi 5.1, build 19000470

Matt_B1
Enthusiast

Did the VMware tools fix your issue?  We are on ESXi v5.5 build 1892794 and noticed this issue on a small subset of VMs that had their VMware tools upgraded.

capsoc
Contributor

We are no longer seeing memory leak issues, but since updating VMware Tools to 9.4.6 (ESXi 5.5 1892794) a small subset of servers has been crashing with a black screen of death. We've had to remove Trend Deep Security protection and the VMware Tools VMCI drivers, which prevents the black screen of death. I hope VMware addresses this ASAP in a forthcoming patch or bug fix.

sofasurfer
Contributor

Hi Guys,

Check out VMware KB: Windows virtual machine installed with vShield Endpoint Thin Agent (vsepflt.sys) and vShi...

We were affected by the memory leak in the vShield drivers and upgraded to the latest Tools, which was supposed to fix the issue, but servers then started to reboot randomly or become totally unresponsive. We contacted VMware, mentioned the KB article above, and were given a new set of vShield Endpoint drivers (vsepflt.sys and vnetflt.sys): file version 5.5.2, product version 5.5.2 build-1904019. We've rolled this out to quite a few affected machines and haven't had any problems since. VMware Tech Support said they were looking at rolling this into a future update, but not the next immediate one.

Hope this helps.

Matt_B1
Enthusiast

A couple things worked for us in fixing the issue with vnetflt.sys:

  • For non-ESXi v5.5 VMs, uninstalling VMware tools, reinstalling but using custom option and deselecting the vShield drivers.
  • For ESXi v5.5 VMs, upgrading the VM to virtual hardware version 10.