VMware Communities
supdood
Contributor

VMware Workstation Pro crashing host?

Since the beginning of March, I’ve been experiencing frequent crashes on my PC while running VMware. It had been working fine for over a year and I don’t update anything, so I’m not sure what’s causing the issue. Maybe some stealth Windows update?

At first it was a random freeze and reboot. I’ve since tried updating to Workstation 16, and now I get BSODs with the error message NMI Hardware Failure. Sometimes the BSOD mentions vmx86.sys.

Is anyone else having the same issue, or does anyone know what’s causing it? I’m on Windows 10 Pro.

Thank you!

24 Replies
RaSystemlord
Expert

This seems to be something related to Windows 10 or hardware. I assume that your Host is Windows 10.

If you look into the link, you will find one interpretation of the matter:

https://windowsreport.com/nmi_hardware_failure-windows-10

I would add to this discussion the following:

1. You state that you haven't updated anything. Running without updates is itself one of the problems mentioned in the list linked above.

On the other hand, I don't know what you really mean by that, because:
- you cannot skip all updates in Windows 10 by regular means; it will always install security updates
- if you have disabled updates in the registry, well, see the list linked above
- every software install also updates your system (in terms of some OTHER software using the same DLLs), in a way that is unpredictable in the general case

2. There is one additional problem, which may SEEM like an application problem: a memory problem.

Windows uses (or at least used to use) memory in sequential order. Thus an application that uses more memory than is normally in use will crash the system if that part of the memory is faulty. This looks application-specific, but it is not.

You can use (almost) any Linux distro media (like Ubuntu or Puppy Linux) to start a memory test. If it finds a problem, there surely is a problem. If it doesn't find one, there still might be one, because testing memory reliably is much harder, they say. Anyway, checking the memory banks (if you have a workstation where chips may come loose) AND changing the order of the memory chips is another way to find out whether there is a problem. Using another application that reserves lots of memory is a third possible way to test memory.
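The third idea, an ordinary application that reserves a lot of memory, could be sketched roughly as below. This is only a minimal sketch under assumptions: the function name `check_memory`, the block sizes and the fill pattern are all hypothetical choices, and because the OS decides which physical memory backs each allocation, a clean run proves nothing. It is not a substitute for a dedicated memory tester booted from separate media.

```python
def check_memory(total_mb=256, block_mb=16, pattern=0xA5):
    """Allocate blocks, fill them with a byte pattern, and read them back.

    Returns the indexes of blocks whose contents no longer match the pattern.
    All parameter values are illustrative assumptions, not recommendations.
    """
    block_size = block_mb * 1024 * 1024
    # Reserve the memory first so that every block is held at the same time.
    blocks = [bytearray([pattern]) * block_size
              for _ in range(total_mb // block_mb)]
    expected = bytes([pattern]) * block_size
    # Read every block back and compare it with the pattern that was written.
    return [i for i, block in enumerate(blocks) if block != expected]


if __name__ == "__main__":
    bad = check_memory()
    print("mismatching blocks:", bad if bad else "none")
```

Raising `total_mb` toward the amount of RAM the VMs normally use would make the comparison with the crashing workload a little closer, at the cost of heavy paging if the value is set too high.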

So, in the end, there are quite a few possibilities to check, but some of these things are very quick to test. If there is something VMware-specific to check, I hope somebody else knows about it. Typically, a well-known official application does not crash a healthy system like that; it is very uncommon.

Deathspawner
Enthusiast

I have been experiencing this same issue, although it seems to really only creep up while I am gaming if VMs hum away in the background. I feel like some of it relates to a Windows update that came out in December (KB4592438), as it made changes to virtualization. Not long after that update was released, I moved from one SSD to another in a vain attempt to fix the issue, not realizing at the time that my issue was virtualization-related.

If I stress test my PC's CPU and memory, no issues arise, even after hours of running the stress-test tools. But, in the past month, I've blue-screened five times while playing a game because I forgot to shut down my VMs first. Once I close Workstation, I can freely game without the risk of a blue screen.

This is annoying me to the point that I'm considering switching PCs and simply hoping I don't encounter the issue there. I too upgraded from Workstation 15 to 16 and began seeing that same NMI Hardware Failure error. It's getting frustrating.

RaSystemlord
Expert

Given the comments here and the rather large number of members having the same problem, this raises a question.

Is the reason this hasn't been solved in the application that it is a Windows 10 regression in its system files? If so, solving it at the application level might be rather difficult.

Similar things have happened with Windows 10 versions elsewhere.

supdood
Contributor

I appreciate the response. I've been using VirtualBox instead for the past week and haven't encountered a crash, so I'm almost certain it's a VMware-specific issue and has nothing to do with my hardware. Hopefully this somehow resolves itself with a future Windows update, as I'm not getting the same performance from VirtualBox that VMware provides.

e2489
Contributor

Same issue here for the past month. Last time I updated anything beforehand was December, and this just started at the end of February / beginning of March. Updated recently to see if it would solve the problem, and it did not.

I've seen NMI_HARDWARE_FAILURE and PFN_LIST_CORRUPT, with the VMware kernel driver (vmx86.sys) being shown as the culprit. If someone could look into this, it would be greatly appreciated, because it's getting pretty frustrating.

steve44
Contributor

Still having this problem as well. So ridiculous.

scott28tt
VMware Employee

Has anyone in this thread submitted a support request?

VMTN is a user community forum, not a communication channel to the support team.

 



Although I am a VMware employee, I contribute to VMware Communities voluntarily (i.e. not in any official capacity).

Chuckle123
Contributor

I first noticed this issue early this year. I could not pinpoint it. After the 3rd or 4th time it happened, I began to see a pattern. It definitely occurs if VMware is open while gaming. I am going to open a ticket, but wanted to check whether someone has already opened one. Thanks.

ce

Chuckle123
Contributor

Scott,

I purchased VMware Workstation Pro back in November, but now it won't let me submit a ticket. That is very odd, as you would figure they would want bug reports.

ce

RaSystemlord
Expert

I'm not sure why bug reporting is suggested in the first place, since this is the Player discussion forum and Player does not include Support. To my understanding, this IS the place for getting information on bug fixes and raising interest in fixing them.

As commented before, IF this is a Windows 10 regression (nobody has acknowledged this), it might be very hard for the application to fix it. The reason is that if the problem is in Windows system files, an application cannot start changing or modifying them, not in the Windows architecture. There was a similar problem with Common Dialog that took Microsoft a very long time to fix; some application functionality was impossible without rewriting the application. An obvious way to get around this in VMware is to use a Linux host. It's neither expensive nor time-consuming.

Chuckle123
Contributor

So does that mean that VMware is going to change their recommended setup to list only Linux as the host OS? What you threw out there is not acceptable.

Deathspawner
Enthusiast

Out of curiosity, what platform are you running? I doubt it's directly related to a specific platform, but I had the issue on an Intel Core i9-7980XE rig. I just built a new rig with AMD Ryzen 9 5900X, and have not experienced a crash while playing the same game (Apex Legends) with my VMs running. That's a relief for me, but I am not sure what it is about other configurations that triggers this bug.

For all I know, the problem may have gone away simply because I installed Windows fresh, but it's truly hard to say for sure. To that end, when I experienced crashes, it ONLY happened while playing Apex Legends. I played Forza Horizon 4 and Division 2 a lot with the same VMs running, and didn't experience a single crash. It's such a specific issue.

RaSystemlord
Expert

Since it was asked what the recommendation for the Host is, I need to clarify that I DO NOT recommend anything on behalf of VMware. I have no ties to VMware or any other software company.

I was just mentioning, given the outspoken frustration, that if you need to get rid of the problem, a good way (which may or may not work) is to do something about the Host. There are two options, both mentioned at the very start of my postings:

- Change the host OS. This is very quick and affordable if the target is Linux. Basically, it only requires one more system SSD and knowledge of Linux. This lets you try it out without committing to a change. If the knowledge is missing, this is obviously a no-go.

- Change the hardware, as speculated in the previous post. This isn't as simple as it sounds. Hardware problems are typically hardware+driver problems. Thus changing the OS might be the simpler cure, because the driver also changes when the OS changes.

What I am SUGGESTING is that somebody commit to a workflow that leads to investigating this bug situation and assigning a valid priority for the bug to get fixed. Based on the writings in this thread, there is no guarantee that we are talking about one single bug; there may be many different ones. Investigating in a meaningful way requires, to my understanding, a considerable amount of work and expertise. Filing a bug report requires a person with a purchased VMware product and a valid Support contract. How exactly this needs to be done is something for a VMware employee to establish.

The latest post here suggests that the problem only happens with certain games, most likely Windows-only games. In that case, the way around this problem is quite obvious and only requires some thought about what to run simultaneously. However, getting that kind of bug fixed still requires the steps mentioned above: investigation, prioritization and bug fixing.

I hope this clarifies things a bit. One shouldn't confuse different things such as a workaround suggestion, bug fixing and VMware company recommendations; they are not the same thing.

Chuckle123
Contributor

I used to have an Intel 7700K and had the issue. I just built a new system using an AMD 5950X; same issue. My host is Windows 10 Pro and my VMs are all Windows 10 Pro. The original post mentions the exact same errors I get. Yes, I am playing games while numerous VMs are running. The game is Rust; it uses the Unity engine and is very CPU and GPU heavy. The odd thing is that I can run just my Linux VM and play games all day. It is definitely Windows 10.

RaSystemlord
Expert

For a definitive answer, a proper study by the software owners would probably be needed. As a speculative answer, which may or may not help ...

... based on the latest answer, it looks like a problem involving all of the following:

On the Host, if I understood correctly: the Windows 10 OS - drivers in the Windows 10 OS - a specific brand of graphics adapter (the chipset, really) - some application using the graphics adapter functionality to the fullest. To my knowledge, a faulty driver has been able to crash Windows ever since Windows NT 4.0 ... the design changed then, sacrificing stability for speed.

If the Linux VM does not have any such problem, note that using VMware "3D hardware acceleration" in the VM is a different thing from running the game on the Host.

IF, and I'm saying IF, this is true in some particular case (there is no guarantee that there is only one case), I cannot see any other easy solution than trying to change the graphics adapter drivers on the Host. Typically, drivers are available from many different sources, and the chipset is the deciding factor in selecting the correct driver (the manufacturer of the card or computer is not relevant; exceptions may exist, and I'm not sure of all the current ones).

Obviously, if this is limited to VMware being active, then VMware does something to invoke it. A real analysis by the VMware owners would be needed for a more permanent solution, which may lie in VMware or in Windows 10. I hope somebody here is in a position to make this happen.

Chuckle123
Contributor

I agree completely. Let's hope VMware uses this feedback as an opportunity to make their product better.

e2489
Contributor

I'm still getting this error 5 months later. Here's the latest one from today. Latest VMware Workstation, Windows 10 host (i7-8700K), Linux guest. All Windows updates are installed, and I'm using all the latest drivers. It happened while playing Rust. I've definitely had crashes while not gaming, but that's less common.

On Sun 8/22/2021 3:38:05 PM your computer crashed or a problem was reported
crash dump file: C:\Windows\MEMORY.DMP
This was probably caused by the following module: vmx86.sys (vmx86+0x200A) 
Bugcheck code: 0x80 (0x4F4454, 0x0, 0x0, 0x0)
Error: NMI_HARDWARE_FAILURE
file path: C:\Windows\system32\drivers\vmx86.sys
product: VMware kernel driver
company: VMware, Inc.
description: VMware kernel driver
Bug check description: This bug check indicates that a hardware malfunction has occurred. 
A third party driver was identified as the probable root cause of this system error. It is suggested you look for an update for the following driver: vmx86.sys (VMware kernel driver, VMware, Inc.). 
Google query: vmx86.sys VMware, Inc. NMI_HARDWARE_FAILURE
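For anyone collecting several of these reports to compare, the key fields can be pulled out with a few regular expressions. This is a minimal sketch that assumes the report text keeps the same line layout as the one pasted above; the function name `parse_crash_report` and the dictionary keys are hypothetical, and only three of the report's lines are used as sample input.

```python
import re

# Three lines copied from the crash report above, used as sample input.
SAMPLE_REPORT = """\
This was probably caused by the following module: vmx86.sys (vmx86+0x200A)
Bugcheck code: 0x80 (0x4F4454, 0x0, 0x0, 0x0)
Error: NMI_HARDWARE_FAILURE
"""


def parse_crash_report(text):
    """Extract module, offset, bugcheck code, parameters and error name."""
    module = re.search(
        r"following module:\s+(\S+)\s+\(\w+\+(0x[0-9A-Fa-f]+)\)", text)
    bugcheck = re.search(
        r"Bugcheck code:\s+(0x[0-9A-Fa-f]+)\s+\(([^)]*)\)", text)
    error = re.search(r"^Error:\s+(\S+)", text, re.MULTILINE)
    return {
        "module": module.group(1) if module else None,
        # Offset inside the driver image, as an integer.
        "offset": int(module.group(2), 16) if module else None,
        "bugcheck": int(bugcheck.group(1), 16) if bugcheck else None,
        # The four bugcheck parameters, converted from hex strings.
        "params": ([int(p.strip(), 16) for p in bugcheck.group(2).split(",")]
                   if bugcheck else []),
        "error": error.group(1) if error else None,
    }


if __name__ == "__main__":
    for key, value in parse_crash_report(SAMPLE_REPORT).items():
        print(key, "=", value)
```

Comparing the `module`, `bugcheck` and `params` values across several crashes would show whether every crash points at the same driver and bugcheck, or whether more than one fault is involved.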


Obviously VMware is the culprit, as I don't get blue screens at all unless I'm running a VM. I can stress test for hours, game, etc. with no issues.

Other than switching to a Linux host, what can we do? I use too many Windows-only applications for this to be a viable solution.

Has anyone reported this to VMware? Do they even know about it? I can't even find a way to do that. Seems like it's not even on their radar, and if that's the case it's probably not going to get fixed.

RaSystemlord
Expert

"Obviously VMWare is the culprit, as I don't get blue screens at all unless I'm running a VM. I can stress test for hours, game, etc. with no issues.

Other than switching to a Linux host, what can we do? I use too many Windows-only applications for this to be a viable solution.

Has anyone reported this to VMWare? Do they even know about it? I can't even find a way to do that. Seems like it's not even on their radar, and if that's the case it's probably not going to get fixed."

 

Exactly. This matter has already been questioned and discussed to death within this community. Here is how I see the next professional steps:

- Support takes this matter in hand, involving these Community members, as appropriate, who seem more than willing to help
- The case is assigned to a person with enough knowledge of VMware product internals, and probably with access to VMware Development for further discussion
- Support creates a reproducible case with the aforementioned actions
- Support takes into account that this might not be a Player-only problem, but might apply to Workstation Pro just as well
- The bug is assigned for fixing, following the VMware company process for such work (a possible Product Management level decision, an estimate of the amount of work, priority of the fix, analysis of what fixing vs. not fixing means ... and all the rest)
- Development work, internal testing work, Support organization verification, release

Having said this, IF this is impossible or next to impossible to do (meaning, for instance, that possible regressions are too difficult to predict) given the Windows 10 architecture (and possibly also the VMware architecture), we should NEVER expect an announcement that this is the case. I mean, VMware can never say aloud that Windows 10 is at fault here.

Chuckle123
Contributor

I love your answers. You are all over the place. Now Windows 10 is to blame. Maybe VMware should just remove Windows 10 from the list if it cannot be properly virtualized. I assume you were given that "Expert" tag because of the number of posts.
