VMware Communities
cb831
Enthusiast

Re: "Soft lockup, CPU stuck" on Red Hat guests after upgrade to Fusion 12.2.0

I've seen the exact same behavior for the last two weeks on Ubuntu 22 running as a VM on a Windows 10 host.

After some time I lose contact with the VM and cannot use the keyboard or mouse. Sometimes I can restore responsiveness using suspend/resume, sometimes not. Either way, the timestamp shown by top in Ubuntu is always updated on resume.

Sometimes I cannot even suspend and have to shut down.

Sometimes VMware Workstation greys out on VM suspend or shutdown, and I have to end VMware Workstation and start it again.

When I inspect the syslog I find lots of soft lockups for each CPU, with the stuck time increasing. On one VM all 4 CPUs are locked, and on the other only CPUs 1, 2 and 3 are locked, but the result is the same: no response to keyboard and mouse.

I suspect that updates two weeks ago, either to the Windows host or to the Ubuntu VM, broke either the host-guest compatibility or the Windows-VMware compatibility.

I noticed that the loss of responsiveness is more frequent when running Thunderbird or DBeaver.

I hope this can be nailed down soon - it is very annoying.

Technogeezer
Immortal

@cb831 Since you are experiencing lockups while running VMware Workstation, you should post this in the VMware Workstation forums. Workstation uses a different hypervisor implementation (either VMware's own hypervisor or Hyper-V) than Fusion 12 (macOS hypervisor frameworks).

I'll ask an admin to move it so this post gets a more appropriate audience.

 

- Paul (Technogeezer)
Editor of the Unofficial Fusion Companion Guides
cb831
Enthusiast

OK, sorry for posting here. Let me know if I should rephrase it now that it will be moved to a different context.

Update: I moved it here Re: VMWare Workstation 16 Pro + Ubuntu 22.04.1 - V... - VMware Technology Network VMTN

wila
Immortal

Impossible to diagnose with the current information.

Please attach a vmware.log file from a VM that displays the problem.

--
Wil

| Author of Vimalin. The virtual machine Backup app for VMware Fusion, VMware Workstation and Player |
| More info at vimalin.com | Twitter @wilva
cb831
Enthusiast

There is quite a lot in that file that I'm reluctant to publish. I have replaced some of it with xxxxxx and attached the file.

I think VMware Tools and Ubuntu 22 became incompatible when Ubuntu was updated 2 weeks ago.

Ubuntu logs entries like:

Sep 26 11:21:14 ubuntu-dev-c16 kernel: [27812.387241] watchdog: BUG: soft lockup - CPU#3 stuck for 1114s! [ksoftirqd/3:34]
Sep 26 11:21:14 ubuntu-dev-c16 kernel: [27816.379165] watchdog: BUG: soft lockup - CPU#1 stuck for 1118s! [thunderbird:5198]
Sep 26 11:21:14 ubuntu-dev-c16 kernel: [27820.383165] watchdog: BUG: soft lockup - CPU#2 stuck for 1121s! [tracker-miner-f:3381]
Sep 26 11:21:14 ubuntu-dev-c16 kernel: [27840.386853] watchdog: BUG: soft lockup - CPU#3 stuck for 1140s! [ksoftirqd/3:34]
Sep 26 11:21:14 ubuntu-dev-c16 kernel: [27844.378801] watchdog: BUG: soft lockup - CPU#1 stuck for 1144s! [thunderbird:5198]
Sep 26 11:21:14 ubuntu-dev-c16 kernel: [27848.382765] watchdog: BUG: soft lockup - CPU#2 stuck for 1147s! [tracker-miner-f:3381]

along with dumps for processes involved
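
For anyone wanting to summarize entries like these rather than read them by eye, here is a minimal Python sketch. It assumes the kernel message format shown in the lines above; the function name `summarize_lockups` is made up for illustration and is not part of any tool.

```python
import re
from collections import defaultdict

# Matches kernel messages of the form seen above, e.g.
#   watchdog: BUG: soft lockup - CPU#3 stuck for 1114s! [ksoftirqd/3:34]
LOCKUP_RE = re.compile(
    r"soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<task>[^:\]]+):(?P<pid>\d+)\]"
)


def summarize_lockups(lines):
    """Return {cpu: (longest_stuck_seconds, sorted_task_names)}.

    Hypothetical helper: lines that do not look like soft lockup
    reports are silently ignored.
    """
    longest = defaultdict(int)
    tasks = defaultdict(set)
    for line in lines:
        match = LOCKUP_RE.search(line)
        if match is None:
            continue
        cpu = int(match.group("cpu"))
        longest[cpu] = max(longest[cpu], int(match.group("secs")))
        tasks[cpu].add(match.group("task"))
    return {cpu: (longest[cpu], sorted(tasks[cpu])) for cpu in sorted(longest)}
```

Feeding it the lines of the guest's syslog (for example `summarize_lockups(open("/var/log/syslog"))`, assuming that is where the guest logs kernel messages) gives one entry per affected CPU with the longest reported stuck time and the tasks that were named.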

 

 

wila
Immortal

Hi,

OK, first thing: the log has this line.

2022-09-30T10:34:53.930Z In(05) vmx Monitor Mode: ULM

Which means that you are running in User Level Mode, so Workstation cannot use VMware's hypervisor and instead has to go through the hypervisor API that Microsoft provides.

Only when Monitor Mode reports CPL0 is it running in ring 0, i.e. with no Hyper-V active on the host.

Monitor Mode CPL0 (Current Privilege Level 0) is required for VMware Workstation to be able to use its own hypervisor.
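
To check this without opening the log by hand, a rough Python sketch along these lines should do. It assumes the log line format quoted above ("Monitor Mode: ULM"); the function names are hypothetical, and only the two values discussed in this thread are interpreted.

```python
import re

# Looks for lines like: "... In(05) vmx Monitor Mode: ULM"
MONITOR_RE = re.compile(r"Monitor Mode:\s*(\S+)")


def monitor_mode(log_text):
    """Return the last Monitor Mode value in the vmware.log text, or None."""
    matches = MONITOR_RE.findall(log_text)
    return matches[-1] if matches else None


def describe(mode):
    """Translate the value into the meaning described in this thread."""
    if mode == "ULM":
        return "ULM: Workstation goes through the Microsoft hypervisor API"
    if mode == "CPL0":
        return "CPL0: Workstation uses VMware's own hypervisor"
    if mode is None:
        return "no Monitor Mode line found"
    return "unrecognized Monitor Mode value: " + mode
```

Usage would be something like `describe(monitor_mode(open("vmware.log").read()))`, run against the vmware.log in the VM's folder.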

See also:

https://communities.vmware.com/t5/VMware-Workstation-Pro/MikroTik-RouterOS-boot-speed-is-drastically...

In order to turn off ULM/Hyper-V mode, run the following command on the host in a Windows command prompt with Administrator privileges:

bcdedit /set hypervisorlaunchtype off

Reboot the system to activate your changes.

If you want to go back to Hyper-V mode again, then you can enable it like this:

bcdedit /set hypervisorlaunchtype auto


Note that you also might have to disable Memory Integrity.

Windows Security -> Device Security -> Core Isolation details

Don't forget to reboot the host after making any of these changes.
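
To see which setting is currently in effect before or after the change, a small Python sketch like the one below can help. It assumes that `bcdedit /enum {current}`, run from an elevated prompt, prints a line starting with `hypervisorlaunchtype` followed by the value when the setting is present; the helper names are made up for illustration.

```python
import subprocess


def hypervisor_launch_type(bcdedit_output):
    """Return the hypervisorlaunchtype value in lowercase, or None if absent.

    Assumption: the setting appears on its own line as
    "hypervisorlaunchtype    <value>" in the bcdedit output.
    """
    for line in bcdedit_output.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0].lower() == "hypervisorlaunchtype":
            return parts[1].lower()
    return None


def read_current_launch_type():
    """Run bcdedit on the Windows host (needs Administrator privileges)."""
    result = subprocess.run(
        ["bcdedit", "/enum", "{current}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return hypervisor_launch_type(result.stdout)
```

A result of `None` only means the line was not found in the output; it does not by itself say whether Hyper-V is active, so the Monitor Mode line in vmware.log remains the thing to check after rebooting.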

--
Wil

wila
Immortal

Hi again,

Next issue.

vmx E1000: E1000 rx ring full, drain packets.

You're using E1000. In the past that was a good network adapter to use.

Nowadays one should use a better one; try either e1000e or vmxnet3.
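
The adapter type is set in the VM's .vmx file through the `ethernetN.virtualDev` entry. As a sketch only, the Python below rewrites explicit `"e1000"` entries in the text of a .vmx file. The function name is hypothetical, it assumes the adapter type is written out explicitly in the file, and it should only be used on a copy of the file while the VM is powered off.

```python
import re

# Matches lines such as: ethernet0.virtualDev = "e1000"
# The closing quote keeps "e1000e" entries from matching.
NIC_RE = re.compile(r'^(ethernet\d+\.virtualDev\s*=\s*)"e1000"\s*$', re.IGNORECASE)


def switch_nic_type(vmx_text, new_type="vmxnet3"):
    """Return vmx_text with every explicit e1000 adapter set to new_type."""
    out = []
    for line in vmx_text.splitlines():
        out.append(NIC_RE.sub(lambda m: m.group(1) + '"' + new_type + '"', line))
    return "\n".join(out) + "\n"
```

For example, `switch_nic_type(text, "e1000e")` would pick the other suggested adapter. The guest may see the new adapter as a different network interface, so its network configuration might need adjusting afterwards.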

--
Wil

cb831
Enthusiast

Thanks for the analysis!

I have been running like this for almost two years without issues.
It only started 2 weeks ago.

cb831
Enthusiast

Following the links, I find:

What does this mean to you?

VMware Workstation/Player can now run when Hyper-V is enabled. You no longer have to choose between running VMware Workstation and Windows features like WSL, Device Guard and Credential Guard. When Hyper-V is enabled, ULM mode will automatically be used so you can run VMware Workstation normally. If you don’t use Hyper-V at all, VMware Workstation is smart enough to detect this and the VMM will be used.

So if I follow your suggestion I lose the above features.

wila
Immortal


@cb831 wrote:

Thanks for the analysis !

I have been running like this for almost two years without issues.
It only started 2 weeks ago.


There's a variety of reasons why the hypervisor might have been switched to ULM.

It could be changes to Windows settings, or some updates.

Without your logs from BEFORE the problems there's not much else I can suggest atm.

For now, it is the most likely cause of your problems.

--
Wil

wila
Immortal

You'll lose Hyper-V on the host, yes. If you need that, you're most likely out of luck.

--
Wil

cb831
Enthusiast

So you think that VMware ran in CPL0 before, when it worked...
I have backups, so I can try to find a vmware.log from a month ago.

Did you notice all the timeouts and failures around VMware Tools in the log? I guess those could explain the kernel waits in the guest that produce the soft lockups in syslog.

wila
Immortal

A kernel wait will cause vmware tools to time out, not the other way around.

--
Wil

gordo32
Contributor

I made two changes to my system (then rebooted) which seem to have resolved this. Not sure if both are necessary, but I'll leave others to experiment:

1. Added "Authenticated Users" to the "__vmware__" group. NOTE: I added Authenticated Users, because my workstation is joined to AzureAD, and adding my individual account is a convoluted process, and this seemed simpler (yet still safe).

2. As someone suggested below:

bcdedit /set hypervisorlaunchtype off

My VM has been running for several hours now, without issues.
