deraldg
Contributor

After upgrading macOS to Big Sur, I receive numerous CPU stalls on Linux guests

My ESET Security Management Center Server Appliance continues to hit CPU stalls even after I disabled troubleshooting and set the processor count to 2. I even set hardware compatibility back a few versions. I am running macOS Big Sur 11.4 with 32 GB RAM on a Late 2014 iMac. I set the virtual disk to a single file, disabled everything under the Processor advanced options, and set the guest OS memory to 4 GB. I've seen many complaints about Big Sur causing problems, but nothing that truly explains why. I might be having similar problems with my other guests (Windows Server 2016), but they seem to recover, whereas my Linux guest requires a reboot. This occurs at least once daily, and used to happen every hour until I disabled troubleshooting. I created a new VM from the ESET template for a fresh copy, but the same problem occurs. I am running the most current VMware Fusion, 12.1.2. Any help in finding a fix would be greatly appreciated.

7 Replies
dlhotka
Champion

What's the CPU in the host? If it's a dual-core model, then there probably isn't enough capacity to run a two-core guest (Big Sur really wants to have 2 cores for itself).

TValentic
Contributor

I have been running into the same problem on a similar iMac (27" late 2014 with 32GB RAM, 1TB fusion drive). The CPU is a quad core Haswell i5 with the VMs set to use 2 of the cores. I had been running Fusion 12 on Catalina with no issues, but after upgrading to Big Sur ran into the CPU stall issue. My VMs are various Linux distributions (CentOS, Ubuntu and Fedora) with the Fedora one being my daily driver where I work most of the time. Throughout the day everything was fine but each night the VM would become unresponsive with the CPU stall issue. The system was set not to sleep, just dim the display. The particular VM didn't matter - same thing with newly created ones with the latest distribution release or more vintage ones. If multiple VMs were running, they all stopped. Tried reinstalling Fusion, reset NVRAM, etc.

Frustrated, this weekend I did a clean reinstall of Big Sur (wiped the hard drive, etc.), installed Fusion 12 and reloaded just the VM images to see if that made any difference. Quite honestly, I was just expecting to roll back to Catalina but figured I'd try this first. It looks like this worked. I haven't seen the CPU stall issue since. The VMs have been up and running since the reinstall with no problems.

This is the first reinstall of the host OS. I had been just doing the normal system upgrades from the original Yosemite, so maybe there was some cruft that accumulated.

deraldg
Contributor

3.5 GHz quad-core i5. It had been running great on Catalina and Fusion Pro 11. Problems started when I upgraded to Big Sur, then to Fusion 12. I just added an antivirus exclusion for the Virtual Machines folder, as I believe the AV was updated after upgrading to Big Sur. When I look at Activity Monitor, the system is only 5-9% utilized at any given time, so I do not see any contention for CPU or memory at all.

dlhotka
Champion

Hmm, that should be able to run a 2-core, 4 GB VM without issue. Long shot, but that's a pretty old machine. Have you run hardware diagnostics on it? Maybe the drive or RAM is starting to fail.

deraldg
Contributor

I did perform a diagnostic scan on the iMac and there were no problems.

deraldg
Contributor

TValentic, since you reloaded a fresh copy of Big Sur, have you had any problems with CPU stalls? I am considering the route you took, as the symptoms you described match mine exactly. When you reloaded, what steps did you take? I would like to follow them as closely as possible.

Thank You for sharing!

TValentic
Contributor

Unfortunately, the clean reinstall of Big Sur didn't fix the problem. Initially, it looked like it made a difference. I didn't get CPU stalls for the first day or so, but they have come back just like before. Now any of my guests (all various flavors of Linux) will stop running a few hours after activity on the host system ceases. As long as I'm on the system, everything works fine. That includes some big workloads through the day. As soon as I step away, within a few hours the guests will lock up. There is nothing running in either at that point. Both are basically sitting idle. On the Mac, you will see the vmware-vmx process running at 100%. Shutting down the guest, either through the VMware GUI or by SSHing into the Mac and running vmrun stop, brings things back to normal, and I can restart the guest and run again for a few hours unattended before the lockup repeats.
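For anyone who wants to catch the lockup automatically, here is a rough Python sketch of the manual check described above: look for vmware-vmx processes pegged near 100% in the output of ps, then stop the guest with vmrun. The function names, the 90% threshold, and the assumption that vmrun is on the PATH are mine, not anything Fusion provides, and the .vmx path is whatever your own VM uses.

```python
import subprocess

def busy_vmx_processes(ps_output, threshold=90.0):
    """Return (pid, cpu_percent) for vmware-vmx processes at or above threshold.

    Expects text in the shape produced by `ps -axo pid,pcpu,comm`.
    The threshold is an arbitrary illustrative value.
    """
    busy = []
    for line in ps_output.splitlines():
        # The command path may contain spaces, so split at most twice.
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        try:
            pid, cpu = int(parts[0]), float(parts[1])
        except ValueError:
            continue  # header row or unparsable line
        if "vmware-vmx" in parts[2] and cpu >= threshold:
            busy.append((pid, cpu))
    return busy

def find_busy_guests(threshold=90.0):
    """Run ps on the host and report vmware-vmx processes that look stuck."""
    result = subprocess.run(
        ["ps", "-axo", "pid,pcpu,comm"],
        capture_output=True, text=True, check=True,
    )
    return busy_vmx_processes(result.stdout, threshold)

def stop_guest(vmx_path, vmrun="vmrun"):
    """Hard power-off of one guest; assumes vmrun is reachable as given."""
    subprocess.run([vmrun, "-T", "fusion", "stop", vmx_path, "hard"], check=True)
```

A single high reading is not proof of a stall, since a guest doing real work also shows high CPU, so a practical watchdog would want several consecutive high samples before calling stop_guest.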

I've tried a couple of experiments, including running the guests headless without a GUI (the old run level 3, or the newer multi-user target), but that didn't have any effect. They still lock up. Newer distros enable the RCU CPU stall detector, and its warnings are what I'm seeing in the guest's system logs (Fedora 34 in this case). Lengthening the detector's timeout doesn't change anything.
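If it helps with comparing notes, this is a small Python sketch for checking the same two things inside a guest: the current stall detector timeout and how many stall warnings are in a saved kernel log. The sysfs path is where I would expect the rcupdate timeout parameter to be exposed, and the regular expression is a loose guess at the warning wording, so treat both as assumptions and adjust them to what your kernel actually prints.

```python
import re
from pathlib import Path
from typing import List, Optional

# Assumed location of the RCU stall timeout (in seconds) on the guest.
STALL_TIMEOUT_PATH = "/sys/module/rcupdate/parameters/rcu_cpu_stall_timeout"

# Loose match for warnings such as "... self-detected stall on CPU" and
# "... detected stalls on CPUs/tasks"; exact wording varies by kernel.
STALL_PATTERN = re.compile(r"rcu.*detected (?:expedited )?stalls?", re.IGNORECASE)

def read_stall_timeout(path: str = STALL_TIMEOUT_PATH) -> Optional[int]:
    """Return the configured timeout in seconds, or None if unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None

def stall_lines(log_text: str) -> List[str]:
    """Return the log lines that look like RCU CPU stall warnings."""
    return [line for line in log_text.splitlines() if STALL_PATTERN.search(line)]
```

Feeding stall_lines the text of a saved kernel log (for example, output captured from journalctl -k) gives a quick count of stalls per boot, which makes it easier to tell whether a settings change had any effect.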

I'm not seeing the same behavior on other Macs running similar configurations: a 2015 MacBook and a 2017 MacBook, both with the latest Big Sur and Fusion 12. It happens only on the iMac, as you were seeing.

As a comparison, I've also been running guests using VirtualBox and QEMU. Both of those will stay up and running without issue, so I don't think this is related to a particular problem on the host hardware (all of the diagnostics pass). I can run the VirtualBox instance at the same time as the Fusion VMs and it continues to run without any problems after the Fusion VMs show the CPU lockup.

When I get some time, I'll rebuild the iMac with Catalina to see if rolling back makes a difference. 
