Alex_Ma
Enthusiast

VMware Fusion potentially causes macOS 10.15.6 to crash

I recently upgraded to macOS 10.15.6. Now, if I leave VMware Fusion running while I am away from my Mac for some time (overnight, for example), the system crashes. This did not happen with macOS 10.15.5.

VMware Fusion version: Professional Version 11.5.5 (16269456)

macOS Catalina version: 10.15.6 (19G73)

1 Solution

Accepted Solutions
Mikero
Community Manager

Looks like Apple has a fix for us:

https://www.macrumors.com/2020/08/12/apple-releases-macos-10-16-5-supplemental-update/

macOS Catalina 10.15.6 supplemental update includes bug fixes for your Mac.

- Fixes a stability issue that could occur when running virtualization apps
- Resolves an issue where an iMac (Retina 5K, 27-inch, 2020) may appear washed out after waking from sleep

-
Michael Roy - PM/PMM: Fusion & Workstation

192 Replies
dariusd
Leadership

What precisely are you seeing there?  Does your Mac return to the login screen?

Are there any kernel panic reports or WindowServer crash logs in Console?  I don't have a macOS 10.15 host with panic reports handy so I am not 100% sure where they might show up, but I would kind of expect they would be in either Crash Reports or Diagnostic Reports on the side bar.

If you can find a panic/crash log which correlates with the date and time of the crash, please upload it here as an attachment to a reply in this thread.

Thanks!

--

Darius

Sven1802
Enthusiast

Hi,

I upgraded to 10.15.6 when it was released. About 3 days later, while I was working on my Mac (I was not in the Fusion window at the time), the Mac stopped responding. It does not crash, it simply hangs. After 3 or 4 minutes the screen went black and that was all. Then I had to switch it off and on. 3 days later the same thing happened again, and then again 1 day after that. I'm running VMware in full screen, and on that day I switched between the normal Mac and Fusion windows very often.

The logs do not show anything bad, but there is a lot of activity when it hangs. Usually only 3 or 4 lines are written every 3 or 4 minutes, but when it stops, maybe 50 or 60 lines get written each minute.
I didn't save them, so I do not have them right now. I can provide an excerpt the next time.

Not sure if this is related to Fusion. From now on I will quit it when I'm not using it; usually I leave it running in the background. Let's see if this solves the issue.

Thanks, Sven

Sven1802
Enthusiast

Hi,

I didn't see it before, but yes, there was a WindowServer crash when the hang started:

Process:           WindowServer [248]
Path:              /System/Library/PrivateFrameworks/SkyLight.framework/Versions/A/Resources/WindowServer
Identifier:        WindowServer
Version:           ???
Code Type:         X86-64 (Native)
Parent Process:    launchd [1]
Responsible:       WindowServer [248]
User ID:           88

Date/Time:         2020-07-23 17:28:04.639 +0200
OS Version:        Mac OS X 10.15.6 (19G73)
Report Version:    12
Anonymous UUID:    D39D6CF8-A030-14D4-AD7C-94F21CA5675E

Sleep/Wake UUID:   AAEF36EA-E606-4A62-8CB4-51B48F131E0A

Time Awake Since Boot: 37000 seconds

Time Since Wake:   33000 seconds

System Integrity Protection: enabled

Crashed Thread:    0

Exception Type:    EXC_CRASH (SIGKILL)
Exception Codes:   0x0000000000000000, 0x0000000000000000
Exception Note:    EXC_CORPSE_NOTIFY

Termination Reason:WATCHDOG, [0x1] monitoring timed out for service

Termination Details:   WATCHDOG, checkin with service: WindowServer returned not alive with context:

unresponsive work processor(s): WindowServer main thread

40 seconds since last successful checkin, 3343 total successsful checkins since wake (0 induced crashes)

Thread 0 Crashed:

0   ???                      0x00007fff72dccdfa mach_msg_trap + 10

Thread 1:

0   ???                      0x00007fff72dccdfa mach_msg_trap + 10

Thread 2:: com.apple.coreanimation.render-server

0   ???                      0x00007fff72dccdfa mach_msg_trap + 10

Thread 3:

0   ???                      0x00007fff72dccdfa mach_msg_trap + 10

Thread 4:

0   ???                      0x00007fff72e8bb68 start_wqthread + 0

Thread 5:

0   ???                      0x00007fff72e8bb68 start_wqthread + 0

Thread 6:

0   ???                      0x00007fff72dcce4e semaphore_timedwait_trap + 10

Thread 7:

0   ???                      0x00007fff72dcce4e semaphore_timedwait_trap + 10

Thread 0 crashed with X86 Thread State (64-bit):

  rax: 0x0000000010004005  rbx: 0x000000000400420e  rcx: 0x00007ffeef017e98  rdx: 0x0000000000000000

  rdi: 0x00007ffeef017f10  rsi: 0x000000000400420e  rbp: 0x00007ffeef017ef0  rsp: 0x00007ffeef017e98

   r8: 0x0000000000010213   r9: 0x0000000000000000  r10: 0x0000000000004000  r11: 0x0000000000000202

  r12: 0x000000000400420e  r13: 0x0000000000004000  r14: 0x00007ffeef017f10  r15: 0x0000000000000000

  rip: 0x00007fff72dccdfa  rfl: 0x0000000000000202  cr2: 0x00007fe347000018

 

Logical CPU: 0
Error Code:  0x0100001f
Trap Number: 133

Binary images description not available

External Modification Summary:

  Calls made by other processes targeting this process:

task_for_pid: 64
thread_create: 0
thread_set_state: 0

  Calls made by this process:

task_for_pid: 0
thread_create: 0
thread_set_state: 0

  Calls made by all processes on this machine:

task_for_pid: 23766
thread_create: 0
thread_set_state: 0

System Profile:

Network Service: Wi-Fi, AirPort, en1

Boot Volume File System Type: apfs

Memory Module: BANK 0/DIMM0, 4 GB, DDR4 SO-DIMM, 2400 MHz, 0x802C, 0x344154463531323634485A2D3247334532202020

Memory Module: BANK 0/DIMM1, 4 GB, DDR4 SO-DIMM, 2400 MHz, 0x859B, 0x4354344734534653383234412E43384646202020

Memory Module: BANK 1/DIMM0, 4 GB, DDR4 SO-DIMM, 2400 MHz, 0x802C, 0x344154463531323634485A2D3247334532202020

Memory Module: BANK 1/DIMM1, 4 GB, DDR4 SO-DIMM, 2400 MHz, 0x859B, 0x4354344734534653383234412E43384646202020

USB Device: USB 3.0 Bus

USB Device: Bluetooth USB Host Controller

USB Device: FaceTime HD Camera (Built-in)

Thunderbolt Bus: iMac, Apple Inc., 41.4

Model: iMac18,3, BootROM 428.0.0.0.0, 4 processors, Quad-Core Intel Core i7, 4,2 GHz, 16 GB, SMC 2.41f2

Graphics: kHW_AMDRadeonPro580Item, Radeon Pro 580, spdisplays_pcie_device, 8 GB

AirPort: spairport_wireless_card_type_airport_extreme (0x14E4, 0x16F), Broadcom BCM43xx 1.0 (7.77.111.1 AirPortDriverBrcmNIC-1615.1)

Bluetooth: Version 7.0.6f7, 3 services, 27 devices, 1 incoming serial ports

mmsvs
Contributor

Unfortunately I am seeing the same issue.

It does not always crash immediately. The window server locks up, and some time after that there is a panic unless you manually shut the power off.

Please escalate this to the appropriate engineering team. I am sure quite a few people are seeing this, and many more will once they start running their Fusion VMs continuously under 10.15.6.

I never had a problem with Fusion on Catalina (or Mojave) with my installation, and this started immediately after installing 10.15.6.

For me (iMac, 32 GB RAM) it seems to happen about once every 24-36 hours.

dariusd
Leadership

Do your VMs use 3D acceleration?  If so, could you try turning off the Accelerate 3D Graphics option and see if that helps?

--

Darius

mmsvs
Contributor

I had it turned on, but not running any 3D workloads on the VM.

I just turned it off and restarted Fusion and the VM, but of course it might take over 24 hours until I know whether this made a difference, so I will report back here if there is no WindowServer issue or panic.

Sven1802
Enthusiast

Hi,

I also had 3D Acceleration enabled but not used.

I will disable it and let you know if it changes anything.

Thanks a lot, Sven

Alex_Ma
Enthusiast

Yes, it's a typical kernel panic. Nothing fancy. I realize there have been a lot of reports like this since Catalina was released, but I only started having this issue when I installed the 10.15.6 update. This happens when I leave my MacBook Pro unattended. Quite often when I wake up or leave my workplace for a few hours, I find it showing a login screen, and when I log in, it restores all windows and shows that kernel panic message.

panic(cpu 2 caller 0xffffff7f845a1ad5): userspace watchdog timeout: remoted connection watchdog expired, no updates from remoted monitoring thread in 61 seconds, 7430 checkins from thread since monitoring enabled 148641 seconds ago after load

service: com.apple.logd, total successful checkins since load (148642 seconds ago): 14864, last successful checkin: 10 seconds ago

service: com.apple.WindowServer, total successful checkins since load (148581 seconds ago): 14831, last successful checkin: 110 seconds ago

Backtrace (CPU 2), Frame : Return Address

0xffffffa3ed833720 : 0xffffff8003b1a65d

...

0xffffffa3ed833fa0 : 0xffffff8003ac1226

      Kernel Extensions in backtrace:

         com.apple.driver.watchdog(1.0)[832CC890-EE61-33E0-8FD4-8D354BCD0921]@0xffffff7f845a0000->0xffffff7f845a8fff

BSD process name corresponding to current thread: watchdogd

Boot args: chunklist-security-epoch=0 -chunklist-no-rev2-dev

Mac OS version:

19G73

Kernel version:

Darwin Kernel Version 19.6.0: Sun Jul  5 00:43:10 PDT 2020; root:xnu-6153.141.1~9/RELEASE_X86_64

etc...

I disabled 3D acceleration a long time ago. I'll try turning off the VM when I go to bed. Let's see if anything changes.
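
For anyone comparing panic reports, here is a minimal Python sketch that pulls the per-service check-in ages out of a watchdog panic message like the one quoted in this post. The function name `stale_services` and the 60-second threshold are assumptions made for illustration, not documented watchdogd behavior, and the regular expression is fitted only to the two "service:" lines shown here.

```python
import re

# Fitted to the "service:" lines in the panic text quoted in this post;
# other macOS versions may word these lines differently.
SERVICE_RE = re.compile(
    r"service: (?P<name>[^,\s]+), total successful checkins since load "
    r"\((?P<loaded>\d+) seconds ago\): (?P<checkins>\d+), "
    r"last successful checkin: (?P<last>\d+) seconds ago"
)


def stale_services(panic_text, max_age_s=60):
    """Return (service, seconds since last check-in) for every service whose
    last successful check-in is older than max_age_s (an assumed threshold)."""
    return [
        (m["name"], int(m["last"]))
        for m in SERVICE_RE.finditer(panic_text)
        if int(m["last"]) > max_age_s
    ]


# The two service lines from the panic report, copied verbatim.
SAMPLE = (
    "service: com.apple.logd, total successful checkins since load "
    "(148642 seconds ago): 14864, last successful checkin: 10 seconds ago\n"
    "service: com.apple.WindowServer, total successful checkins since load "
    "(148581 seconds ago): 14831, last successful checkin: 110 seconds ago\n"
)

print(stale_services(SAMPLE))  # [('com.apple.WindowServer', 110)]
```

In this report only WindowServer had stopped checking in, which lines up with the WindowServer watchdog crashes other posters attached.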

droark42
Contributor

Yeah, I've been seeing some issues too. Fusion 11.5.5, macOS 10.15.6, MBP late-2019 (64 GB RAM). I have Win10 open. I noticed that there was funny behavior starting a few days ago, when I installed 10.15.6. It has gotten progressively worse. Now, I'm at a point where I can trigger the kernel panic pretty quickly. If I rip a disc via an optical drive connected via a TB3 dock, within a few seconds of ripping, the macOS kernel panics, and the laptop reboots. I wonder if this is a repeat of sorts of a problem from last year, when a kernel update in 10.14.6 (IIRC) caused all macOS machines with less than 32 GB of RAM to bog down Fusion VMs to the point that they were completely useless.

lundqvist
Contributor

Same here as well.  No issues until I installed 10.15.6.

OS - Catalina 10.15.6

Fusion - 11.5.5

VM - Windows 7

I use it for work. In Windows 7 I use VMware Horizon to connect to another Windows 7 desktop at work. All my issues occur when I use Skype at work and have calls / share my desktop. I'd say about 80% of the time during a call I'll get a Mac error (not the cause, mainly due to Mac crashing), and within about 5 seconds it's completely locked up. I have to hold the power button to restart it. This has happened multiple times a work day since I installed 10.15.6.

If I am off work / not using Fusion I have no issues at all. I have never had a lockup post-10.15.6 in that situation.

I tried to completely re-install Fusion, removing all the configuration files.  Still an issue.

I decided yesterday to just revert to 10.15.5 as I can't have it crashing at work during calls and meetings.

Alex_Ma
Enthusiast

I have 32 GB though.

dariusd
Leadership

Hi folks,

I would like to ask anyone who can provide panic/crash reports related to this issue (WindowServer watchdog timeout or system watchdog panic with macOS 10.15.6) to please post the full panic/crash report as an attachment.  If you have multiple panic/crash reports available, please post them all.  I would like to gather as much information as I can to help track this down, and right now any information you could provide would be awesome.

Feel free to remove pieces of information from the panic reports, particularly if the reports include hardware info that you might like to keep private, but the more information you can share the better we will be able to track down the cause.

Thanks again,

--

Darius

dariusd
Leadership

droark42 is just saying that the problem could be a new regression in the host OS, kind of like the one we saw over in the thread "VM became extremely slow after upgraded to macOS 10.14.6 [Officially Solved]", where VMs with more than 2 GBytes of virtual RAM running on hosts with less than 32 GBytes of RAM started performing very poorly after a macOS update.  In that case, it was not anything wrong in Fusion, but something that Fusion did – and very few other applications would ever do – just happened to trigger the defect in macOS.  And if you didn't hit the exact conditions (at least 2 GBytes of RAM assigned to the VM, less than 32 GBytes of physical RAM in the host, and macOS 10.14.6, with the problem substantially worsened by running an encrypted VM) then everything would work just fine.

Here, we have another case where everything was working fine before macOS 10.15.6, and now we see multiple users reporting this problem with macOS 10.15.6, and anecdotal evidence suggesting that Fusion is somehow involved.  It could be that Fusion has been doing something wrong all along and we have been lucky up until macOS 10.15.6, or it could be that Fusion is doing something innocent that triggers a new defect in macOS 10.15.6.  We won't know which is true until we get to the bottom of this... and the first step will be figuring out which factors (both host configuration and VM configuration) contribute to the problem.

Thanks,

--

Darius

xfoo
Contributor

Hello, I am also affected by this problem.

My theory is that something changed in the macOS 10.15.6 kernel that causes a memory leak when running virtualization software. I am seeing the same symptoms with both VMware Fusion 11.5.5 and VirtualBox 6.0.24 when running a Windows 10 guest VM.

Some observations:

"sudo zprint -d" prints a stream of kernel memory allocation changes. When a Windows guest VM is running (in either VMware Fusion or VirtualBox), there is a non-stop stream of increasing allocations in "kalloc.32". The "cur size" and "cur elements" columns increase as long as the guest VM is running. Pausing/quitting the guest VM causes the "kalloc.32" growth to pause, but the values never really go back down, and resuming the guest VM causes kalloc.32 to resume its unbounded growth.

This can also be observed in Activity Monitor: go to the Memory tab, sort by the "Real mem" column, and watch the Real mem of "kernel_task" grow steadily at about 1 GB per hour. When kernel_task's Real mem reaches about 4.5 GB on my 16 GB machine, I have to reboot to avoid the inevitable hard crash.
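
As a rough way to turn that growth rate into a "how long until I have to reboot" estimate, here is a small Python sketch that does a straight-line projection from two readings. The function name and the sample readings are hypothetical; the 4.5 GB threshold is simply the figure observed on this 16 GB machine and will likely differ on other hardware.

```python
def hours_until_threshold(samples, threshold_gb):
    """Linear projection from the first and last (seconds, GB) samples.

    Returns the projected hours until threshold_gb is reached, 0.0 if it
    is already exceeded, or None if memory is not growing.
    """
    (t0, gb0), (t1, gb1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("need two samples taken at different times")
    rate_gb_per_hour = (gb1 - gb0) / ((t1 - t0) / 3600.0)
    if rate_gb_per_hour <= 0:
        return None  # not growing, so no projected exhaustion
    return max(0.0, (threshold_gb - gb1) / rate_gb_per_hour)


# Hypothetical readings of kernel_task Real mem: 1.5 GB, then 2.5 GB an hour later.
readings = [(0, 1.5), (3600, 2.5)]
print(hours_until_threshold(readings, threshold_gb=4.5))  # 2.0
```

This assumes the growth stays linear, which matches what I see while the guest is running but not while it is paused.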

When the machine hard crashes, everything mostly locks up and becomes unresponsive; sometimes I get very strange prompts, like having to re-enter iCloud and Google account passwords in Settings, or sudo refusing to authenticate, or various system prompts such as "Unapproved caller. SecurityAgent may only be invoked by Apple software".

Investigating the log from around the time of these crashes seems to indicate that the kernel memory zone for "kalloc.32" is exhausted and the kernel starts killing pretty much every process on the system, including critical system-level processes. Eventually everything hard locks. Sometimes there is also a panic report where, for example, the WindowServer watchdog timed out.

Some example commands to investigate the log (let's say everything crashed at 2020-07-20 01:50:00) :

log show --start '2020-07-20 01:40:00'|egrep -i '(zone_|memory|kill)'

Some example log outputs from just around the time of crash:

2020-07-20 01:43:14.932748+0200 0x4f3      Default     0x0                  185    0    UserEventAgent: (MemoryMonitor) [com.apple.MemoryMonitor:plugin] kernel jetsam snapshot note received

2020-07-20 01:43:14.971227+0200 0x199      Default     0x0                  0      0    kernel: zone_map_exhaustion: Zone map size 6120333312, capacity 6442450944 [jetsam limit 95%]

2020-07-20 01:43:14.971232+0200 0x199      Default     0x0                  0      0    kernel: zone_map_exhaustion: Largest zone kalloc.32, size 3520872448

2020-07-20 01:43:14.971235+0200 0x199      Default     0x0                  0      0    kernel: zone_map_exhaustion: Nothing to do for the largest zone [kalloc.32]. Waking up memorystatus thread.

2020-07-20 01:43:14.971745+0200 0x145      Default     0x0                  0      0    kernel: 88294.432 memorystatus: killing_top_process pid 3465 [keybagd] (zone-map-exhaustion 1) 736KB - memorystatus_available_pages: 2412221

2020-07-20 01:43:14.971799+0200 0x4f3      Default     0x0                  185    0    UserEventAgent: (MemoryMonitor) [com.apple.MemoryMonitor:plugin] kernel jetsam snapshot note received

2020-07-20 01:43:14.978534+0200 0x199      Default     0x0                  0      0    kernel: zone_map_exhaustion: Zone map size 6120333312, capacity 6442450944 [jetsam limit 95%]

2020-07-20 01:43:14.978538+0200 0x199      Default     0x0                  0      0    kernel: zone_map_exhaustion: Largest zone kalloc.32, size 3520872448

2020-07-20 01:43:14.978541+0200 0x199      Default     0x0                  0      0    kernel: zone_map_exhaustion: Nothing to do for the largest zone [kalloc.32]. Waking up memorystatus thread.

2020-07-20 01:43:14.978927+0200 0x145      Default     0x0                  0      0    kernel: 88294.439 memorystatus: killing_top_process pid 3467 [sysmond] (zone-map-exhaustion 1) 560KB - memorystatus_available_pages: 2412121

2020-07-20 01:43:14.978989+0200 0x4f3      Default     0x0                  185    0    UserEventAgent: (MemoryMonitor) [com.apple.MemoryMonitor:plugin] kernel jetsam snapshot note received

2020-07-20 01:43:14.979260+0200 0x199      Default     0x0                  0      0    kernel: zone_map_exhaustion: Zone map size 6120341504, capacity 6442450944 [jetsam limit 95%]

2020-07-20 01:43:14.979265+0200 0x199      Default     0x0                  0      0    kernel: zone_map_exhaustion: Largest zone kalloc.32, size 3520872448

2020-07-20 01:43:14.979271+0200 0x199      Default     0x0                  0      0    kernel: zone_map_exhaustion: Nothing to do for the largest zone [kalloc.32]. Waking up memorystatus thread.

2020-07-20 01:43:40.716620+0200 0xb9f65    Default     0x0                  82515  0    securityd: [com.apple.securityd:SecServer] 0x7fa7f141d080 killing session 100005

2020-07-20 01:43:40.716631+0200 0xb9f65    Default     0x0                  82515  0    securityd: Killing auth hosts for session 100005

etc. Basically all system processes (like "com.apple.CodeSigningHelper", "cfprefsd", "secd", "trustd", "tccd", "amfid", etc.) are killed (and possibly respawned, then immediately killed again).

So I think the smoking gun is something in macOS 10.15.6 causing an unbounded growth in "kalloc.32" allocations in the kernel when virtualization software is running.
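
If you want to check your own logs for the same pattern, here is a minimal Python sketch that summarizes the zone_map_exhaustion and memorystatus lines from "log show" output. The regular expressions are fitted only to the lines quoted in this post, and the `summarize` helper and its dictionary keys are made up for illustration.

```python
import re

# Patterns fitted to the kernel log lines quoted in this post.
ZONE_MAP_RE = re.compile(
    r"zone_map_exhaustion: Zone map size (\d+), capacity (\d+) "
    r"\[jetsam limit (\d+)%\]"
)
LARGEST_RE = re.compile(r"zone_map_exhaustion: Largest zone ([^,\s]+), size (\d+)")
KILL_RE = re.compile(r"memorystatus: killing_top_process pid \d+ \[([^\]]+)\]")


def summarize(log_text):
    """Summarize zone-map pressure; the last matching line of each kind wins."""
    summary = {"killed": KILL_RE.findall(log_text)}
    zone_maps = ZONE_MAP_RE.findall(log_text)
    largest = LARGEST_RE.findall(log_text)
    if zone_maps:
        size, capacity, limit_pct = map(int, zone_maps[-1])
        summary["used_fraction"] = size / capacity
        # Integer comparison avoids float rounding right at the limit.
        summary["over_limit"] = size * 100 >= capacity * limit_pct
        if largest:
            name, zone_size = largest[-1]
            summary["largest_zone"] = name
            summary["largest_share"] = int(zone_size) / size
    return summary


# Message parts of four lines from the excerpt in this post.
SAMPLE = """\
kernel: zone_map_exhaustion: Zone map size 6120333312, capacity 6442450944 [jetsam limit 95%]
kernel: zone_map_exhaustion: Largest zone kalloc.32, size 3520872448
kernel: 88294.432 memorystatus: killing_top_process pid 3465 [keybagd] (zone-map-exhaustion 1) 736KB - memorystatus_available_pages: 2412221
kernel: 88294.439 memorystatus: killing_top_process pid 3467 [sysmond] (zone-map-exhaustion 1) 560KB - memorystatus_available_pages: 2412121
"""

s = summarize(SAMPLE)
print(s["largest_zone"], s["over_limit"], s["killed"])
# kalloc.32 True ['keybagd', 'sysmond']
```

On these numbers the zone map is just past the 95% jetsam limit, and kalloc.32 alone accounts for more than half of it.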

By the way, I saw someone else posted a thread about similar symptoms over at the virtualbox forum: virtualbox.org • View topic - kernel panic from memory leak on 10.15.6

mmsvs
Contributor

It happened again even though I turned off 3D acceleration as you asked.

I point you to the analysis of kernel allocations in another response; I believe this is the root cause. Now, whether it is due to a "bug" in 10.15.6 or a latent issue that started happening due to another change, who knows.

I suggest you escalate this case within engineering so that you can sort it out with the kernel engineering folks at Apple. The zprint repro should be enough information; it might even be better than crash dumps?

The WindowServer crash looks like this, i.e. yes, it does receive a SIGKILL, probably in line with the previous analysis by another poster about kernel allocation issues:

Process:               WindowServer [71921]

Path:                  /System/Library/PrivateFrameworks/SkyLight.framework/Versions/A/Resources/WindowServer

Identifier:            WindowServer

Version:               600.00 (451.4)

Code Type:             X86-64 (Native)

Parent Process:        launchd [1]

Responsible:           WindowServer [71921]

User ID:               0

Date/Time:             2020-07-26 12:16:30.218 +0200

OS Version:            Mac OS X 10.15.6 (19G73)

Report Version:        12

Anonymous UUID:        5643BC9C-6405-FB15-C2CF-FDCCCCFAAFCE

Time Awake Since Boot: 160000 seconds

System Integrity Protection: enabled

Crashed Thread:        0  Dispatch queue: com.apple.main-thread

Exception Type:        EXC_CRASH (SIGKILL)

Exception Codes:       0x0000000000000000, 0x0000000000000000

Exception Note:        EXC_CORPSE_NOTIFY

Termination Reason:    WATCHDOG, [0x1] monitoring timed out for service

Termination Details:   WATCHDOG, checkin with service: WindowServer returned not alive with context:

is_alive_func returned unhealthy : WindowServer initialization not complete (post IOKitWaitQuiet)

40 seconds since last successful checkin, 16031 total successsful checkins since load (0 induced crashes)

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread

0   libsystem_kernel.dylib         0x00007fff6e797dfa mach_msg_trap + 10

1   libsystem_kernel.dylib         0x00007fff6e798170 mach_msg + 60

2   libdispatch.dylib             0x00007fff6e61190e _dispatch_mach_send_and_wait_for_reply + 632

3   libdispatch.dylib             0x00007fff6e611d4e dispatch_mach_send_with_result_and_wait_for_reply + 50

4   libxpc.dylib                   0x00007fff6e899846 xpc_connection_send_message_with_reply_sync + 238

5   com.apple.CoreFoundation       0x00007fff345de083 __78-[CFPrefsPlistSource sendRequestNewDataMessage:toConnection:retryCount:error:]_block_invoke + 22

6   com.apple.CoreFoundation       0x00007fff345e5779 CFPREFERENCES_IS_WAITING_FOR_SYSTEM_CFPREFSD + 74

7   com.apple.CoreFoundation       0x00007fff345ddfdf -[CFPrefsPlistSource sendRequestNewDataMessage:toConnection:retryCount:error:] + 672

8   com.apple.CoreFoundation       0x00007fff345a12b9 -[CFPrefsPlistSource handleErrorReply:fromMessageSettingKeys:toValues:count:retryCount:retryContinuation:] + 810

9   com.apple.CoreFoundation       0x00007fff345a0f86 -[CFPrefsPlistSource handleErrorReply:retryCount:retryContinuation:] + 40

10  com.apple.CoreFoundation       0x00007fff3459fea4 -[CFPrefsPlistSource handleReply:toRequestNewDataMessage:onConnection:retryCount:error:] + 187

11  com.apple.CoreFoundation       0x00007fff345ddf37 -[CFPrefsPlistSource sendRequestNewDataMessage:toConnection:retryCount:error:] + 504

12  com.apple.CoreFoundation       0x00007fff345a12b9 -[CFPrefsPlistSource handleErrorReply:fromMessageSettingKeys:toValues:count:retryCount:retryContinuation:] +

mmsvs
Contributor

Good catch, I believe you have found the root cause... Why this happens under 10.15.6 is another issue, but that will probably take cooperation between VMware engineering and macOS kernel engineers at Apple.

Given that this problem probably affects a lot of people, I would suggest escalating this on Monday morning.

0 Kudos
Alex_Ma
Enthusiast

I have three panic reports. I do not feel comfortable sharing these publicly. I tried to send you a private message, but not sure if it worked. Let me know how I can send the archive to you.

dariusd
Leadership

"I suggest you escalate this case within engineering"

I am in engineering... I've been working mostly on Fusion and macOS internals at VMware for over a decade now.  If you have a case going through support it will quite likely end up being assigned to my team to figure out what's going on... and it might even end up assigned to me personally. :)

Thanks for the information you've all provided so far... it's great.  I've got a few leads to investigate...

Thanks,

--

Darius

mmsvs
Contributor

My apologies, I have never posted in this forum before and didn't know engineering looked directly at forum traffic :)

I'll stay on 10.15.6 for now since it only happens once every 24-36 hours for me. Good luck with the debugging.
