VMware Communities
Dr_Acrobat__no_
Contributor

VM Crash 16.2.1 build-18811642

Hello,

My virtual machine crashes at random during the day.

I have attached the collected support data and the error message.

This never happened with the previous version, 16.1.

Thanks.

Regards.

VMware Workstation unrecoverable error: (mks)

ISBRendererComm: Lost connection to mksSandbox (2878)

A log file is available in "E:\vmware.log".

You can request support.

To collect data to submit to VMware support, choose "Collect Support Data" from the Help menu.

You can also run the "vm-support" script in the Workstation folder directly.

We will respond on the basis of your support entitlement.

49 Replies
Gleaton41
Contributor

Thanks for the information 

Dr_Acrobat__no_
Contributor

I think (though I'm not 100% sure) the crash happens when the guest goes into standby and this message appears:

Your guest has entered a standby sleep state. Use the keyboard or mouse while grabbed to wake it.

I have now disabled "Accelerate 3D graphics" and the crash no longer seems to happen.

I'll let you know.

Thanks.

Regards.

banackm
VMware Employee

You appear to be hitting a timeout where the host graphics system appeared unresponsive for too long.  It's not clear how that would be related to the guest OS sleep state, but we're trying to reproduce this internally.

In the meantime, if you want to experiment with keeping your 3D graphics enabled, you might try setting the following config option in your vmx file:

mks.sandbox.socketTimeoutMS = 200000

That will double the default timeout and make Workstation wait about 3 minutes until declaring a timeout. 

You could adjust that number up if it still isn't helping, but that's already a long time for a graphics glitch.  The value is in milliseconds, and the risk of setting it too high is that it will take longer for Workstation to detect the hang, which might mean you're unable to ungrab or power off the VM until the timeout is reached.
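
For anyone who would rather script the change than hand-edit the file, here is a minimal sketch of setting that option in the text of a .vmx file. It is not a VMware tool: the helper names `set_vmx_option` and `ms_to_minutes` and the sample file content are made up for illustration, and it assumes the usual one `key = "value"` entry per line layout.

```python
import re

TIMEOUT_KEY = "mks.sandbox.socketTimeoutMS"


def set_vmx_option(vmx_text: str, key: str, value: str) -> str:
    """Return vmx_text with a single `key = "value"` entry (hypothetical helper)."""
    # Drop any existing assignment of this key, quoted or not.
    pattern = re.compile(rf"^\s*{re.escape(key)}\s*=")
    kept = [line for line in vmx_text.splitlines() if not pattern.match(line)]
    # Append the new entry, written with quotes around the value.
    kept.append(f'{key} = "{value}"')
    return "\n".join(kept) + "\n"


def ms_to_minutes(ms: int) -> float:
    """Convert a millisecond timeout to minutes."""
    return ms / 60_000


# Invented sample content, not taken from a real VM.
sample = 'displayName = "Test VM"\n'
print(set_vmx_option(sample, TIMEOUT_KEY, "200000"))
print(f"{ms_to_minutes(200000):.1f} minutes")  # 3.3 minutes
```

The sketch writes the value with quotes and removes any earlier entry for the same key, so the file never ends up with two conflicting timeout lines.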

Dr_Acrobat__no_
Contributor

I'll try and let you know.

Thanks.

Regards.

ATSmatthewb
Contributor

@banackm I am having the same issues now that I upgraded to 16.2. I am not too savvy but can follow along and wanted to share my logs for you to review.

Let me know if you see something in there to do other than disabling 3D and adding the 200000 timeout setting.

Thank you, Sir.

Regards,

Matthew

Dr_Acrobat__no_
Contributor

Hello, I have little time for testing these days. I think I will be able to run the test tomorrow afternoon and give feedback. Thanks. Regards.
banackm
VMware Employee

@ATSmatthewb, I don't see the actual VM logs with the crash in that bundle for some reason.  They should be in the VM folder named something like vmware-#.log and mksSandbox-#.log .  If you can post those I'll take a look.
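
If it helps to gather those files by hand, a short sketch like this lists them from a VM folder. The function name `find_vm_logs` and the folder path are placeholders, and the glob patterns are only a guess based on the file names mentioned in this thread.

```python
from pathlib import Path


def find_vm_logs(vm_folder: str) -> list[Path]:
    """List vmware*.log and mksSandbox*.log files in a VM folder (hypothetical helper)."""
    folder = Path(vm_folder)
    if not folder.is_dir():
        return []
    found = set()
    # Patterns assumed from the names quoted in the thread: vmware.log,
    # vmware-#.log and mksSandbox-#.log.
    for pattern in ("vmware*.log", "mksSandbox*.log"):
        found.update(p for p in folder.glob(pattern) if p.is_file())
    return sorted(found)


# Placeholder path; point this at the folder that holds the .vmx file.
for log_path in find_vm_logs("path/to/vm-folder"):
    print(log_path)
```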

Dr_Acrobat__no_
Contributor

Hello,

This crash was with version 16.2.0, and updating to 16.2.1 did not fix the problem.

So far, with accelerated 3D graphics still disabled, the system has not crashed again.

Regards.

ATSmatthewb
Contributor

@banackm Sorry about that. I ran the Collect Support Data utility from the help menu and sent the Zip file that it created. I just assumed it had all the files you would need. In any case, I zipped up all the log files as requested.

Thanks for your help; I appreciate it!

banackm
VMware Employee

>I just assumed it had all the files you would need.

Yeah, it's supposed to, but it sometimes doesn't go find all the correct VMs on disk for various reasons.

Both of you have logs that show the VM losing its connection to the graphics system.  There's a large time gap before that, which might indicate something was hung, or might just mean everything was idle.

What's happening on your host systems when the crash happens?  Like, are you suspending/sleeping the host?  Are there other heavy graphics/disk/memory workloads running?  Because I can't explain just from the logs why this would either time out or start dropping the connection.

Besides increasing the timeout config value I provided, if you're running the latest 16.2.1, you could try turning down the graphics memory on the VMs?  That should scale down the amount of memory involved in running 3D graphics under the theory that your host is running out of memory resources. 

But both of you are hitting a very similar issue that we're going to have to try and reproduce internally.

ATSmatthewb
Contributor

Nice... I wish it would work the way it's supposed to, but that's why we have jobs...

The times I have noticed it crashing seem to correspond to me leaving the VM alone too long while working on my HOST machine. It says that the VM has gone to sleep or something like that. (I'll grab a screenshot next time it happens) 

Yesterday I added the 200000 timeout to the config file and it didn't crash after that. I still have 3D graphics enabled and have had the VM up and running for an hour or so. I will leave it alone and see if I can reproduce the crash.

On a separate note, do you have recommended settings for the # of CPUs and Cores, etc. that would work best for my hardware? Currently, I have it set to run 2 processors and 4 cores, but I don't know if that's the best setup. (Sorry for the dumb newbie question)

Installed RAM: 32.0 GB (31.6 GB usable)
System type: 64-bit operating system, x64-based processor
Processor: Intel(R) Core(TM) i7-10875H CPU @ 2.30GHz 2.30 GHz
Clock Rate: 2300 - 5100 MHz
Level 1 Cache: 512 KB
Level 2 Cache: 2 MB
Level 3 Cache: 16 MB
Number of Cores / Threads: 8 / 16
Power Consumption (TDP = Thermal Design Power): 45 Watt
Manufacturing Technology: 14 nm
Max. Temperature: 100 °C
Socket: FCBGA1440
Features: Dual-Channel DDR4 Memory Controller, HyperThreading, AVX, AVX2, Quick Sync, Virtualization, AES-NI
GPU: Intel UHD Graphics 630 (350 - 1200 MHz)
64 Bit: 64 Bit support

 

 

Dr_Acrobat__no_
Contributor

Hello,

I'm not 100% sure but it seems the problem happens when the following popup appears.

Regards.

Dr_Acrobat__no__0-1637138980436.png

 

ATSmatthewb
Contributor

Yes, I received the same messages prior to each time it crashed. I added the line "mks.sandbox.socketTimeoutMS = 200000" to the .vmx file as @banackm suggested to us, and yesterday my VM ran all day without incident.

I have the VM up and running again today, so I will see if I can go two for two.

Dr_Acrobat__no_
Contributor

Hello,

I have added mks.sandbox.socketTimeoutMS = 200000 and re-enabled accelerated 3D graphics (1 GB, the recommended setting).
The system has crashed again; I attach the error file.

Thanks.

Regards.

KevinGG
Enthusiast

I’ve also experienced the same issue after migrating my VM images to a new laptop and moving from VMware Workstation 16.1.0 to 16.2.1. The laptops are a similar spec (fully patched Windows 10, all SSD), but the new one has the latest Xeon processor and 128 GB as opposed to 64 GB. Image would randomly change, sometimes almost instantly, and it felt like it was occurring when the VM was idle or under load using the shared folder facility. I even built a new clean image on the new machine to test whether it was a migration issue.

However, adding:

mks.sandbox.socketTimeoutMS = "200000"

fixed my issue.

But please note that 200000 is enclosed in quotes. I missed this the first time, so I disregarded this post as a solution. It could be that posting messages here strips the quotes.

Regards,

Kevin.

banackm
VMware Employee

We haven't been able to reproduce this internally yet, but our best theory is that there's something quirky happening with the host clocks/timing.  @Dr_Acrobat__no_, your logs in particular show 30-90 minute jumps in the timestamps right before the problem hits.

We're not clear on what exactly is going wrong there, whether it's a bug in your host hardware, or we're doing something funny picking our timing source, but a jump like that would certainly trigger our timeout detection artificially.
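
To check your own logs for jumps like that, a rough sketch such as the following may help. It assumes each log line begins with an ISO 8601 timestamp followed by a `|` separator, which may not match every log; the function name `find_time_jumps` and the sample lines are invented for illustration.

```python
from datetime import datetime, timedelta


def find_time_jumps(lines, threshold=timedelta(minutes=30)):
    """Yield (previous, current, gap) where consecutive timestamps exceed threshold."""
    previous = None
    for line in lines:
        # Assumption: the timestamp is everything before the first "|".
        stamp = line.split("|", 1)[0].strip()
        try:
            current = datetime.fromisoformat(stamp)
        except ValueError:
            continue  # line does not start with a parseable timestamp
        if previous is not None and current - previous > threshold:
            yield previous, current, current - previous
        previous = current


# Invented sample lines, not taken from any real log.
sample = [
    "2021-11-17T09:00:00.000+01:00| mks| example line",
    "2021-11-17T09:00:05.000+01:00| mks| example line",
    "2021-11-17T10:05:05.000+01:00| mks| example line after a jump",
]
for prev, cur, gap in find_time_jumps(sample):
    print(f"{gap} jump between {prev} and {cur}")
```

A gap found this way only shows that nothing was logged for a while; as noted earlier in the thread, that can also just mean the VM was idle.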

So if the other timeout value I suggested isn't working for you, we'd suggest you try setting an extremely large one, such as:

#43200000 ms = 12 hours
mks.sandbox.socketTimeoutMS = "43200000"

It should work with or without quotes around the number, but possibly some editors will do funny things with whitespace or line endings and mess that up.  So if people are having better luck with the quotes on, it's fine to put them there.
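
As a quick sanity check on the arithmetic and on what an editor may have done to the line, here is a small sketch. The helper names `hours_to_ms` and `parse_vmx_line` are hypothetical, and this is a tolerant reader written for illustration, not Workstation's own parser.

```python
def hours_to_ms(hours: float) -> int:
    """Convert hours to milliseconds."""
    return int(hours * 60 * 60 * 1000)


def parse_vmx_line(line: str):
    """Split a `key = value` line, ignoring stray whitespace, CR/LF and optional quotes."""
    stripped = line.strip()
    # Skip blank lines, "#" comment lines and anything without an assignment.
    if not stripped or stripped.startswith("#") or "=" not in stripped:
        return None
    key, _, value = stripped.partition("=")
    return key.strip(), value.strip().strip('"')


print(hours_to_ms(12))  # 43200000
print(parse_vmx_line('mks.sandbox.socketTimeoutMS = "43200000"\r\n'))
print(parse_vmx_line("mks.sandbox.socketTimeoutMS = 43200000 "))
```

Both forms come out as the same key and value here, so if one of them misbehaves in practice, it is worth checking the raw bytes the editor actually saved.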

 

KevinGG
Enthusiast

Hi @banackm, for the record and if it helps you track this down, since I posted the message earlier today, I have been running two VMs concurrently - a Windows 2019 Server/4 processors/32GB RAM with Visual Studio and a second VM Windows 2019 Server/1 processor/16GB RAM acting as a fileserver for the other image. 

I only experienced the problem once since my original post and this only took out the Visual Studio server. It does seem (feel) as if the Hypervisor believes the image is still running, as if the graphics have become disconnected (as you suspect). If I issue a power down on the host, nothing visible happens to the VM, but the host reports that the Hypervisor is busy suspending the VM.

I've enabled a Windows share and also RDP, so if I see this occur again, I will attempt to access the image to test the theory.

At the time of the crash, I was debugging a database application in the VM, with no other load on the host. The laptop I'm using has an NVIDIA RTX A5000, so it should have a lot of spare power!

I also tried mks.sandbox.socketTimeoutMS  without the quotes and it crashes pretty quickly. Editing of the .VMX was done with MS Notepad.

Regards,

Kevin.

KevinGG
Enthusiast

Hi @banackm, further update to this. I can re-create this every time within a few seconds with the following setup:

VMware shared folder mapped as a network drive back to the host:

1. USB Drive share from which I am reading .ZIP files.

2. Local D drive share where I am unzipping the files 

This is a little app I wrote that simply traverses all the .zip files and unzips them to a single folder ready for further processing.

Crash happens really quickly.
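
For anyone trying to reproduce this, a stand-in for that kind of workload could look like the sketch below. It is not the app described above: the function name `unzip_all` and both paths are placeholders, and it simply extracts every archive it finds into one folder.

```python
import zipfile
from pathlib import Path


def unzip_all(source_dir: str, target_dir: str) -> int:
    """Extract every .zip under source_dir into target_dir; return the archive count."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    count = 0
    # Walk the source tree and unpack each archive into the same target folder.
    for archive in sorted(Path(source_dir).rglob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
        count += 1
    return count


# Example call with placeholder paths (source on one share, target on another):
# unzip_all("Y:/zips", "Z:/unzipped")
```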

I've left a ping running on the VM back to the host; it freezes, and pinging the VM from the host results in a timeout.

If you would like me to supply logs, please let me know how to extract these and I will make them available.

Regards,

Kevin.

banackm
VMware Employee

@KevinGG Yeah, if you could post here, or send me in a private forum message, the vmware-*.log and mksSandbox-*.log from the VM folder for the VM that hit this, that would be really useful.  If they have "PANIC:" in there, you have the right log files and they haven't cycled out yet.
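
A quick way to check for that marker before uploading is sketched below. The function name `logs_with_marker` is made up, and the file names in the usage comment are placeholders.

```python
from pathlib import Path
from typing import Iterable


def logs_with_marker(paths: Iterable[Path], marker: str = "PANIC:") -> list[Path]:
    """Return the log files that contain the marker string."""
    hits = []
    for path in paths:
        # Read line by line and tolerate odd bytes rather than failing on them.
        with open(path, encoding="utf-8", errors="replace") as handle:
            if any(marker in line for line in handle):
                hits.append(path)
    return hits


# Example call with placeholder file names:
# logs_with_marker([Path("vmware-0.log"), Path("mksSandbox-0.log")])
```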
