VMware Communities
Dr_Acrobat__no_
Contributor

VM Crash 16.2.1 build-18811642

Hello,

My virtual machine crashes randomly during the day.

I have attached the collected support data and the error message.

This never happened with the previous version, 16.1.

Thanks.

Regards.

VMware Workstation unrecoverable error: (mks)

ISBRendererComm: Lost connection to mksSandbox (2878)

A log file is available in "E:\vmware.log".

You can request support.

To collect data to submit to VMware support, choose "Collect Support Data" from the Help menu.

You can also run the "vm-support" script in the Workstation folder directly.

We will respond on the basis of your support entitlement.

49 Replies
Dr_Acrobat__no_
Contributor

Hello, I added mks.sandbox.socketTimeoutMS = "200000" and it hasn't crashed yet. Thanks. Regards.
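The post does not say which file was edited; settings of this form normally live in the VM's .vmx file as key = "value" lines, so that is assumed here. A minimal Python sketch, using a hypothetical helper named `parse_vmx`, that reads such text and converts the timeout from milliseconds to seconds:

```python
import re

# Matches lines such as: mks.sandbox.socketTimeoutMS = "200000"
_VMX_LINE = re.compile(r'^\s*([^#=\s]+)\s*=\s*"(.*)"\s*$')


def parse_vmx(text):
    """Return a dict of key -> value for key = "value" lines (hypothetical helper)."""
    settings = {}
    for line in text.splitlines():
        match = _VMX_LINE.match(line)
        if match:
            settings[match.group(1)] = match.group(2)
    return settings


# Sample text is made up for illustration; only the timeout line comes from the post.
sample = 'displayName = "TestVM"\nmks.sandbox.socketTimeoutMS = "200000"\n'
settings = parse_vmx(sample)
timeout_seconds = int(settings["mks.sandbox.socketTimeoutMS"]) / 1000
print(timeout_seconds)  # 200.0
```

So the value quoted in the post corresponds to a timeout of 200 seconds.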
Dr_Acrobat__no_
Contributor

Another crash.

Regards.

KevinGG
Enthusiast

Hi @banackm

I’m now more convinced that this is more likely to occur under load. In the scenario I described previously (reading through a load of .zip files via a shared folder and extracting them onto another shared folder), I can re-create the crash every time within a few minutes. I’ve attached the logfiles as requested, and I can see this being logged:

PANIC: MKSSandboxComm: Lost connection to isbRenderer (1408)

I’ve also included a couple of screen grabs that I took, related to the first two logfiles, at the point I detected the crash had happened. As you will see, the processor load on both the VM and the host was not significant at the time it occurred. You can ignore the “Not Responding” message in the test app that is running; it’s a knocked-together, single-threaded WinForm just to demonstrate the point.

Please let me know if I can supply further information to help track this down. It’s a real pain, as I use VMWS every day and this is affecting my productivity.

 

Regards,

Kevin.

heath140
Contributor

I ran into the same issue and found that the resolution was to disable "Accelerate 3D graphics"; it seems to work fine now. I also tried just setting the mks sandbox timeout to be longer, but it didn't do anything without disabling the "Accelerate 3D graphics" checkbox.

KevinGG
Enthusiast

Thanks @heath140 - I already have the 3D accelerate option disabled.

For me, it crashes in the same way, with or without it enabled, as soon as I start to put some load on the VM. I have had Visual Studio open today (> 7 hours) and I've been editing some code with no issues at all. But as soon as I start to copy files around from the host to the VM, bang!

 

Regards,

Kevin.

banackm
VMware Employee

@KevinGG  There shouldn't be any mksSandbox process running when 3D is disabled, so you can't hit this particular crash when 3D is disabled.

The last round of vmware.log files you posted doesn't show this same issue, and the mksSandbox.log files you have in there are stale, from the last time you did have 3D enabled.

What your vmware.log files do show is that they all just stop with a message about WM_ENDSESSION. I think that happens when your user session logs off (because of Log Out, idle, or host sleep/shutdown?), and we try to suspend your VMs, but Windows will often give up on us because it takes too long to suspend the VMs and it just kills them. So I think at this point you're hitting a different issue with your file-copies.

You may still have some underlying host clock issue that's triggering both things though.  There are some indications that something timed out or had a clock jump right before the logs cut off.  But the WM_ENDSESSION thing is really suspicious by itself, and sounds like there's some other involvement with something on your host?
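The distinction drawn in this post (an mksSandbox panic versus a log that simply stops at WM_ENDSESSION) can be checked roughly by searching a vmware.log for the messages quoted in this thread. A minimal Python sketch; the function name `classify_log` and the labels it returns are made up for illustration, and real logs may word these lines differently:

```python
def classify_log(text):
    """Label vmware.log text by which messages quoted in this thread it contains."""
    findings = []
    # 3D renderer panic, in either of the two wordings quoted in the thread.
    if "Lost connection to mksSandbox" in text or "Lost connection to isbRenderer" in text:
        findings.append("mks-sandbox-panic")
    # Host user session ending (log out, idle, sleep or shutdown).
    if "WM_ENDSESSION" in text:
        findings.append("host-session-ended")
    return findings or ["no-known-signature"]


sample = "PANIC: MKSSandboxComm: Lost connection to isbRenderer (1408)\n"
print(classify_log(sample))  # ['mks-sandbox-panic']
```

A substring search like this only says that a message is present; it says nothing about which event came first or what caused it.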

KevinGG
Enthusiast

Thanks @banackm -- I'll go back around the loop to confirm my findings, deleting the existing logfiles before I start.

But I can confirm the "Accelerate 3D graphics" checkbox is definitely not ticked at present. 

I think the WM_ENDSESSION is possibly related to me having to reboot the host machine before being able to restart the VM following the crash.

More details shortly....

Regards,

Kevin.

 

banackm
VMware Employee

That would make sense, but it suggests that the VM was still running at that point?  (Or at least the process was still running... it's still possible part of the system had hung somehow.)

KevinGG
Enthusiast

Hi @banackm - Some more information for you. In my earlier post I did suspect the VM was still running, but I had dismissed this because the network layer failed (pings I had running stopped at the point of the crash). It looks like this assumption was incorrect. For confirmation, the 3D accelerate option is disabled and no mksSandbox log files have been created.

Attached are some more artefacts for you:

1. Screen grab at the point of the crash. Running my same load test, this shows the frozen VM screen pinging the host and, at the same time, the first timeout appearing on the host pinging the VM. The grab was taken at the crash point to also capture the CPU load of both VM and host. The host continues to report failed pings to the VM from this point on, and the VM never recovers.

2. The vmware log file before the reboot to reset everything. There are errors logged that mean nothing to me!

3. The vmware log file after the reboot, but before I attempt to restart the VM. As suspected, you can see WM_ENDSESSION messages after the last of the errors from the previous version. I can also see a .vmem file has been created; I assume this is created as part of the suspend process.

So this does help prove your suspicion that the VM is still running at this point, even if its network layer has gone.

It does therefore seem the problem I am having is different to the issue others are reporting here, even if the symptoms are roughly the same. Please also remember, though, that I did get the panic message previously in the logs. This must have been when 3D was switched on, so maybe there is still some overlap.

Help!

Regards,

Kevin.

banackm
VMware Employee

@KevinGG 

Yeah, I'm not sure what those error messages mean, but it looks pretty clearly like the VM process is still running, but either part of it is hung, or something has happened so that the guest isn't making forward progress any more, probably related to the disk activity.  I'm going to file a bug internally with the disk/file-sharing team to take a look at your logs.

So probably you were hitting the 3D crash because it was detecting an actual hang, as opposed to most of the cases here where it looks like we're falsely detecting a timeout when we shouldn't.

KevinGG
Enthusiast

Thanks again @banackm - Tell them to get in touch if needed and I'll provide whatever details they need.

It may also be worth mentioning to them that, prior to migrating to this laptop, a soak test was performed on the SSDs to ensure they were all good. We sift through a lot of data, so integrity of the drives is very important to us.

Regards,

Kevin.

ATSmatthewb
Contributor

@banackm 

After changing the "mks.sandbox.socketTimeoutMS = 200000" setting, I still have not had any crashes.

I'm not sure what I'm doing differently, or whether my issue was due to the VM settings we discussed, @banackm.

Today I have been running a Product App for BAS controls as well as MS Visio with a programming plugin from the BAS Product App. I also had 2 YouTube streams going, and transferred ~10GB from my company's file server to my VM drive space, via a VPN connection to the host and mapped as a drive on my VM. I then transferred the ~10GB of files from the VM file space back to the host and didn't have any crashes. So the file transfers from host to VM and VM to host do not seem to be the cause in my case, @KevinGG.

Maybe this helps, maybe not.

 

steve_goddard
VMware Employee

Hi there,

Can you please try the following for your VM? It might stop the hang issue.

  • Power off the VM.
  • Edit the VM's .vmx file (change to the VM's folder and edit the [VMname].vmx file).
  • Add the line
    isolation.tools.hgfs.oplockmonitor.enable = "FALSE"
  • Save the file.
  • Power on the VM.
  • Rerun your test with file I/O through the shared folder.
Thanks. Steve
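
The manual edit in the steps above can also be scripted. A minimal Python sketch, assuming the .vmx file holds one key = "value" pair per line and that the VM is powered off while the file is changed. `set_vmx_option` is a hypothetical helper, not a VMware tool, and it works on text so that the caller decides how the file is read and written:

```python
def set_vmx_option(text, key, value):
    """Return .vmx text with key set to value, replacing any existing entry."""
    new_line = f'{key} = "{value}"'
    lines = text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        # Compare the part before the first "=" with the requested key (exact match).
        if "=" in line and line.split("=", 1)[0].strip() == key:
            lines[i] = new_line
            replaced = True
    if not replaced:
        lines.append(new_line)
    return "\n".join(lines) + "\n"


# Sample text is made up for illustration; only the option line comes from the post.
original = 'displayName = "TestVM"\n'
updated = set_vmx_option(original, "isolation.tools.hgfs.oplockmonitor.enable", "FALSE")
print(updated)
```

Keeping a backup copy of the original .vmx before writing the result back is a sensible precaution.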
KevinGG
Enthusiast

Hi @steve_goddard, thanks for your response. This has resolved the issue for me.

I've run the complete test twice with no issues, and then re-enabled 3D acceleration and successfully ran the test again to see if the two issues were related.

This gets me up and running, which makes me very happy!

Is there anything else I can contribute to help you with tracking down a root cause?

Regards,

Kevin.

steve_goddard
VMware Employee

Hi KevinGG
Thank you for confirming that it resolved your issue.

Great to hear you are up and running successfully.

Steve

 

 

Thanks. Steve
timatgca
Contributor

DELETED: Sorry, wrong thread.

pac_man
Contributor

Hello,

Will this issue be fixed in the next release, or is it necessary to change each Workstation config?

Regards

steve_goddard
VMware Employee

Hello,
I am looking to get the issue fixed in the next available Workstation release.

 

Thanks. Steve
AndreasPH
Contributor

Hello, what is the current status? We have some machines here that we had to downgrade to 16.1.2 because of crashes. Our IT department is pushing for the installation of the latest version. When can we expect a solution?

 

KevinGG
Enthusiast

So my update on this one.....

I'm currently running 16.2.2 build-19200509 and am no longer seeing the issue, regardless of the isolation.tools.hgfs.oplockmonitor.enable setting.

Regards,

Kevin.

 
