I'm running VMWare Player 16.1.0 on Ubuntu 20.04, using a Windows 10 VM. Normal Windows shutdown takes about 10 to 15 seconds.
But sometimes shutting down Windows makes Ubuntu freeze gradually. At first it is still possible to switch between browser tabs; after about 20 seconds that stops working, although scrolling still does. After about a minute the whole Ubuntu system is frozen: the mouse cursor can still be moved, but clicking no longer works.
This happens at least three times a week.
Does anybody have an idea why this happens and what I can do to avoid it?
From what you're describing, your system is running out of resources, although I don't understand why this happens only when the VM is shutting down. Can you set up a terminal with a resource monitor (top, glances, bpytop, to name a few) and see if you can tell what's happening during a shutdown?
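If the desktop freezes before you can read top, a small logger that writes to a file can help, since the log can be read after a reboot. The following is only a sketch, assuming a Linux host with /proc mounted; the function names and the log file name are made up for illustration. It records the load average, available memory, and the amount of dirty and writeback page cache once per second.

```python
import time


def parse_loadavg(text):
    """Return the 1-, 5- and 15-minute load averages from /proc/loadavg content."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)


def parse_meminfo(text):
    """Return the numeric value of each /proc/meminfo field (mostly in kB)."""
    fields = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[key.strip()] = int(parts[0])
    return fields


def sample():
    """Read one snapshot of load and memory figures from /proc."""
    with open("/proc/loadavg") as f:
        load = parse_loadavg(f.read())
    with open("/proc/meminfo") as f:
        mem = parse_meminfo(f.read())
    return load, mem


def monitor(logfile="vm-shutdown-monitor.log", samples=120, interval=1.0):
    """Append one line per sample to logfile (hypothetical file name)."""
    with open(logfile, "a") as log:
        for _ in range(samples):
            load, mem = sample()
            log.write(
                f"{time.strftime('%H:%M:%S')} load1={load[0]:.2f} "
                f"avail_kB={mem.get('MemAvailable', -1)} "
                f"dirty_kB={mem.get('Dirty', -1)} "
                f"writeback_kB={mem.get('Writeback', -1)}\n"
            )
            log.flush()  # push each line out so it is not lost in a hang
            time.sleep(interval)
```

Call monitor() in a terminal just before shutting down the guest. A load average that climbs while memory stays available would point at blocked IO rather than at a lack of RAM or CPU.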
No, the disk is not encrypted.
As for the resources: the system is a bit oversized for its actual use. It seems pretty unlikely to me that it would run out of resources. But I'll give it a try.
A VM of 8GB on a 16GB host is fine. Unless you use the host intensively, there should not be a resource issue with regard to RAM.
You might have assigned too many vCPUs; we don't know. It depends on the CPU in your NUC and on how many you assigned. Be aware that threads don't count, only actual cores, and that as a rule of thumb you should not normally assign much more than half of what is available. So on an Intel i7 with 4 physical cores, you wouldn't assign more than 2 cores to the guest OS.
If you assign more than that, you might indeed have resource issues.
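To put that rule of thumb into numbers, here is a rough sketch. It assumes an x86 Linux host whose /proc/cpuinfo lists "physical id" and "core id" for every logical CPU; both function names are made up, and the "half the cores" limit is only the rule of thumb from this post, not a VMware requirement.

```python
def count_physical_cores(cpuinfo_text):
    """Count distinct (physical id, core id) pairs, so hyperthreads are ignored."""
    cores = set()
    physical_id = None
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        key = key.strip()
        if key == "physical id":
            physical_id = value.strip()
        elif key == "core id":
            cores.add((physical_id, value.strip()))
    return len(cores)


def max_guest_vcpus(physical_cores):
    """Rule of thumb: about half of the physical cores, but at least one."""
    return max(1, physical_cores // 2)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            cores = count_physical_cores(f.read())
    except OSError:
        cores = 0  # not a Linux host, or /proc is not available
    if cores:
        print(f"{cores} physical cores, suggested vCPU limit: {max_guest_vcpus(cores)}")
```

For the quad-core i7 example this gives 2 vCPUs.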
I asked whether the disk in the host was encrypted because I sometimes see similar issues on an Ubuntu host, but there the disk is encrypted. On that host, heavy disk IO, such as a guest OS shutdown or a snapshot, can bring everything to a standstill. It runs fine at all other times. There is nothing in the logs about it (maybe not surprising, as disk IO locks up).
The Intel NUC 7 has a quad-core processor, and I've assigned 2 virtual CPUs to the VM. I'm going to monitor system load with top the next few times I shut down the VM. Yesterday the problem didn't show up.
It could also be a Windows 10 problem. What version of Windows 10 are you running? (Settings-System-About, 1909 or something)
You probably know this, but in the end your "Ubuntu system" might be frozen while Linux itself hardly ever is, and based on your description it is not frozen now either. You can open a terminal, kill the graphics/GUI system and restart it. Or, at a minimum, you can halt Linux cleanly even though the Ubuntu desktop no longer responds.
This can't be due to a lack of resources. I have experienced the exact same issue at the same frequency with Ubuntu 20.04 and a Windows 10 VM.
I'm running a Ryzen 9 3900X (12 cores, 24 threads) with 32GB DDR4 RAM and a GTX 980 graphics card.
This bug has existed up through 15.x and now also in 16.x
Not sure why this is, but I got my suspicions today ...
Shutting down froze with a Windows 10 VM on a Kubuntu 20.04.2 host. According to the Kubuntu System Monitor the system wasn't doing anything, but it said that the disk had been put to sleep.
So, I tried to activate the extra, internal disk with opening Nautilus. It indeed didn't work, it didn't show up files (other than those that came from the buffer). Shutdown was hanging but now there were more applications where it said that disk is put to sleep.
I was able to open a terminal (so Linux was not frozen) and run "shutdown now". That took a very long time, as it tried several times to kill the processes (including vmware and nautilus) and failed. Then, still as part of the shutdown process, it tried to unmount and re-mount the disk a few times and failed. Finally it didn't really shut down but rebooted the system. Everything was fine after that incident.
So, perhaps a Linux guru could say what has happened, but it kind of looks like VMware shutdown does something horribly wrong for the disk and puts it into sleep in the middle of the shutdown process. There are kernel modules in VMware and I guess this is possible (somebody confirm this, please).
My entire system didn't hang, but that was because the disk holding my VMware files on this workstation was NOT the system disk. Since all attempts to use the extra internal disk hung, I think the entire system would have hung if VMware had been running from the system disk.
(System resources were OK, about 6/16 GB allocated, 2/4 cores on i7 were used and like the description tells, system was basically idle while hanging. I don't have any high-end, recent graphics adapter. VMware Player version is this: 16.1.0 build-17198959. Windows 10 version is H2 something, very recent.)
PERHAPS A WORKAROUND (not tried, yet):
If you make sure that the disk you use is doing something while you shut down the Win10 VM, perhaps putting the disk to sleep isn't possible?
My scenario is different (and I think the TS's scenario is too).
As he says, after a bit, you can do nothing. No key press, no starting a new terminal or anything else.
Not sure if you can still move the mouse, I think that was indeed the only thing, but I don't use that system often, so have trouble remembering the details.
Also I'm not on Ubuntu 20.04, probably 18.04 (need to check).
Only thing I've been able to do is "hard shutdown" after the system gets in this state. Switching to a text terminal doesn't work and AFAICR the ctrl+alt+backspace key combo to kill the X-server does not work on 'buntu. Hasn't worked for years, but it probably can be re-enabled.
Maybe.. it's been a while since I've seen the issue. Whenever I use that system nowadays I'm terribly careful not to increase the disk IO by too much. I will try to diagnose it again once I have time. Like you said, it isn't always easy to make it fail.
FWIW, I just confirmed it is indeed Ubuntu 18.04. The system also had the same issues with Ubuntu 16.04 and I've never seen this happen on other Linux systems.
Don't know about SSDs; mine was an HDD (a recent one, with rather decent performance for casual testing).
I don't know if it has anything to do with the matter, but networking and disk I/O on the Windows 10 VM were rather heavy prior to the failure (just today I created a Win11 Dev ISO in the Win10 VM using the uupdump.net functionality, and copied the ISO to a shared folder on the host).
When trying to quickly get it to fail by starting and shutting down, it doesn't fail. I will try the same, with external disks, SSDs, if it fails again. I take copies and thus it doesn't matter if it gets destroyed in the process 🙂 .
After all, I was able to reproduce the problem. After creating a Windows 11 ISO, which involves rather intensive writing to disk AND copying the ISO-file using VMware sharing functionality AND system being idle for some time (an hour or two), the next shutdown was hanging. I could get the attached screenshot, though.
After starting Chromium, the entire system hung. System Monitor was working the entire time, though.
System was as described before, except that I now ran it from m.2 nvme ssd-disk, attached to a USB-2 port. There is no disk sleep activated in Power Management.
This is now pretty much in line with the experiences mentioned in this thread. It is also reproducible: just create a Windows 11 Pro ISO from Dev, using "uupdump.net" on the Win10 VM (Tom's Hardware Guide has the instructions, if needed) and then do a shutdown. Perhaps copying with the VMware share functionality is also required. I would expect that a similar operation with lots of writing to the disk in the VM does the trick as well.
That's great detective work!
Note that "disk sleep" is not actually a sleeping disk. This is a special process state for a task.
IOW, the process is waiting for IO and not getting any.
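To see which tasks are stuck like that during a hang, you can look for processes in state D, which /proc labels "disk sleep" (uninterruptible sleep). This is just a sketch, assuming a Linux host with /proc mounted; the function names are made up for illustration.

```python
import os


def parse_stat(text):
    """Return (pid, comm, state) from the content of /proc/<pid>/stat.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split on the first "(" and the last ")".
    """
    open_paren = text.index("(")
    close_paren = text.rindex(")")
    pid = int(text[:open_paren])
    comm = text[open_paren + 1:close_paren]
    state = text[close_paren + 1:].split()[0]
    return pid, comm, state


def uninterruptible_processes(proc_root="/proc"):
    """List (pid, comm) for every process currently in state D."""
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError):
            continue  # the process exited or its stat file was unreadable
        if state == "D":
            blocked.append((pid, comm))
    return blocked


if __name__ == "__main__":
    if os.path.isdir("/proc"):
        for pid, comm in uninterruptible_processes():
            print(pid, comm)
```

Processes pass through state D briefly all the time, so a single hit means little. Tasks such as vmware-vmx or nautilus staying in that list for many seconds would match what System Monitor reported.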
As this is on Ubuntu and I've not seen it elsewhere, it makes me wonder if Ubuntu has some special IO scheduling going on.
Can you please check which IO scheduler is active for the disk your VM is running on? Supposing your disk is "sda", the command to check is:
The mode with the square brackets is what is currently set. It would be good to see the full output though. FWIW, the problem machine I have runs the VM from an SSD and I'm going to check this myself later on too.
edit: added example output
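For anyone who wants to check all disks in one go, here is a minimal sketch. It assumes the scheduler is exposed under /sys/block/<device>/queue/scheduler, as on current Linux kernels, and that the active one is the name in square brackets; the function names are made up for illustration.

```python
import glob
import os


def parse_scheduler(text):
    """Return (active, available) from a line such as 'noop deadline [cfq]'."""
    active = None
    available = []
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name  # the bracketed entry is the one in use
        available.append(name)
    return active, available


def schedulers(sys_block="/sys/block"):
    """Map each block device name to its (active, available) schedulers."""
    result = {}
    pattern = os.path.join(sys_block, "*", "queue", "scheduler")
    for path in glob.glob(pattern):
        device = path.split(os.sep)[-3]
        try:
            with open(path) as f:
                result[device] = parse_scheduler(f.read())
        except OSError:
            continue  # device vanished or file not readable
    return result


if __name__ == "__main__":
    for device, (active, available) in sorted(schedulers().items()):
        print(f"{device}: active={active} available={' '.join(available)}")
```

If a device prints active=None, its scheduler file had no bracketed entry; in that case look at the raw file content instead.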