FreddyFredFred
Hot Shot

windows 10 1809 slow

I downloaded the Windows 10 1809 and Server 2019 ISOs the day they became available so I can start working on my templates.

I built the templates with EFI, paravirtual for the C drive and vmxnet3 adapter. I've been using this combo for other versions of Windows 10/8/7 and Windows Server 2008 R2/2012 R2/2016 without issue.

So far Windows 2019 (with desktop experience) seems to be ok at least for a basic vm and guest customization. Haven't tried anything else yet.

Windows 10 1809, on the other hand, is very, very slow to reboot after the initial install, or even just when rebooting after making some changes post-install. After installing the OS it took 10-15 minutes for the initial Windows setup screens (user, security settings, etc.) to appear. I tried a VM set to BIOS and it seemed faster but was still quite slow. Server 2019 and other versions of Windows 10 have no issue.

The hosts are ESXi 6.5 and 6.7.

I haven't had a chance to try every combo of BIOS/EFI/vmxnet3/e1000e/paravirtual/lsi sas to see if one is the cause of the issue but was wondering if anyone else had noticed any issues or if it was just me?
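For anyone who wants to work through those combinations systematically, here is a minimal sketch that just enumerates them. The option values come from the post above; the dictionary keys and the function name `test_matrix` are made up for illustration, and nothing here talks to vSphere.

```python
from itertools import product

# Option values named in the post; the keys are illustrative labels only.
OPTIONS = {
    "firmware": ["BIOS", "EFI"],
    "nic": ["vmxnet3", "e1000e"],
    "scsi_controller": ["paravirtual", "lsi sas"],
}


def test_matrix(options=OPTIONS):
    """Return every combination of the options as a list of dicts."""
    keys = list(options)
    return [
        dict(zip(keys, combo))
        for combo in product(*(options[k] for k in keys))
    ]


if __name__ == "__main__":
    # Print a numbered checklist of VM configurations to try.
    for number, combo in enumerate(test_matrix(), start=1):
        print(number, combo)
```

With two choices for each of the three settings, that is eight test VMs in total.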

Thanks

126 Replies
MMAgeek
Enthusiast

We also experienced this issue on ESXi 6.5 / VMFS 6 with Windows 10 1809. Moving back to VMFS 5 datastores fixed the issue. Hopefully VMware will resolve this soon.

nschlip
Contributor

Hello everyone,

I have an update to share on this - it may be fairly "wordy" (I know that's not a word), but here goes....

Issue

Linked-clone Windows 10 VMs take a considerable amount of time (5-30 minutes, depending) to complete a guest OS restart cycle.

Troubleshooting

To make a long story short, I eventually noticed that disk IO was ZERO during a Windows 10 VM restart. It zeroes out until the VM gets to the Windows 10 logo with the spinning circles, then the disk IO suddenly spikes once it gets to the login screen. From this, I began to look at disk configuration settings directly on the VM (vSphere -> right-click VM -> Edit). I noticed the disk was using LSI SAS. I reviewed the event logs and noticed (on several Win10 VMs I tested with) a 10+ minute span of LSI_SAS event warnings. From there, I created a brand new Windows 10 master/gold VM using Paravirtual SCSI instead of LSI SAS. I then sanitized the master VM and created a new pool. The Win10 linked-clone VMs are now down to 5 minute restarts. Much better, but still not great by any means.
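If you export the event log and want to measure that span rather than eyeball it, something like the following rough sketch could work. It assumes you have already parsed the events into (timestamp, source, level) tuples; that tuple layout, the function name `warning_span`, and the sample values are all hypothetical and not taken from a real log.

```python
from datetime import datetime, timedelta


def warning_span(events, source="LSI_SAS"):
    """Return the time between the first and last warning from `source`.

    `events` is an iterable of (timestamp, source, level) tuples.
    Returns None when there are no matching warnings.
    """
    times = sorted(
        ts for ts, src, level in events
        if src == source and level == "Warning"
    )
    if not times:
        return None
    return times[-1] - times[0]


# Made-up sample events for illustration only.
SAMPLE = [
    (datetime(2019, 1, 1, 9, 0, 0), "LSI_SAS", "Warning"),
    (datetime(2019, 1, 1, 9, 4, 30), "LSI_SAS", "Warning"),
    (datetime(2019, 1, 1, 9, 5, 0), "OtherSource", "Information"),
    (datetime(2019, 1, 1, 9, 11, 0), "LSI_SAS", "Warning"),
]

if __name__ == "__main__":
    print(warning_span(SAMPLE))
```

A span that roughly matches the restart delay would point at the storage controller path rather than at something inside Windows setup.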

We then noticed that the .vmdk on the VM (all the linked-clone VMs, in fact) was a "vmname.checkpoint.vmdk". The "checkpoint" in the .vmdk name indicates it's using a snapshot from the master VM. After many hours of testing, we found a sort of fix/workaround....

Fix

If you:

1. Shut down the VM and migrate the storage (storage only)

2. Select configure per disk

3. Change the storage datastore for each disk on the VM, click Next, let it complete (may take anywhere from a few minutes to several, depending on disk size)

4. You should then have an alert on the VM indicating a consolidation is needed

5. Right-click the VM -> Snapshots -> Consolidate

6. Once the consolidation is finished, right-click the VM and notice that you can now change the disk size (if needed)

7. Power up the VM, log in, and restart

The restart time is now down to under 30 seconds (I've seen it as low as 7-8 seconds). I tested this using thick and thin disk provisioning, and it didn't seem to make a difference either way. With 100% certainty (at least in my case) it has to do with the VM running off a "checkpoint.vmdk" disk. Once the VM's disk is pointed back to its own self-named .vmdk, the restart issue is resolved.
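To spot which VMs are still running off a checkpoint disk, a simple file name check is enough once you have a list of disk paths. This is only a sketch: it assumes the ".checkpoint.vmdk" naming described above, the function names are made up, and the example paths in the usage note are hypothetical. How you collect the disk paths (PowerCLI, the vSphere client, etc.) is up to you.

```python
def is_checkpoint_disk(vmdk_path):
    """True if the file name ends in '.checkpoint.vmdk'."""
    # Keep only the file name, whatever the path separator is.
    filename = vmdk_path.replace("\\", "/").rsplit("/", 1)[-1]
    return filename.lower().endswith(".checkpoint.vmdk")


def disks_needing_attention(vmdk_paths):
    """Return the paths that still point at a checkpoint disk."""
    return [path for path in vmdk_paths if is_checkpoint_disk(path)]
```

For example, given the made-up paths "[datastore1] win10-01/win10-01.checkpoint.vmdk" and "[datastore1] win10-02/win10-02.vmdk", only the first one is returned, which marks that VM as a candidate for the migrate and consolidate steps.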

Now to provide some additional context to this:

In vSphere, I create a new VM, set configuration options, point to an .iso, power up VM and proceed through our imaging process. Once this is complete, I log into the VM using a domain admin account, configure the OS, sanitize (prep it for cloning), and take a single snapshot.

I then create (or edit) a View pool (for Win10 testing they have all been persistent/dedicated pools) which provisions VMs off the snapshot from that new gold VM. Something interesting I noticed earlier today during this composing process: when the linked clone (we unfortunately do not have licensing for instant clones) is being provisioned, the disk being used is its own named "vmname.vmdk" - great! However, once provisioning is complete, the disk changes to "vmname.checkpoint.vmdk".

Anyone have any insight into why that occurs? Why wouldn't the cloned VM just continue to use its own named .vmdk? Any helpful answer on that would be much appreciated!

At any rate, we're able to fix/work around the Win10 restart slowness by, again, migrating the VM's storage only to a new datastore, consolidating under Snapshots, powering up the VM, and BAM - fixed.

I hope all of this makes sense - if anyone is still experiencing Win10 restart slowness and is able to try this storage migration/consolidate process, please give it a try and let me know the result. So far it's been consistent for me, but I'd like to hear what others' experience is.

This almost feels like a possible bug in vCenter or View somewhere, but I can't say for sure - anyone with VMware have any thoughts to share?

Also, fwiw, we're running a Pure storage array (not vSAN) with plenty of space available and excellent dedupe. Also, we're using EFI (though I've tried EFI and BIOS, neither seems to make a difference, as far as restart times are concerned).

-Nathan

FreddyFredFred
Hot Shot

I've never used View, but based on your comments, and especially the comment in step 6 (you can expand the hard drive), it sounds like you essentially removed the snapshot on the VM, which is the fix to make things fast again. Maybe there's a bug in View, or maybe it's working as designed but it's never been noticed before because 1809 is the first version with this issue.

pdudas76
Enthusiast

Horizon 7.6

ESXi 6.5 P03

Nimble

I can confirm, at least in this environment, that migrating an 1809 VM with BIOS from VMFS5 to VMFS6 brought the boot time down to 11 seconds for me. Previously it was at least 5 minutes.

This is WITH snapshots still existing.

nschlip
Contributor

Is that a linked-clone VM using a checkpoint.vmdk disk?

pdudas76
Enthusiast

In most of my tests I was building a base for linked clones, so the tests were just booting up and configuring the base.

Though, I did just create some linked clones from it to test the boot-up there. On VMFS6 it takes over 5 minutes to boot up. Migrating those linked clones to VMFS5 decreases that greatly, but only to about 20-30 seconds.

Side note: I am having a different problem though so not sure if that is what is causing the discrepancy between the bootup time for the base and the clones. When logging into the Linked Clones I am getting an error that Windows has created a temporary paging file. Working with VMware on that since I am configuring the paging file in the same way I have done all along. And in 1803 KB4480976 breaks the Paging file on clones, and reverting back to KB4480966 makes it work again.

sWORDs
VMware Employee

Could everyone post the SR numbers, plus provide more info about the storage?

Anyone using Intel 3xxx/4xxx (non-Optane) backed local/vSAN storage?

Poom22
Enthusiast

Same for me guys on 1809

VMFS6 Pure storage

FreddyFredFred
Hot Shot

3PAR storage (all flash) for me. Will PM you the SR #.

nschlip
Contributor

We're running Pure storage on VMFS5

So, something else interesting I've learned as of late from research, from speaking with VMware support engineers, and then directly from some inside customer success/engineering VMware contacts: they have all recommended basically the same thing. Try using Windows 10 1803 or switch to Windows 10 *LTSC (formerly known as *LTSB).

We're currently running Semi-Annual Channel 1809. The problem with LTSC is that certain Windows features are disabled/turned off, some of which we want to use. Even the features that will supposedly slow down boot times, we have many of those disabled via group policy anyway. So here's the thing: Windows 10 LTSC was just released last November (11.13.18) as version 1809, OS Build 17763.292. That's the EXACT same OS build and version we're running right now on Semi-Annual Channel. Does that mean even if I switch to LTSC 1809, I'll have the same issues?

Well, I asked that question, and the response was basically to use LTSB - however, the last version of that was released August 2nd 2016 - yes, there have been revisions as recent as January 17th of this year, but we want to be on a version of Windows 10 that's more recent, not three years ago.

As for running Win10 1803, the only release for that is on Semi-Annual Channel and there is no release of 1803 that falls under LTSB/LTSC, so is that even worth trying?

I've also heard that the reason for running Windows 10 LTSB (which is version 1607) is that it's been out for some time and is far more compatible and works much better with Horizon View. Basically, the way in which Microsoft releases Win10 versions / OS builds (which I personally have no problem with, I think it's great) makes it difficult for VMware to keep their products compatible with Windows 10.

LTSB = Long-term Servicing Branch, which has been switched to....

LTSC = Long-term Servicing Channel

(basically, they're the same thing)

For reference: Windows 10 - release information | Microsoft Docs

jse8619
Contributor

Just chiming in, we're experiencing the same.

vCenter 6.7.0.20000

ESXi 6.5u2 (8285314)

Horizon 7.6

VMFS 6

EMC XtremIO

Happens on both Windows 10 1809 Enterprise N and Windows 10 LTSC 2019 (1809 LTSC).

I noticed this when I snapped the templates I was building off of 1809. Boot time goes up by 2-4 minutes (it varies), and it gets exponentially longer the more times you snapshot the VM.

Didn't attempt doing any linked or instant clones yet before I saw this thread, so I don't know how those will fare.

BenFB
Commander

We are in the same boat. LTSB/LTSC is not an option due to our needs. We are currently on 1703 and planning to move to 1809 although this issue might impact that. The Q3/Q4 releases like 1709 and 1809 carry a longer support window from Microsoft so I think they are preferable over LTSB/LTSC.

BenFB
Commander

FreddyFredFred do you have a PR or SR that you can share? I'm going to lean on my account team for a status update.

FreddyFredFred
Hot Shot

Last I heard from support was 2 weeks ago, when I was told VMware and Microsoft were still working on finding the root cause. I've just asked for another update and will post back with what they say. If they don't give me anything useful I'll consider sharing the SR.

BenFB
Commander

FreddyFredFred

If you could share your SR or PR I would really appreciate it. That will allow my account team to look it up and check the status.

LukaszDziwisz
Enthusiast

My SR is 19058118801. The PR was not given to me, as I was told that it is confidential.

Andreasbuchwald
Contributor

Hi,

We have been running into this problem for a very long time now (native, in VDI, ....) and have had SRs with VMware and Microsoft.

Today I was informed by VMware that they found something new in build 1809 called "VSCSIExecuteGetLBAStatus" that is causing the problem.

Microsoft has now been asked for a workaround until they fix this issue. Let's hope MS starts working on this bug now!

Andy

Andreasbuchwald
Contributor

Hi all,

Hopefully good news.

VMware called: Microsoft has stopped working on this issue because VMware found something going wrong in VMFS6 and is already working on a hotfix, which should be coming out soon.

Andy

BadEclipse2
Contributor

Hi There,

I also have this problem with VMware Horizon 7.7 on vCenter 6.7, using a Windows 10 version 1803 template and View Composer. After a restart command, the VDI takes about 10-15 minutes to restart.

Initially my datastore was on VMFS6; now it is on VMFS5, but the symptoms remain.

Does anyone have a solution, or is the solution to wait for Microsoft?

Thank you.

sjesse
Leadership

This is probably a different issue. I'd post it in the Horizon community, but I suggest trying to make a new parent following the steps on this VMware page:

Creating an Optimized Windows Image for a VMware Horizon Virtual Desktop | VMware
