Are you using EFI boot or BIOS mode for your master image? I do see the same problem with EFI mode. Moving to VMFS5 datastore does not resolve it for me with EFI mode.
Windows 10 1809 LTSC, vSphere 6.7, Horizon 7.6, AppVolumes 2.15
Here is what I'm seeing:

EFI mode, 2 snapshots:
VMFS6 - boot time 3 minutes
VMFS5 - boot time 3 minutes

BIOS mode, 2 snapshots:
VMFS6 - boot time 1 minute 35 seconds
VMFS5 - boot time 8 seconds
Going from memory at this point but I think I saw it with both BIOS and EFI. Maybe BIOS was a bit better but still not what it should be.
Having the same issue here - Win10 1809 x64 gold VM (master) reboots just fine, typically in less than 30 seconds (around 26 seconds). However, all linked-clone VMs (we have VMware Advanced licensing, so no instant clones, UEM, etc.) are taking upwards of 20-30 minutes.
5x ESXi 6.7 U1 hosts
Tested with both EFI and BIOS on Win10 1809. I have a support ticket in with VMware, but it's been difficult to connect at the same time. It sounds like someone else on here has a support ticket open, and they've escalated it to engineering and are awaiting a response? If anyone has any input it'd be greatly appreciated. As it stands now, rebooting a Win10 VM takes far too long to fully boot to the login screen.
I've had a ticket open for a while now, and the last communication I got from VMware was on Tuesday, saying: "VMWare Engineering team is currently working with Microsoft to root cause the issue and provide a fix."
Before that, I got an email saying they have a bug matching my issue and are working on it.
Thanks for the info, good to know - we've built a new Win10 gold VM in View, and are going to test with it. I'll let you know how it turns out.
We also experienced this issue on ESXi 6.5 / VMFS6 and Win10 1809. Moving back to VMFS5 datastores fixed the issue. Hopefully VMware will resolve this soon.
I have an update to share on this - it may be fairly wordy, but here goes....
Linked-clone Windows 10 VMs take a considerable amount of time (5-30 minutes, depending) to complete a guest OS restart cycle.
To make a long story short, I eventually noticed that disk IO was ZERO during a Windows 10 VM restart - it zeroes out until the VM gets to the Windows 10 logo with the spinning circles, then suddenly spikes once it gets to the login screen. From this, I began to look at disk configuration settings directly on the VM (vSphere -> right-click VM -> Edit). I noticed the disk was using LSI SAS. I reviewed event logs, and noticed (on several Win10 VMs I tested with) there was a 10+ minute span of time with LSI_SAS event warnings. From there, I created a brand new Windows 10 master/gold VM using Paravirtual SCSI instead of LSI SAS. I then sanitized the master VM and created a new pool - the Win10 linked-clone VMs are now down to 5-minute restarts. Much better, but still not great by any means.
We then noticed that the .vmdk on the VM (all the linked-clone VMs, in fact) was a "vmname.checkpoint.vmdk" - the "checkpoint" in the .vmdk name indicates the VM is running off a snapshot from the master VM. After many hours of testing, we found a sort of fix/workaround....
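If you want to check a whole pool for this condition rather than eyeballing each VM, a quick pattern match over the disk file names does it. This is just a sketch: the disk file list would come from your own inventory export (e.g. PowerCLI's `Get-HardDisk`, or the vSphere API), and the sample names below are made up for illustration.

```python
import re

# Matches the linked-clone checkpoint naming described above,
# e.g. "[ds1] win10-lc-01/win10-lc-01.checkpoint.vmdk"
CHECKPOINT_RE = re.compile(r"[.-]checkpoint\.vmdk$", re.IGNORECASE)

def runs_on_checkpoint(disk_filename: str) -> bool:
    """True if the VM's active disk looks like a checkpoint vmdk."""
    return bool(CHECKPOINT_RE.search(disk_filename))

def flag_clones(disks: dict[str, str]) -> list[str]:
    """Given {vm_name: active_disk_filename}, return VMs still on a checkpoint disk."""
    return [vm for vm, fname in disks.items() if runs_on_checkpoint(fname)]
```

Any VM this flags would be a candidate for the migrate-and-consolidate steps below.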
1. Shut down the VM and migrate the storage (storage only)
2. Select configure per disk
3. Change the storage datastore for each disk on the VM, click Next, let it complete (may take anywhere from a few minutes to several, depending on disk size)
4. You should then have an alert on the VM indicating a consolidation is needed
5. Right-click the VM -> Snapshots -> Consolidate
6. Once the consolidation is finished, right-click the VM - notice that you can now change the disk size (if needed)
7. Power up the VM, log in, and restart
The restart time is now down to under 30 seconds (I've seen it as low as 7-8 seconds). I tested this using both thick and thin disk provisioning, and neither seemed to make a difference. With 100% certainty (at least in my case) it has to do with the VM running off a "checkpoint.vmdk" disk - once the VM's disk is pointed back to its own self-named .vmdk, the restart issue is resolved.
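For anyone who has to apply these steps across a whole pool, the manual procedure can in principle be scripted against the vSphere API. Here's a rough, untested sketch with pyVmomi - the task methods (`RelocateVM_Task`, `ConsolidateVMDisks_Task`) and the `runtime.consolidationNeeded` flag are real API names, but the `vm` and `target_datastore` lookups, and the crude task poller, are placeholders; treat this as an outline, not a drop-in script.

```python
def fix_vm(vm, target_datastore):
    """Mirror the manual workaround: storage-only migrate, then consolidate."""
    from pyVmomi import vim  # imported here so the module loads without pyVmomi

    # Steps 1-3: storage-only migration to the new datastore
    spec = vim.vm.RelocateSpec(datastore=target_datastore)
    wait_for(vm.RelocateVM_Task(spec))

    # Steps 4-5: vCenter should now flag "consolidation needed"; do it
    if vm.runtime.consolidationNeeded:
        wait_for(vm.ConsolidateVMDisks_Task())

def wait_for(task):
    """Minimal task poller (placeholder; pyVim.task.WaitForTask is the usual way)."""
    import time
    while task.info.state not in ("success", "error"):
        time.sleep(2)
    if task.info.state == "error":
        raise RuntimeError(task.info.error)
```

You'd still want to power the VM off first and power it back up afterward (steps 1 and 7), same as in the manual process.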
Now to provide some additional context to this:
In vSphere, I create a new VM, set configuration options, point to an .iso, power up VM and proceed through our imaging process. Once this is complete, I log into the VM using a domain admin account, configure the OS, sanitize (prep it for cloning), and take a single snapshot.
I then create (or edit) a View pool (for Win10 testing they have all been persistent/dedicated pools) which provisions VMs off the snapshot from that new gold VM. I noticed something interesting during this composing process earlier today - while the linked clone (we unfortunately do not have licensing for instant clones) is being provisioned, the disk being used is its own named "vmname.vmdk" - great! However, once provisioning is complete, the disk changes to "vmname.checkpoint.vmdk".
Anyone have any insight into why that occurs? Why wouldn't the cloned VM just continue to use its own named .vmdk? Any helpful answer on that would be much appreciated!
At any rate, we're able to fix/work around the Win10 restart slowness by, again, migrating the VM's storage (storage only) to a new datastore, consolidating under Snapshots, and powering the VM back up - and BAM, fixed.
I hope all of this makes sense - if anyone is still experiencing Win10 restart slowness and is able to try this storage migration/consolidation process, please give it a try and let me know the result. So far it's been consistent for me, but I'd like to hear what others experience.
This almost feels like a possible bug in vCenter or View somewhere, but I can't say for sure - anyone from VMware have any thoughts to share?
Also, FWIW, we're running a Pure Storage array (not vSAN) with plenty of space available and excellent dedupe. We're using EFI (though I've tried both EFI and BIOS, and neither seems to make a difference as far as restart times are concerned).
I've never used View, but based on your comments - especially the comment in step 6 (you can expand the hard drive) - it sounds like you essentially removed the snapshot on the VM, which is the fix to make things fast again. Maybe there's a bug in View, or maybe it's working as designed but has never been noticed before because 1809 is the first version with this issue.
ESXi 6.5 P03
I can confirm, at least in this environment, that migrating an 1809 VM with BIOS from VMFS6 to VMFS5 decreased the boot time to 11 seconds for me. Previously it was at least 5 min.
This is WITH snapshots still existing.
Is that a linked-clone VM using a checkpoint.vmdk disk?
On most of my tests I was building a base for linked clones, so the tests were just booting up and configuring the base.
Though I did just create some linked clones from it to test the boot-up there. On VMFS6 it takes over 5 minutes to boot. Migrating those linked clones to VMFS5 decreases that greatly, but only to about 20-30 seconds.
Side note: I am having a different problem, though, so I'm not sure if that's what is causing the discrepancy between the boot-up time for the base and the clones. When logging into the linked clones I am getting an error that Windows has created a temporary paging file. I'm working with VMware on that, since I am configuring the paging file the same way I have all along. Also, on 1803, KB4480976 breaks the paging file on clones, and reverting back to KB4480966 makes it work again.
Could everyone post the SR numbers, plus provide more info about the storage?
Anyone using Intel 3xxx/4xxx (non Optane) backed local/VSAN storage?
Same for me on 1809.
VMFS6, Pure Storage.
3PAR storage (all-flash) for me. Will PM you the SR #.