VMware Horizon Community
VSprague
Hot Shot

Instant Clone Publishing fails

I upgraded my 7.x environment to 7.10 and, using a new Windows 10 image, tried to build a new Instant Clone pool. However, the pool fails to publish. It creates a cp-template VM and joins it to the domain, but then stalls and displays this message:

Publish Error:

  • Fault type is UNKNOWN_FAULT_FATAL - After waiting for 300 seconds internal template VM vm-998 is still not powered off. Giving up!

The time on the VM is correct, VMware Tools is up to date, and the hosts are running 6.7U2 with all patches installed. So far VMware support hasn't been able to resolve the issue either. I'm not sure where to go next. Composer and full machine clones both work fine, so I'm probably just going to go back to using Composer, as it has proven to be more reliable.

Accepted Solutions
VSprague
Hot Shot

Well, VMware support finally came up with a solution.

Within this registry key:

HKLM\Software\VMware, Inc.\ViewComposer\ga

I had to create this DWORD entry:

GPUpdateEnabledOnIT, set to 0

After that I was able to create a new snapshot and deploy it without issue. I've been able to deploy snapshots from multiple 1903 VMs, so hopefully this issue is resolved.
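
For anyone who would rather import the value than create it by hand, here is a minimal sketch that writes out the text of a .reg file for the key and value named above. The helper name `render_reg_file` is hypothetical, and the thread does not say which machine the key belongs on, so treat this as an illustration rather than what support actually ran.

```python
def render_reg_file(key_path, value_name, dword):
    """Return .reg file text that sets a single DWORD value under key_path."""
    if not 0 <= dword <= 0xFFFFFFFF:
        raise ValueError("a DWORD must fit in 32 bits")
    lines = [
        "Windows Registry Editor Version 5.00",  # standard .reg header
        "",
        f"[{key_path}]",
        f'"{value_name}"=dword:{dword:08x}',      # DWORDs are 8 hex digits
        "",
    ]
    return "\n".join(lines)


# Key and value name are taken from the post; HKLM is spelled out in full.
REG_TEXT = render_reg_file(
    r"HKEY_LOCAL_MACHINE\SOFTWARE\VMware, Inc.\ViewComposer\ga",
    "GPUpdateEnabledOnIT",
    0,
)

if __name__ == "__main__":
    print(REG_TEXT)
```

Saving the printed text as a .reg file and importing it should give the same result as creating the entry in regedit, but check the value afterwards before taking a new snapshot.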

27 Replies
sjesse
Leadership

Follow this guide exactly:

Creating an Optimized Windows Image for a VMware Horizon Virtual Desktop | VMware

Another thing to check, if support didn't: make sure you run ipconfig /release before shutting the image down and taking a snapshot.

Kishoreg5674
Enthusiast

First you need to prevent the cp-template from being destroyed after the customization failure. Enable debug mode by following the document below:

Troubleshooting Instant Clones in the Internal VM Debug Mode

Re-enable provisioning on the affected Instant Clone desktop pool, then log in to the VM and check the customization log, which is located at C:\Windows\Temp\vmware-viewcomposer-ga-new.log.

Possible causes for customization failures are:

* APIPA address.

* Domain join failure.

* Windows volume license activation.
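
To rule out the first cause quickly, a small sketch like the one below can flag an APIPA address. It only assumes that "APIPA" means the IPv4 link-local range 169.254.0.0/16; the function name `is_apipa` is made up for this example.

```python
import ipaddress

# APIPA addresses are self-assigned from the IPv4 link-local block.
APIPA_NETWORK = ipaddress.ip_network("169.254.0.0/16")


def is_apipa(address):
    """Return True if address is an IPv4 address inside 169.254.0.0/16."""
    ip = ipaddress.ip_address(address)
    return ip.version == 4 and ip in APIPA_NETWORK


if __name__ == "__main__":
    for candidate in ("169.254.10.20", "10.0.0.5"):
        print(candidate, is_apipa(candidate))
```

If the cp-template or a clone reports an address in that range, it did not get a DHCP lease, which would also explain a later domain join failure.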

VSprague
Hot Shot

I had instant clones working before the upgrade to 7.10, which makes me wonder if there is some kind of bug in 7.10 causing this. And yes, I'm making sure to perform an ipconfig /release before I shut down and snapshot the VM.

sjesse
Leadership

I've built a few test environments in 7.10 and haven't had any issues, either upgrading from 7.4 to 7.10 or starting fresh, but that doesn't mean you're wrong. The other thing I can think of is to try deleting any existing AD objects and see if that helps. I'd also run the ViewDbChk script and see if that can clear out any inconsistencies. I'd hate to see you fall back to linked clones, as they are deprecated from what I understand and won't be available at some point.

Resolving Database Inconsistencies with the ViewDbChk Command

VSprague
Hot Shot

Yup, VMware support had me do that; the cp-template VMs aren't being nuked. I've attached the log from one of the last attempts. The log complains about a time sync failure, but the cp-template VM has the correct time on it, so I don't know why that is happening. And like I said, VMware Tools is up to date.

sjesse
Leadership

If the parent image is joined to the domain, try removing it from the domain if you haven't already, and also make sure the time is correct.

VSprague
Hot Shot

VMware ran ViewDbChk yesterday and it found nothing. And yeah, I don't want to go back to Composer either. But so far Instant Clones, at least with 7.10, haven't worked reliably, and I've been unable to figure out why. This is apparently what I get for upgrading from 7.9 to 7.10.

I've tried having the master image both domain joined and not joined; it has no effect on the results. And the time is correct whenever I check it.

Kishoreg5674
Enthusiast

It looks like the internal template (IT) was joined to the domain successfully, but the cloned VM later failed to join the domain. That is, the issue occurs during the cp-template deployment, prior to the creation of the snapshot on the cp-template and the cloning of the cp-replica.

2019-10-30 14:34:06,970 [3604] DEBUG Guest  -  ["Guest.cpp", 2926] Hostname is valid

2019-10-30 14:34:07,014 [3604] DEBUG Guest  -  ["Guest.cpp", 2855] Password Checksums Match

2019-10-30 14:34:07,014 [3604] DEBUG Guest  -  ["Guest.cpp", 2983] Machine Password Is Valid

2019-10-30 14:34:07,015 [3604] INFO  Guest  -  ["Guest.cpp", 3012] Attempting to join it1461786060 to domain: domain.com using preferred DC: domain.com\ind-dc02.domain.com

2019-10-30 14:34:15,982 [3604] DEBUG Guest  -  ["Guest.cpp", 3555] Cleared guestinfo.machinePasswd

2019-10-30 14:34:15,986 [3604] DEBUG Guest  -  ["Guest.cpp", 3572] Cleared guestinfo.machinePasswdChecksum

2019-10-30 14:34:15,986 [3604] INFO  Guest  -  ["Guest.cpp", 3166] Domain Join successful

2019-10-30 14:34:15,986 [3604] DEBUG Guest  -  ["Guest.cpp", 3059] IT Domain Join Succeeded With Preferred DC

2019-10-30 14:34:15,987 [3604] DEBUG Guest  -  ["Guest.cpp", 605] Template: Domain Join Completed

2019-10-30 14:34:25,990 [3604] DEBUG Guest  -  ["Guest.cpp", 624] Template: Marked Template Rebooted

2019-10-30 14:34:25,990 [3604] DEBUG Guest  -  ["Guest.cpp", 626] Template: About To Reboot

The IC agent then reports that the machine password is invalid because of invalid arguments. This happens when the machine password is empty, and this is how the failure would show up on the replica.

2019-10-30 14:46:33,602 [3628] DEBUG Guest  -  ["Guest.cpp", 3810] Failed Getting Data: info-get guestinfo.inter-agent.vmpath

2019-10-30 14:46:33,603 [3628] DEBUG Guest  -  ["Guest.cpp", 2641] Failed getting guestinfo.AgentCustomizationFlags. Assuming no flags set

2019-10-30 14:46:33,603 [3628] DEBUG Guest  -  ["Guest.cpp", 2648] Agent Customization Flags: (0), CUSTOMIZATION_FLAG_NONE

2019-10-30 14:46:33,603 [3628] DEBUG Guest  -  ["Guest.cpp", 1174] Current Customization Flags: (0), CUSTOMIZATION_FLAG_NONE

2019-10-30 14:46:33,603 [3628] DEBUG Guest  -  ["Guest.cpp", 958] Starting clone customization...

2019-10-30 14:46:33,625 [3628] DEBUG Guest  -  ["Guest.cpp", 2822] Invalid Arguments

2019-10-30 14:46:33,625 [3628] DEBUG Guest  -  ["Guest.cpp", 3305] Machine Password Is Invalid

2019-10-30 14:46:33,625 [3628] DEBUG Guest  -  ["Guest.cpp", 3555] Cleared guestinfo.machinePasswd

2019-10-30 14:46:33,625 [3628] DEBUG Guest  -  ["Guest.cpp", 3572] Cleared guestinfo.machinePasswdChecksum

2019-10-30 14:46:33,627 [3628] DEBUG Guest  -  ["Guest.cpp", 4739] Set guestinfo.clone.CustomizationState to error

2019-10-30 14:46:33,627 [3628] DEBUG NotifyViewAgent::MarkCustomizationFailed  -  ["NotifyViewAgent.cpp", 117] Set NotifyVdmStatusValue to CustomizationFailed(5)

2019-10-30 14:46:33,627 [3628] DEBUG Guest  -  ["Guest.cpp", 1012] Clone: Domain Join Failed

2019-10-30 14:46:33,627 [3628] DEBUG Guest  -  ["Guest.cpp", 964] InitNetwork Failed

2019-10-30 14:46:33,627 [3628] DEBUG Guest  -  ["Guest.cpp", 436] Clone Customization Failed
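
When the customization log is long, it can help to filter it down to the lines that signal a failure. The sketch below assumes every line follows the layout of the excerpts above (timestamp, thread ID, level, component, source file and line, message). The keyword list is a guess at what counts as a failure, not an official list of agent messages.

```python
import re

# Layout inferred from the excerpts above, e.g.
# 2019-10-30 14:46:33,625 [3628] DEBUG Guest  -  ["Guest.cpp", 3305] Machine Password Is Invalid
LINE_RE = re.compile(
    r'^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) '
    r'\[(?P<thread>\d+)\] (?P<level>\w+)\s+(?P<component>\S+)\s+-\s+'
    r'\["(?P<file>[^"]+)", (?P<line>\d+)\] (?P<msg>.*)$'
)

# Assumed failure keywords; adjust to taste.
FAILURE_RE = re.compile(r"fail|invalid|error", re.IGNORECASE)


def find_failures(log_text):
    """Return (timestamp, thread, message) for each line whose message looks like a failure."""
    hits = []
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if match and FAILURE_RE.search(match.group("msg")):
            hits.append((match.group("ts"), match.group("thread"), match.group("msg")))
    return hits


SAMPLE = '''\
2019-10-30 14:34:15,986 [3604] INFO  Guest  -  ["Guest.cpp", 3166] Domain Join successful
2019-10-30 14:46:33,603 [3628] DEBUG Guest  -  ["Guest.cpp", 958] Starting clone customization...
2019-10-30 14:46:33,625 [3628] DEBUG Guest  -  ["Guest.cpp", 3305] Machine Password Is Invalid
2019-10-30 14:46:33,627 [3628] DEBUG Guest  -  ["Guest.cpp", 1012] Clone: Domain Join Failed
'''

if __name__ == "__main__":
    for ts, thread, msg in find_failures(SAMPLE):
        print(ts, thread, msg)
```

Keeping the thread ID in the output makes it easier to separate the template run (3604 above) from the clone run (3628 above).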

Try the action plan below:

  1. On the master VM, change the MMU setting from software to hardware.
  2. Change the DRS migration threshold from 3 to 2. This keeps the replica from being migrated during provisioning.

If the action plan above doesn't resolve the issue, follow the steps below for further investigation:

  1. For the error "vm-998 is still not powered off. Giving up!", log in to the VM and check the event logs to see what is keeping the VM from powering off.
  2. Check the failed VM (it1461786060) and confirm whether guestinfo.machinePasswd and guestinfo.inter-agent.vmpath are set in the extraConfig parameters. This information can be found in the VMX file of that VM.
  3. Also check the vCenter tasks for any errors while taking the snapshot of the cp-template or cloning the cp-replica from the cp-template.
  4. You can also check the debug logs on the broker for the timestamps above to see exactly where the cp-replica cloning process fails.
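
The VMX check in step 2 can be scripted once you have a copy of the file. This is a rough sketch that assumes the usual `key = "value"` VMX line format and treats key names as case-insensitive; the function names are hypothetical. Note that the log above shows the agent clearing guestinfo.machinePasswd after use, so an empty value is not always a fault on its own.

```python
import re

# The two keys named in step 2 above.
REQUIRED_KEYS = ("guestinfo.machinePasswd", "guestinfo.inter-agent.vmpath")

# Assumed VMX line layout: key = "value"
VMX_LINE_RE = re.compile(r'^\s*([^#\s=]+)\s*=\s*"(.*)"\s*$')


def parse_vmx(text):
    """Parse key = "value" lines into a dict with lower-cased keys."""
    settings = {}
    for line in text.splitlines():
        match = VMX_LINE_RE.match(line)
        if match:
            settings[match.group(1).lower()] = match.group(2)
    return settings


def missing_guestinfo(text, required=REQUIRED_KEYS):
    """Return the required keys that are absent or set to an empty string."""
    settings = parse_vmx(text)
    return [key for key in required if not settings.get(key.lower())]


# Made-up VMX fragment for illustration only.
SAMPLE_VMX = '''\
displayName = "it1461786060"
guestinfo.machinePasswd = ""
memSize = "4096"
'''

if __name__ == "__main__":
    print(missing_guestinfo(SAMPLE_VMX))
```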

VSprague
Hot Shot

I tried a few of the things suggested but had no luck. Then, instead of using the new Win10 master image, I went back to the old Windows 7 image to see if that still works, and it does. So the old Windows 7 image, also running the 7.10 agent, works without issue. So there is something wrong with the Win10 image. I just don't know what.

sjesse
Leadership

What Windows 10 version are you using? I couldn't get 1903 to work, and 7.10 actually gave me a warning about using 1903 the first time, so I've been using 1809 with some success.

VSprague
Hot Shot

Yeah, I deployed Enterprise Build 1903 because it was the latest version on the compatibility matrix.

nburton935
Hot Shot

During the Agent install, did you specify Instant Clone as an enabled component? You cannot have both LC and IC enabled on the Agent, so you can’t use the same image for both.

VSprague
Hot Shot

Yes, the Agent installer will not install the Composer option and the Instant Clone option at the same time, which is why I built one VM and then cloned it. So I have one with the Composer option installed (which works fine), and the cloned VM has the Instant Clone option installed. I'm currently building a new Win 10 1903 image from scratch. If it fails again I'll probably go back to 1607 LTSB.

VSprague
Hot Shot

A fresh 1903 build failed, as did a fresh 1607 build. So Windows 10 Instant Clones do not work, for whatever reason. However, Ubuntu and Windows 7 work fine.

Update: For some reason build 1803 worked fine. I'm at a loss as to why, but at least it's working.

sjesse
Leadership

Any possibility of doing a fresh build-out and moving people over? You can use the Cloud Pod function to add the new pod and create the new desktops, then copy the AppStacks and data over. You could also just build it as a new environment and then change DNS once you're ready. That's what I did going from 6 to 7.4. It seems to work out that every time I want to do an update, which isn't often, I have hardware I need to replace as well.

VSprague
Hot Shot

This is already a new environment. I am in the process of moving from 6.2.2 to 7.10, and at this point I don't think there are any issues with the View environment itself. I think there is an issue with either the 7.10 Agent or its compatibility with Windows 10. In either case, I did a fresh build of 3 different Windows 10 clients and only 1803 currently works. Windows 7 and Ubuntu instant clones work fine. I've asked VMware to examine the logs from 1903 to see why it failed, but I have little faith they will be able to tell me what the issue is. At this point I'll proceed with 1803 and just stick to that version.

VSprague
Hot Shot

Spoke too soon about 1803. I installed some of the required software and made a new snap, and then it failed to deploy the new snapshot. So Instant Clones are still not working reliably.

VSprague
Hot Shot

So after some additional testing I have determined that I am able to deploy only when using the very first snapshot. If I have multiple snapshots for a vm and attempt to deploy anything other than the very first snap the process fails. This makes zero sense but so far it's the best I've got.

Hotrod76
Enthusiast

How are you taking your snapshots?  Are you taking them while the master is powered on or powered off?  When I take a snapshot, I power down the master image and take the snap.  After the snapshot process completes, I power on the master image and build a new pool from the newly taken snapshot.  On more than one occasion, I've found that the snapshot I was trying to use got messed up which would cause provisioning errors.  When that happens I delete the snap I took and then take a new one.  This usually solves the issue for me.
