VMware Horizon Community
bradstaenglen
Contributor

VMs stuck at Startup

Hi Everyone,

I recently started facing a strange issue with my View deployment. It only affects about 40 VMs, all of which are part of 2 pools that use dedicated linked-clone VMs running Windows 7, with no refresh time set. The power policy is set to power off, with 10 spares kept powered on.

I built the pool last Thursday, and users are already having issues: after they log off following the first use, the VM won't spin back up for them. When I check the VDM, the status says "Startup" and never changes unless I manually power down the VM. Then the user can get in. I also noticed that when a user logged off and I opened the console to the VM, it took about 30 minutes for the VM to finish logging off.

Does anyone have any idea what is causing the VMs to take so long to log off? I'm sure it's affecting the View agent in some way, so that the VM can't be logged onto unless it is reset.

3 Replies
jvasquezmsac
Contributor

Did you ever find out what the problem is?  We are having the same problem, though we are only using View 5.1.2, and not Horizon yet.  It only happens on one of our physical servers and can be fixed with a live migration of the VMs to another server.  It never happened with View 4.6 on the same physical server.  This is a single large pool with about half set to be available and it seems to always be around 10 VMs stuck in startup.  Almost everything else seems the same as your pool, except these are refreshed on disconnect/logoff and they are not dedicated.

This has happened on different pools, but always on the same physical server. If it helps, these are VM images that were running View 4.6 and were upgraded to View 5.1.2. Also, while there are servers with different specs in the cluster, the server in question is not the only one with its specs. All hypervisors are running the same version of ESXi (5.1.0 build-799733). The View server is running 5.1.2 build-928164.

Update: We may have figured out what it was. We recently moved all our servers and VMs to different subnets and phased out some older DCs. The one server in question did not get its DNS settings updated, so it was still pointing at two DCs that were no longer active. This was affecting the NTP service, and the server's clock was set one hour ahead of the others. After fixing this and rebooting the VMs that were stuck in startup, they are all now available. I will post an update if this turns out to be just a coincidence.
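
For anyone who wants to check for the same thing, here is a rough sketch of comparing host clocks against a reference and flagging the ones that drift too far. The host names, the readings, and the 5-minute tolerance are all made up for illustration; how you actually collect the clock readings from your hosts is up to you and is not shown here.

```python
from datetime import datetime, timedelta, timezone

def find_skewed_hosts(reference, readings, tolerance=timedelta(minutes=5)):
    """Return {host: skew} for hosts whose clock differs from the
    reference time by more than the tolerance."""
    skewed = {}
    for host, reading in readings.items():
        skew = reading - reference
        if abs(skew) > tolerance:
            skewed[host] = skew
    return skewed

# Hypothetical readings, all assumed to be taken at the same instant.
reference = datetime(2013, 3, 12, 9, 0, tzinfo=timezone.utc)
readings = {
    "esx-01": reference + timedelta(seconds=2),
    "esx-02": reference + timedelta(hours=1),   # the host with stale DNS/NTP
    "esx-03": reference - timedelta(seconds=1),
}

skewed = find_skewed_hosts(reference, readings)
for host, skew in skewed.items():
    print(f"{host} is off by {skew}")
```

With these sample readings only esx-02 is reported, since it is a full hour ahead of the reference.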

mcharris1520
Contributor

Howdy,

This confirms our suspicion and tests. We just had the same issue crop up and solved it. Friday 3/8/2013: everything running great. Monday 3/11/2013: all hell broke loose. It turns out our master image was built BEFORE the daylight saving time change. So when our VMs refresh on logout, the clock doesn't adjust for the DST change until network connectivity is restored. Apparently this happens AFTER the agent has talked with the connection server. We have changed our VM Tools to correct this issue. For an immediate fix, just boot up your master, make sure all the time settings are correct, shut the VM down, take a new snapshot, and recompose. Don't let this cause the downtime that it did for us.
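
As a sanity check before taking the new snapshot, something like the sketch below could tell a DST-style error (off by about one hour) apart from ordinary drift. This is only an illustration: the UTC offset, the sample times, and the tolerance are assumptions, and you would need to supply the master's wall-clock reading and a trusted UTC time yourself.

```python
from datetime import datetime, timedelta, timezone

def dst_mismatch(guest_local, true_utc, expected_offset,
                 tolerance=timedelta(minutes=5)):
    """Return True if the guest's wall clock is off from the expected
    local time by roughly one hour, which suggests a missed DST change."""
    # Expected local wall-clock time, as a naive datetime.
    expected_local = (true_utc + expected_offset).replace(tzinfo=None)
    error = guest_local - expected_local
    return abs(abs(error) - timedelta(hours=1)) <= tolerance

# Hypothetical values: a site at UTC-5 after the DST change.
true_utc = datetime(2013, 3, 11, 14, 0, tzinfo=timezone.utc)
offset = timedelta(hours=-5)

stale_master = datetime(2013, 3, 11, 8, 0)       # still on the old offset
fixed_master = datetime(2013, 3, 11, 9, 0, 30)   # corrected, 30 s of drift

print(dst_mismatch(stale_master, true_utc, offset))
print(dst_mismatch(fixed_master, true_utc, offset))
```

The first call flags the stale image and the second does not, because 30 seconds of drift is nowhere near a one-hour error.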

-Mike

jvasquezmsac
Contributor

Yeah, I think that is what it was. I recomposed the main pool over the weekend (before the daylight saving change), and it was the pool that had the most problems. When I came in on Tuesday, there were 49 VMs, all from the main pool, stuck in startup mode. I updated the base image again (and did a staggered reboot of all the servers for good measure), recomposed the pool, and so far there are no problems. There are still problems with the other images, but those are easier to manage since there is significant downtime available to recompose them.

Normally a recompose happens in the middle of the night, but lately the recomposes have been failing. They show as finished, but only part of the pool is updated. I suspect the culprit is the same time difference on that one server. I will check tonight when I recompose the second pool that is showing the startup problem.
