jmatz135
Hot Shot

This happens to us as well, probably at about the same rate as you. As for Ray's response: in our environments (we have two) this should most definitely not be a congestion issue with lots of VMs trying to check in. We have about 100 total users in one environment and two load-balanced App Volumes Managers. If two servers can't handle 100 machines, there is a major issue with the platform. Regardless, it isn't even that many VMs, as we use instant clones that spin up when necessary, so we have at most about 8 machines spinning up in any given 30-second window. So basically 2 App Volumes Managers can't handle 8 machines starting up in 30 seconds.
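To put a number on that load, here is a rough back-of-the-envelope sketch. It assumes the load balancer splits startups evenly across the managers, which is an assumption and not something we measured; the function name is just for illustration.

```python
def startups_per_manager_per_second(machines, window_s, managers):
    """Average startup rate each manager sees.

    Assumes an even split across managers (hypothetical; real
    load balancing may be uneven).
    """
    return machines / window_s / managers


# Figures from our environment: 8 instant clones per 30 seconds, 2 managers.
rate = startups_per_manager_per_second(machines=8, window_s=30, managers=2)
print(f"{rate:.3f} startups per second per manager")
```

Under that assumption each manager sees roughly one startup every 7 to 8 seconds at peak.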

Here is the interesting thing. We set up one AppStack to be assigned to the computers themselves rather than at login. The AppStack assigned to the computer will actually attach to the VM even though the agent itself says that it failed to contact the manager at computer startup. How this is even possible is anyone's guess. We assigned the AppStack to the computer instead of at login just so we might be able to see which machines didn't contact the manager. This clearly didn't work, though, as the AppStack still attaches at computer startup even though the agent failed to connect to the manager.

One other thing we did was set the MaxDelayTimeoutS registry value to 600, which seems to help a bit. At startup the agent says it is retrying every 5 seconds over 120 seconds to reconnect to the manager, but if you check the network traffic it doesn't actually do that. It only checks once. The timeout is then hit at 120 seconds and it reports a failure. Oddly, if you set the timeout above 305 seconds it will check a second time at 300 seconds, plus or minus 5 seconds. So at least you get the agent to do 2 whole checks to see if the manager is there, and 2 is twice the opportunity for the agent to actually work.
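The difference between what the agent claims and what we saw on the wire can be sketched like this. It is only a model of our own observations, not the agent's actual logic: the function names are made up, the 305-second threshold and the 300-second second check come from our network traces, and the "advertised" schedule assumes a check at every 5-second mark including both ends.

```python
def advertised_checks(timeout_s=120, interval_s=5):
    """Check times (seconds) the agent log implies: one every interval_s.

    Assumption: checks land at 0, 5, 10, ... up to and including timeout_s.
    """
    return list(range(0, timeout_s + 1, interval_s))


def observed_checks(timeout_s):
    """Check times (seconds) we actually saw in network traffic.

    Model of our observations only: one check at startup, plus a second
    check at roughly 300 s if the timeout is above 305 s. We did not
    test whether larger timeouts produce further checks.
    """
    checks = [0]
    if timeout_s > 305:
        checks.append(300)  # observed at 300 s, plus or minus 5 s
    return checks


print(len(advertised_checks()))  # what the log suggests at the 120 s timeout
print(observed_checks(120))      # what we saw at the 120 s timeout
print(observed_checks(600))      # what we saw with MaxDelayTimeoutS = 600
```

So with the 120-second timeout the agent gets a single attempt, and with 600 it gets two.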
