VMware Horizon Community
jmatz135
Hot Shot

App Volumes Agent Fails to Connect to Managers

About 5% of the time, the App Volumes agent on a machine simply fails to connect to the App Volumes Manager. I get the error below, but it doesn't make sense: the machine clearly has network access, since it becomes available in Horizon View and users can log into it via View without trouble. They just don't get their AppStacks, because the agent has decided not to connect to the manager for whatever reason. Once in the desktop you can even browse to the App Volumes Manager website with no errors, so it isn't as if the manager itself is unreachable.

The funny thing is that the log says it waits 5 minutes, but I'm almost positive it doesn't actually do that and fails out in far less time.

Even more interesting: now that the agent supports both computer-assigned and user-assigned AppStacks, we have one AppStack assigned to the computer, and the computer still gets that AppStack AND STILL FAILS to connect to the manager. How is that even possible? How can an agent that claims to have failed to connect to the manager still get an AppStack applied to it? Of course, when a user logs in they don't get their user-assigned AppStacks, so...

[2018-04-07 00:05:11.390 UTC] [svservice:P868:T964] [0] Connecting to appvols.manager.local:443 using HTTPS (attempt 1)

[2018-04-07 00:05:11.390 UTC] [svservice:P868:T964] WinHttpSendRequestWithSSLCertValidation: SSL certificate validation is disabled.

[2018-04-07 00:05:11.390 UTC] [svservice:P868:T964] WinHttpSendRequestWithSSLCertValidation: WinHttpSetOption(WINHTTP_OPTION_SECURITY_FLAGS) succeeded.

[2018-04-07 00:10:11.386 UTC] [svservice:P868:T964] WinHttpSendRequestWithSSLCertValidation: WinHttpSendRequest succeeded.

[2018-04-07 00:10:11.386 UTC] [svservice:P868:T964] WinHttpReceiveResponse timed out waiting for response

[2018-04-07 00:10:11.386 UTC] [svservice:P868:T964] Retrying in 5 seconds (waited 300 seconds out of 120 max)

[2018-04-07 00:10:16.386 UTC] [svservice:P868:T964] Aborting HTTP request after exceeding time limit (120 seconds)

[2018-04-07 00:10:16.386 UTC] [svservice:P868:T964] HttpComputerStartupThread Pre-startup over HTTP failed: error 1000

[2018-04-07 00:10:16.386 UTC] [svservice:P868:T964] HttpComputerStartupThread: failed (computer startup)

[2018-04-07 00:10:43.730 UTC] [svservice:P868:T872] Received SERVICE_CONTROL_INTERROGATE

Anyone have any ideas as to a fix?  As you can see in the logs certificate validation is already off since that causes all kinds of issues too, just apparently not this one.  I have also made sure the network cards can't be put to sleep via power management.  Our desktops are instant clone desktops that are refreshed every log off.
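
One way to check the suspicion that the agent gives up well before 5 minutes is to measure the gaps between consecutive log timestamps. The following is a minimal sketch, assuming the `[YYYY-MM-DD HH:MM:SS.mmm UTC] [source] message` layout seen in the excerpt above; `parse_line` and `find_gaps` are hypothetical helper names, not part of App Volumes.

```python
import re
from datetime import datetime

# Assumed line layout: [timestamp UTC] [process:pid:thread] message
LINE_RE = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) UTC\] \[([^\]]+)\] (.*)$"
)

# Lines copied from the log excerpt in the post.
SAMPLE = """\
[2018-04-07 00:05:11.390 UTC] [svservice:P868:T964] [0] Connecting to appvols.manager.local:443 using HTTPS (attempt 1)
[2018-04-07 00:05:11.390 UTC] [svservice:P868:T964] WinHttpSendRequestWithSSLCertValidation: WinHttpSetOption(WINHTTP_OPTION_SECURITY_FLAGS) succeeded.
[2018-04-07 00:10:11.386 UTC] [svservice:P868:T964] WinHttpSendRequestWithSSLCertValidation: WinHttpSendRequest succeeded.
[2018-04-07 00:10:11.386 UTC] [svservice:P868:T964] WinHttpReceiveResponse timed out waiting for response
[2018-04-07 00:10:16.386 UTC] [svservice:P868:T964] Aborting HTTP request after exceeding time limit (120 seconds)
"""


def parse_line(line):
    """Return (timestamp, source, message), or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
    return stamp, match.group(2), match.group(3)


def find_gaps(lines, threshold_seconds=30.0):
    """Return (seconds, message_before, message_after) for every pair of
    consecutive log lines separated by at least threshold_seconds."""
    parsed = [p for p in (parse_line(l) for l in lines) if p is not None]
    gaps = []
    for (t0, _, msg0), (t1, _, msg1) in zip(parsed, parsed[1:]):
        delta = (t1 - t0).total_seconds()
        if delta >= threshold_seconds:
            gaps.append((delta, msg0, msg1))
    return gaps


if __name__ == "__main__":
    for delta, before, after in find_gaps(SAMPLE.splitlines()):
        print(f"{delta:.3f}s between:\n  {before}\n  {after}")
```

Run against the lines above, this reports a single gap of roughly 300 seconds, between the `WinHttpSetOption` line and the `WinHttpSendRequest succeeded` line. That suggests the request did hang for the full 5 minutes in this instance, even though the configured limit in the log is 120 seconds.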

14 Replies
BC559
Enthusiast

It looks like you are using :443 without cert validation

Have you tried switching the AppVolumes Agent to use :80 since you are not using cert validation anyway?

jmatz135
Hot Shot

No, I haven't.  I would like to use 443 with cert validation but that works even worse so I can't at the moment.  Plus 443 without cert validation is still better than 80 as at least it is encrypted, just the certificate isn't validated so it could in theory be spoofed.
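
To illustrate the distinction drawn here: turning off certificate validation removes the identity check on the server but leaves the TLS encryption in place. The agent itself uses WinHTTP, so the Python `ssl` sketch below is only an analogy, and `make_contexts` is a hypothetical name.

```python
import ssl


def make_contexts():
    """Build two client-side TLS contexts: one that validates the server
    certificate and one that does not. Both still negotiate encryption."""
    validated = ssl.create_default_context()

    unvalidated = ssl.create_default_context()
    # check_hostname must be disabled before verify_mode can be CERT_NONE.
    unvalidated.check_hostname = False
    unvalidated.verify_mode = ssl.CERT_NONE

    return validated, unvalidated


if __name__ == "__main__":
    validated, unvalidated = make_contexts()
    print("validated:  ", validated.verify_mode, validated.check_hostname)
    print("unvalidated:", unvalidated.verify_mode, unvalidated.check_hostname)
```

A connection made with the second context is still encrypted on the wire, but it will accept any certificate, which is the spoofing risk mentioned above.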

techguy129
Expert

Are your app volume managers behind a load balancer?

jmatz135
Hot Shot

Yes, our app volumes managers are behind an F5 load balancer.

techguy129
Expert

May I suggest adding the individual manager servers as additional sources for the agent:

VMware Knowledge Base

AppVolumes: Add Multiple App Volume Manager addresses to the App Volumes Agent - The SLOG – SimonLon...

We had a similar issue. Adding the above to your clients as well as updating to the latest version of app volumes solved our issue.

jmatz135
Hot Shot

We tried that and it didn't help as all that really does is make the agent "choose" one in the list.  It then sticks to that one and ignores the others.

In fact it caused other odd issues: the agent would simply delete the Manager1 entry, then report that it had no manager assigned and not even try to connect, even though Manager2 and Manager3 still existed.

Ray_handels
Virtuoso

Just out of curiosity, have you checked the manager log file during the same time the agent tried to connect? Do you see info in the manager log? Do you even see any information in the manager log during this timeframe?

I don't have any info on your environment, but it could be that this happens during a boot storm and you might need some more power to service the incoming requests.

jmatz135
Hot Shot

Yes, the manager does see the agent.  That is how the computer-assigned AppStacks get applied.  Then it just stops communicating.  Also, this happens all of the time, not only under load, unless you mean that 5 machines booting up spread over two different App Volumes Managers is too much for the managers to handle.

JSFunk99
Contributor

Has this been resolved?

We had a similar issue but not sure if it's the same thing.  After months of back and forth, design had us make some changes to some files. 

How many VMs are communicating with the App Volumes?

How many App Volumes servers are you using?

MattVicos
Enthusiast

I'm seeing the same issue, did you get this resolved?

jmatz135
Hot Shot

No, this has not been fully resolved.  The agent still fails to communicate properly with the App Volumes Managers a small percentage of the time, and those users do not get their AppStacks.  I have, though, done some things to reduce how often this occurs: on the App Volumes Manager, in the Machine Managers tab, turn off Mount ESXi, Mount Local and Mount Queue.

Apparently all optimizations to actually make app volumes work faster just make it not work.

MattVicos
Enthusiast

Strange... I'm getting this issue on a clean Windows VM I want to use for capturing AppStacks.

WinHttpReceiveResponse timed out waiting for response

jmatz135
Hot Shot

If you are getting it on a clean machine that you are using to capture it is probably because:

1. The server name is wrong

2. The port is wrong (443 vs 80)

3. You are using 443 and the machine doesn't have a valid SSL certificate chain to trust the server
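
The three causes above can be checked from the capture machine one at a time. This is a rough sketch using only the Python standard library, assuming Python is available on the machine; `resolves`, `port_open`, `cert_trusted` and `diagnose` are hypothetical helper names, and the host name is the one from the log earlier in the thread, so substitute your own.

```python
import socket
import ssl


def resolves(host):
    """Cause 1: does the configured server name resolve at all?"""
    try:
        socket.getaddrinfo(host, None)
        return True
    except socket.gaierror:
        return False


def port_open(host, port, timeout=5.0):
    """Cause 2: is anything accepting TCP connections on this port?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def cert_trusted(host, port=443, timeout=5.0):
    """Cause 3: does a TLS handshake succeed with full validation,
    using the trust store that Python sees on this machine?"""
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return True
    except (ssl.SSLError, OSError):
        return False


def diagnose(host):
    """Run the checks in order and return a dict of results."""
    return {
        "resolves": resolves(host),
        "port_80_open": port_open(host, 80),
        "port_443_open": port_open(host, 443),
        "cert_trusted_on_443": cert_trusted(host, 443),
    }


if __name__ == "__main__":
    # Host name taken from the log in the original post; replace as needed.
    for check, ok in diagnose("appvols.manager.local").items():
        print(f"{check}: {'ok' if ok else 'FAILED'}")
```

Note that the trust store Python uses may differ from the one the agent's WinHTTP stack uses, so a passing `cert_trusted` result is a hint rather than proof that the agent will accept the certificate.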
