VMware Horizon Community
PeteJohns
Enthusiast

AppVols upgrade to 2.10 "NTLM Authentication failed for: Domain\user. Virtualization is disabled"

Hi,

Last night we upgraded from 2.9 to 2.10. We tested as much as we could, were happy with the results, and went live. This morning users (myself included!) are reporting that the error below appears at login and no stacks get attached. Logging out and back in again occasionally fixes the issue, but at the moment it is not consistent at all. We are very close to rolling back, but I'd rather go forwards!

Error:

App Volumes Message

Error from Manager <dns name> (error code 401):

NTLM Authentication failed for: <Domain\user>

Virtualization is disabled

Horizon View 6.2.

5 domain controllers on Server 2012 R2

ESXi 6

Any ideas?

48 Replies
Lieven
Hot Shot

VMware support told me to downgrade the App Volumes Agent to version 2.9. They did not tell me about the DEMO=1 setting.

Can somebody from VMware comment on the two proposed solutions:

1. DEMO=1 in C:\Program Files (x86)\CloudVolumes\Manager\svmanager_server.bat

2. Downgrade the App Volumes Agent to version 2.9

It is strange that VMware support gives different suggestions for the same problem.

For reference, my ticket number was SR #15836351012

MSMorgan12345
Contributor

I actually tried the DEMO fix and still saw the issue afterward, though it only happens in about 1 in 20 logins. Just a heads up. I talked to our VMware rep and he said that 3.0 is getting released this month, so it probably won't be worth downgrading.

Bleeder
Hot Shot

Isn't there still a question whether 3.0 is just a feature release and won't fix this bug?

joshopper
Hot Shot

I was also told to expect a 2.10.1 release which would address this issue.

epa80
Hot Shot

This is something we're experiencing as well, and it seems completely sporadic. Luckily we're only in a POC, but the sporadic nature makes it hard to pin down. The fixes posted in this thread seem just as hit or miss, so we haven't tried any of them. I'm hoping to hear something about a permanent fix this week; otherwise we'll just roll back to 2.9. I'd rather keep it vanilla until we find the long-term fix.

epa80
Hot Shot

I wanted to post about a few things we tried that SEEM to have resolved our issue. We did these 2 things and have not had the error on about 40 consecutive logins. I spoke with VMware support and they recommended the DEMO setting be set. We were going to go ahead and do it, when we decided to do these "maintenance" steps listed below. Once done, we've been pretty solidly set.

  1. Installed 90 Critical OS Patches on the Managers.
    • We had our AppVolumes instance spun up by VMware professional services a few months ago. We were lax in getting these things patched and on a regular schedule. That has been resolved as of yesterday as they are now on our monthly patching plan, but in the meantime, I figured I'd post a screenshot of the 90 we installed. I don't know how helpful that is for anyone to weed through 90 patches and see if any possibly made a difference, but, it is there for anyone to look.
  2. Statically set our WINS servers on the Managers.
    • We know that typically WINS isn't important, and from speaking to other AppVolumes customers, by and large it seems that AV does most of its talking via DNS. Still, not all environments are the same, and WINS perhaps is utilized at our site differently than it is elsewhere. Again, I don't know if this was what did it at all, but this was a step we performed on all 4 managers. My guess is it was just an oversight by professional services, as WINS may or may not be needed by the book for AppVolumes. Possibly just moot. As these are servers, and IP/Gateway/Subnet/DNS are all set statically, it just made sense to set WINS as that is our standard anyway.

As I said, after doing this, we haven't seen the NTLM error 401 once. Would I be surprised to see it again? No, not at all. I am happy though we were able to do it without modifying the .bat file, as benign as that is. If it ends up coming back, we'll just move on to that step.
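A quick sanity check on how much weight 40 clean logins can carry. This is only a sketch: it borrows the "1 in 20" failure rate mentioned earlier in the thread (a different environment) and assumes each login fails independently, neither of which is established for our site. The function name is made up for illustration.

```python
def chance_of_clean_run(failure_rate, logins):
    """Probability that `logins` independent logins all succeed."""
    return (1.0 - failure_rate) ** logins


# Assumed inputs: 1-in-20 failure rate (from another poster), 40 logins (ours).
p = chance_of_clean_run(1 / 20, 40)
print(f"Chance of 40 clean logins with the bug still present: {p:.1%}")
```

Under those assumptions the chance comes out at roughly 13%, so a run of 40 good logins is encouraging but does not prove the problem is gone.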

Again see attached for the patches we installed. Maybe 1 will click with someone else experiencing the issue. Apologies for 3 screenshots, I couldn't get the entire list of 90 into one sheet.

Edit: to clarify, our architecture:

We have 4 Managers in each of our 2 Data Centers (8 Managers in total).

Each side is behind a VIP, i.e. DC1-APPV has all 4 DC1 managers behind it, and DC2-APPV has all 4 DC2 managers behind it.

When installing the agent on our parent image, we just point it to that side's respective VIP.

In each DC, we have 1 domain controller, so in regards to latency/replication, the VMs should never be leaving the DC to talk to their controller. Ditto the managers.

Edit: my co-worker provided me with a spreadsheet of the patches installed, might be easier to view than those screenshots.

K_Miller
Enthusiast

We just started experiencing this issue on June 20th. It appears to have been resolved at this point. In the App Volumes Manager I explicitly set the domain controller host name to one specific domain controller. Leaving this setting blank did not work, and selecting the primary domain controller, which holds all of our roles, did not work either.

cyberfed2727
Enthusiast

FWIW this is now actually in the documentation for 2.11 with steps on how to disable NTLM for App Vols.

From page 18:

Disable NTLM Authentication

NTLM authentication is used to verify the user, computer, and the domain of the agent when it makes HTTP requests to the App Volumes Manager.

You can allow or stop the HTTP request from proceeding by defining a system environment variable.

Procedure

1. On the App Volumes Manager machine, open Windows Explorer.
2. Right-click My Computer.
3. Click Properties > Advanced System Settings > Environment Variables.
4. In the System Variables panel, click New. The New System Variable window appears.
5. Enter AVM_NTLM_DISABLED in the Variable name text box.
6. Enter 1 in the Variable value text box.
7. Restart the App Volumes Manager service.

This disables the NTLM authentication.
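If you would rather script the procedure above than click through it on every Manager, something like the following might work. It is a sketch, not anything from the VMware docs: it uses the standard Windows `setx ... /M` command to create the system variable and `net stop` / `net start` to restart the service. The service name is passed in because I don't know what the Manager service is called on your install (check services.msc), and the helper names `build_commands` and `apply` are made up.

```python
import subprocess


def build_commands(service_name, variable="AVM_NTLM_DISABLED", value="1"):
    """Return the command lines that mirror the documented procedure."""
    return [
        # setx with /M writes a machine-wide (system) environment variable.
        ["setx", variable, value, "/M"],
        # Restart the Manager service so it picks up the new variable.
        ["net", "stop", service_name],
        ["net", "start", service_name],
    ]


def apply(service_name, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in build_commands(service_name):
        print(subprocess.list2cmdline(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
```

Calling `apply("<your service name>")` only prints the three commands. Pass `dry_run=False` from an elevated prompt on the Manager to actually run them.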

I'm going to go out on a limb and say that this config probably holds true for 2.10 and older installs as well. I'm speculating.

In 2.10, pointing to a DC seemed to resolve the NTLM issue for all but one of our users. I don't like this method though: if that DC is unavailable, App Vols is going to freak out.

Best of luck.

hockeyguyin714
VMware Employee

It might be helpful to turn on debug logging on the App Volumes Manager to see which Active Directory server was used to authenticate the user. App Volumes logs the LDAP failure code, which should give you a key indication of why it failed, timed out, or whatever the issue was. http://kb.vmware.com/kb/2101668 You will want to look at production.log on the App Volumes Manager to see why the user failed to log in. Make sure to turn debug logging off after you figure out the possible cause.
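To save scrolling through production.log by hand, a small filter along these lines could help. It is only a sketch: the marker string is the text from the agent's error dialog earlier in this thread, and the Manager log may word the failure differently, so treat `MARKER` as an assumption to adjust. The function names are made up for illustration.

```python
import re
from pathlib import Path

# Assumed marker: the text the agent shows in its error dialog.
# The Manager's production.log may word the failure differently.
MARKER = "NTLM Authentication failed for:"


def failed_accounts(lines, marker=MARKER):
    """Count, per account, the lines that contain the marker."""
    pattern = re.compile(re.escape(marker) + r"\s*(\S+)", re.IGNORECASE)
    counts = {}
    for line in lines:
        match = pattern.search(line)
        if match:
            account = match.group(1)
            counts[account] = counts.get(account, 0) + 1
    return counts


def scan(log_path):
    """Read a log file and return the per-account failure counts."""
    with Path(log_path).open(encoding="utf-8", errors="replace") as handle:
        return failed_accounts(handle)
```

`scan()` takes whatever path production.log has on your Manager. The result is a dictionary of account name to number of matching lines, which makes it easy to see whether failures cluster on a few users.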

App Volumes Manager will talk to the DC that first responds to the query for that domain. This can be controlled and limited by Active Directory Sites and Services to further improve the chances of success.
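To get a rough feel for which DC is likely to answer first from a given Manager, you could time a plain TCP connection to the LDAP port (389) on each one. This is a sketch and only a proxy: it is not how App Volumes itself locates a DC, and the host names you pass in and the function names are placeholders.

```python
import socket
import time


def tcp_connect_seconds(host, port=389, timeout=2.0):
    """Return seconds taken to open a TCP connection, or None on failure."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        return None


def rank_by_response(hosts, probe=tcp_connect_seconds):
    """Return (host, seconds) pairs for reachable hosts, fastest first."""
    timings = [(host, probe(host)) for host in hosts]
    reachable = [(h, t) for h, t in timings if t is not None]
    return sorted(reachable, key=lambda pair: pair[1])
```

Run `rank_by_response([...])` with your own DC host names from each Manager. If a slow or remote DC regularly comes out on top, that may be worth following up in Sites and Services.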
