PeteJohns
Enthusiast

AppVols upgrade to 2.10 "NTLM Authentication failed for: Domain\user. Virtualization is disabled"

Hi,

Last night we upgraded from 2.9 to 2.10. We tested as much as we could, were happy with the results, and went live. This morning users (myself included!) are reporting that the error below appears at login and no stacks get attached. Logging out and back in again occasionally fixes the issue, but at the moment it is not consistent at all. I'm very close to rolling back, but I'd rather go forwards!

Error:

App Volumes Message

Error from Manager <dns name> (error code 401):

NTLM Authentication failed for: <Domain\user>

Virtualization is disabled

Horizon View 6.2.

5 domain controllers on Server 2012 R2

ESXi 6

Any ideas?

48 Replies
CPhelix
Enthusiast

Hi PeteJohns!

I actually got this error on 2 of my capture machines after upgrading from 2.9 --> 2.10 yesterday. I found that something odd was going on with communication between the capture machines and the domain, possibly caused by doing hundreds of snapshot restores. Removing the machines from the domain and re-joining them cleared the error.

I haven't gotten the error on any user desktops though. (Not sure what you use, but I use linked clones)

Hope this helps in some way.

--Chris

OFish
Contributor

Are you by chance using the trusted domains feature?

dkirchberger
Contributor

Hi,

Same error here with AppVolumes 2.10 in a clean environment.

The system log is full of "NTLM Authentication Invalid" errors.

Nothing seems to be wrong with the configuration:

-Two AppVolumes managers with one database

-SVService with two Manager registry keys

-VMs (pool) were already redeployed

Other ideas?

hockeyguyin714
VMware Employee

Typically this occurs if you have time skew between machines and the AD servers.

Verify that all your hosts and AD servers are set to the same time.

Check the Windows Event logs for errors related to time or domain controller access.

nuberaldhoore
Enthusiast

I have the same error here after upgrading from 2.7 to 2.10.

VDIs that are located in the same site as the AppVolumes servers work well; VDIs that are located in another site (and thus using another DC) are having the problem.

nuberaldhoore
Enthusiast

I have been looking into this problem for hours and hours and hours and cannot figure out what is wrong:

When I create a linked clone Horizon View pool, I see that the System Messages in the AppVolumes GUI contain, for each provisioned machine, an entry "NTLM authentication failed: Authentication failed for: XXX" (see attached screenshot AppVol_01).

When I log in to a VDI, I see the error message "Error for Manager x.x.x.x(error code 401): NTLM Authentication Invalid: Authentication failed for: XXX. Virtualization disabled" (see attached screenshot AppVol_02).

I am also attaching the agent log (svservice.txt) and the Appvolumes Manager log (production.txt).

Description of the Environment

I am using ESX 5.5 (build 1331820), vCenter 5.5 (build 1750787), Horizon View version 6.0.0 (build 1884746), AppVolumes Manager version 2.10, AppVolumes Agent version 2.10, Windows 7 SP1 VDIs.

vCenter and AppVolumes managers are located in DataCenter11, VDIs are running in DataCenter50

2 AppVolumes Managers have been configured

I added the 2nd AppVolumes manager in the registry key HKLM\SYSTEM\CurrentControlSet\Services\svservice\Parameters

The AppVolumes Managers have been configured in the AppVolumes agent with their IP address to avoid possible DNS issues.

Appstacks are available on Datastores in both DataCenter11 and DataCenter50 (An AppVolumes storage group replicates from DataCenter11 to DataCenter50)

Appstacks and VDIs are sharing the same LUN (The AppVolumes option "Mount local copies of volumes" has been enabled)

If somebody has an idea on how to resolve this error I would be very happy.

Jason_Marshall Ray_handels

Prime_ID
Contributor

1.UP

We were hoping not to have to open a support case. Has anyone found a solution here?

nuberaldhoore
Enthusiast

I spent all day today, and a good part of the week, trying different parameters in App Volumes, checking for DNS issues, checking for time sync issues, reinstalling the App Volumes agent, ... but nothing worked.

I was also hoping not to have to open an SR, as I do not have good experience with VMware support and AppVolumes (I have had a support request open since January, and two other ones were closed without a solution).

I have contacted two people I know at VMware directly, but I have not heard back from them yet.

I HAVE to solve this before the end of the year or come up with another solution for the customer, as this issue is blocking the entire project.

FROGGY_VMware
VMware Employee

Have you tried removing a single machine from the domain and re-joining it as a test?

nuberaldhoore
Enthusiast

Yes, I did that test as well, but no luck.

FROGGY_VMware
VMware Employee

I noticed in your svservice log "NetGetJoinInformation() success, domain name WORKGROUP and type is 2"

Type 2 = workgroup and type 3 = domain joined, so what account is running the App Volumes service on the machine?
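
For anyone checking their own svservice logs for the same symptom, here is a minimal Python sketch that decodes the "type" number. The numeric values are those of the Windows `NETSETUP_JOIN_STATUS` enumeration returned by `NetGetJoinInformation`; the regular expression is an assumption based only on the single log line quoted above, and the helper names (`parse_join_line`, `is_domain_joined`) are made up for illustration.

```python
import re
from enum import IntEnum
from typing import Optional, Tuple


class JoinStatus(IntEnum):
    """Values of the Windows NETSETUP_JOIN_STATUS enumeration."""
    UNKNOWN = 0
    UNJOINED = 1
    WORKGROUP = 2
    DOMAIN = 3


# Assumption: the log format matches the one line quoted in this thread.
_JOIN_LINE = re.compile(
    r"NetGetJoinInformation\(\) success, domain name (\S+) and type is (\d+)"
)


def parse_join_line(line: str) -> Optional[Tuple[str, JoinStatus]]:
    """Return (name, status) for a join-information log line, else None."""
    match = _JOIN_LINE.search(line)
    if match is None:
        return None
    return match.group(1), JoinStatus(int(match.group(2)))


def is_domain_joined(line: str) -> bool:
    """True only if the line reports a domain join (type 3)."""
    parsed = parse_join_line(line)
    return parsed is not None and parsed[1] is JoinStatus.DOMAIN
```

Feeding it the quoted line gives `("WORKGROUP", JoinStatus.WORKGROUP)`, i.e. the agent saw the machine as not domain joined at the moment that line was written.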

nuberaldhoore
Enthusiast

The App Volume Service is running as Local System.

At first my golden image was not joined to the domain. The composed VDIs however are joined to the domain.

Afterwards, I also joined my golden image to the domain, but it made no difference.

nuberaldhoore
Enthusiast

Yesterday evening I opened a VMware Support call for this issue (SR # 15836351012).

I was contacted by VMware and sent them the attached logfiles.

Hopefully they can come back soon with an explanation/solution.

hahaU812
Contributor

Hi nuberaldhoore,

I too am having the same issue. Would be great to hear if you managed to find out what it was and how to fix it.

Matt

nuberaldhoore
Enthusiast

VMware support suggested that I follow the instructions in the VMware KB article "Synchronizing ESXi/ESX time with a Microsoft Domain Controller".

I am however reluctant to make these changes, as I believe there is no time sync issue in the environment I am working on. Moreover, this type of change might negatively impact other production systems, and one should be very careful implementing it.

The time sync process in the environment I am working in is as follows:

1. Windows clients sync with any domain controller in the domain

2. Member domain controllers sync their time with the PDC emulator

3. The PDC emulator syncs with dedicated NTP servers

4. The dedicated NTP servers sync with external sources

Authentication errors due to time issues also typically only occur when the time skew exceeds a few minutes (I believe the default tolerance is 5 minutes).

I am now waiting for further instructions from VMware.
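
To rule time skew in or out with numbers rather than by eye, a comparison along the following lines could be used. This is only a sketch: the 5-minute tolerance is the figure mentioned above (an assumption, not a value confirmed for App Volumes), the function names are hypothetical, and collecting the two timestamps from the client and the domain controller is left out.

```python
from datetime import datetime, timedelta, timezone

# Assumption: 5 minutes, the default tolerance mentioned in the post above.
DEFAULT_TOLERANCE = timedelta(minutes=5)


def clock_skew(client: datetime, domain_controller: datetime) -> timedelta:
    """Absolute difference between two timezone-aware timestamps."""
    return abs(client - domain_controller)


def skew_is_acceptable(
    client: datetime,
    domain_controller: datetime,
    tolerance: timedelta = DEFAULT_TOLERANCE,
) -> bool:
    """True if the two clocks differ by no more than the tolerance."""
    return clock_skew(client, domain_controller) <= tolerance


# Example with made-up timestamps: the client is 90 seconds behind the DC.
dc_time = datetime(2015, 12, 10, 9, 0, 0, tzinfo=timezone.utc)
client_time = dc_time - timedelta(seconds=90)
within_tolerance = skew_is_acceptable(client_time, dc_time)
```

With a skew of 90 seconds the check passes, which is the situation described here: clocks that are slightly off but well inside the tolerance should not by themselves explain the failures.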

nuberaldhoore
Enthusiast

VMware support told me that version 2.10 introduces a new NTLM check for the computer account and the user, and they think this is causing the errors.

The reasoning is as follows: a computer account is created and DC1 knows about it, but if, for any number of reasons (too long to discuss in this thread), the AppVolumes Manager asks DC2 about that computer account, the check fails when DC2 is not yet aware of the new computer. The AppVolumes Agent then disables virtualization for that computer, because the AppVolumes Manager has told it that it is not a valid computer.

Support asked me to test whether specifying a DC directly resolves the issue.
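
The race described above can be illustrated with a toy model. This is not App Volumes code: the `DomainController` class, the `manager_check` function and the account name `VDI-001$` are all hypothetical, and "replication" is reduced to copying a set. It only shows why the same question gets a different answer depending on which DC is asked and when.

```python
class DomainController:
    """Toy stand-in for a DC: just the set of computer accounts it knows."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.accounts = set()  # computer account names known to this DC

    def knows(self, computer: str) -> bool:
        return computer in self.accounts


def replicate(source: DomainController, target: DomainController) -> None:
    """Copy every account the source knows about to the target."""
    target.accounts |= source.accounts


def manager_check(dc: DomainController, computer: str) -> str:
    """Hypothetical Manager-side check: accept only accounts the DC knows."""
    return "attach volumes" if dc.knows(computer) else "virtualization disabled"


dc1 = DomainController("DC1")
dc2 = DomainController("DC2")
dc1.accounts.add("VDI-001$")  # the new clone joins the domain via DC1

before = manager_check(dc2, "VDI-001$")  # Manager asks DC2 before replication
replicate(dc1, dc2)
after = manager_check(dc2, "VDI-001$")   # same question after replication
```

Here `before` is "virtualization disabled" and `after` is "attach volumes", which would also fit the observation that logging out and back in sometimes fixes the problem.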


I have tested the scenario where VDIs/users located in DC2 will authenticate to a domain controller in DC1 (where the Appvolumes servers are located) instead of their local domain controller. Doing this resolves the issue, but this is obviously not a solution we can permanently live with.


This is obviously a problem in any distributed environment (Centrally located AppVolumes servers and distributed AppVolumes clients).

I am waiting for instructions from VMware Support on how to proceed.

nuberaldhoore
Enthusiast

VMware support has now suggested that I downgrade the AppVolumes Agent to version 2.9, as this version of the Agent does not have the NTLM authentication check.

Raymond_W
VMware Employee

Hi,

A small update on this issue:

We are aware of this issue and it has our full attention. At the moment we are looking at multiple options to solve it. If I have more information, I'll let you know.

Temporary workarounds can be:

  1. Point all VDIs to the same DC at the main site.
  2. Utilize App Volumes Agent 2.9.0 as this will not have the NTLM authentication needed at computer startup. This would avoid the replication time needed for the VMs to replicate back to the main DC after the VM joins the domain.  *This may not resolve all issues*
  3. Reduce replication time from the sites to the main DC so that the VMs show as joined to the domain.

Kind regards,

Raymond

Twitter: @raymond_himself
BungeBash
Enthusiast

Will the 2.9 agent support Windows 10? We are seeing these issues with 2.10 when looking at the System Messages:

Jan 07 2016 11:20AM

NTLM authentication Error: Unable to contact Active Directory to authenticate xxxxxxxxxxxxxxxxxxxxxxx

On Windows 7, the authentication still works and the disk is attached even though the system claims it failed to authenticate. On Windows 10, the disk attachment fails and we get the "Virtualization is disabled" error.

Also, regarding the above, our DCs are all online and authenticating just fine with every other application/desktop.
