I am getting this message while trying to join an ESXi 6 host to a domain. I see a lot of KB articles, tips, forum entries, etc. on how to solve specific problems, but is there a recommended place to start with the log files on the host that will give me the best information to narrow it down to one of those "specific problems"?
May I know what user credential format you are using when adding the host to the domain, and whether all required ports are open in your environment?
The UPN format (user@do.main.com) and yes I know the NETBIOS-style reference (DO-MAIN\user) doesn't work.
Active Directory service is running and the firewall is in its default configuration with the "Active Directory All" item checked (88,123,137,139,389,445,464,3268,51915 outbound).
Cool. To be frank, I haven't joined my test environment to AD. Maybe I should try. Let me see if I come across similar issues.
usafseic,
I have been spending a lot of time troubleshooting this for a large customer where we have nearly 500 hosts to get joined to the domain. Here are some of the things I have had to do and check to get it working.
First of all see this article for enabling logging for the Likewise agents. These are the log files you can review, however they haven't been very helpful for me. http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=102655...
One thing I've noticed is that I am much more successful joining the domain immediately after a fresh reboot. My new process is to reboot the host, join the domain, then reboot again. If I let the host sit too long after a reboot before joining the domain, I get the same error as you do. That has solved 80% of my issues so far. It is not ideal, but it seems to work.
Check that you have both the proper host name and domain name in the host's DNS and Routing configuration. If the name is wrong or the domain is missing it has failed for me.
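The name and domain settings can also be read back from the ESXi shell rather than the client UI. This is only a minimal sketch, assuming shell or SSH access to the host; `show_dns_identity` is a hypothetical helper name, and the `esxcli` commands it wraps only display settings, they change nothing:

```shell
# Hypothetical helper: print the host name, domain, DNS search domains and
# DNS servers so they can be compared against what the AD domain expects.
show_dns_identity() {
    echo "== Host name / domain =="
    esxcli system hostname get
    echo "== DNS search domains =="
    esxcli network ip dns search list
    echo "== DNS servers =="
    esxcli network ip dns server list
}
```

Run `show_dns_identity` on the host and check that the fully qualified name and the search domain match the domain you are trying to join.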
In some rare instances I had to set a preferred domain controller because it was trying to authenticate to a DC that was in a location blocked by a firewall. In the host's advanced configuration -> UserVars you can enter the name of the preferred domain controller for the setting UserVars.ActiveDirectoryPreferredDomainControllers. This has helped on a couple of occasions where there wasn't a local functioning domain controller.
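For many hosts it may be easier to set that value from the shell. A rough sketch, assuming the UserVars setting is exposed to `esxcli` under the path `/UserVars/ActiveDirectoryPreferredDomainControllers` as a string option; `set_preferred_dc` is a hypothetical helper name and `dc01.example.com` is a placeholder, not a name from this thread:

```shell
# Hypothetical helper: point the host at a preferred domain controller.
# $1 is the DC's FQDN, e.g. dc01.example.com (placeholder value).
set_preferred_dc() {
    esxcli system settings advanced set \
        -o /UserVars/ActiveDirectoryPreferredDomainControllers \
        -s "$1"
    # Read the option back to confirm what was stored.
    esxcli system settings advanced list \
        -o /UserVars/ActiveDirectoryPreferredDomainControllers
}
```

Usage would be `set_preferred_dc dc01.example.com`, followed by another join attempt.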
Wow, thanks for that info. Here's what the log file is showing:
20150429210955:ERROR:lsass: Failed to run provider specific request (request code = 12, provider = 'lsa-activedirectory-provider') -> error = 2692, symbol = NERR_SetupNotJoined, client pid = 37545
20150429211010:ERROR:lsass: Failed to run provider specific request (request code = 12, provider = 'lsa-activedirectory-provider') -> error = 2692, symbol = NERR_SetupNotJoined, client pid = 37582
20150429211015:ERROR:lsass: Failed to run provider specific request (request code = 8, provider = 'lsa-activedirectory-provider') -> error = 40286, symbol = LW_ERROR_LDAP_SERVER_DOWN, client pid = 34807
20150429211017:ERROR:lsass: Failed to run provider specific request (request code = 12, provider = 'lsa-activedirectory-provider') -> error = 2692, symbol = NERR_SetupNotJoined, client pid = 37604
LW_ERROR_LDAP_SERVER_DOWN doesn't make much sense because it can resolve the hostname.
Expanding that specific error, I see
20150429211510:INFO:netlogon: Looking for a DC in domain 'XXX.YYY.COM', site '<null>' with flags 10
20150429211510:INFO:netlogon: Determining the current time for domain 'XXX.YYY.COM'
20150429211510:INFO:netlogon: Looking for a DC in domain 'XXX.YYY.COM', site '<null>' with flags 10
20150429211511:INFO:netlogon: Looking for a DC in domain 'XXX.YYY.COM', site '<null>' with flags 1001
20150429211511:INFO:netlogon: Filtering list of 9 servers with list of 0 black listed servers
20150429211511:INFO:netlogon: Filtering list of 5 servers with list of 0 black listed servers
20150429211515:ERROR:lsass: Failed to run provider specific request (request code = 8, provider = 'lsa-activedirectory-provider') -> error = 40286, symbol = LW_ERROR_LDAP_SERVER_DOWN, client pid = 34582
and then running as VERBOSE, there's some GSS-API error
20150429211753:INFO:netlogon: Looking for a DC in domain 'XXX.YYY.COM', site '<null>' with flags 10
20150429211753:VERBOSE:lsass: Affinitized to DC 'XXX.YYY.ZZZ.com' for join request to domain 'XXX.YYY.COM'
20150429211753:VERBOSE:lwreg: Registry::sqldb.c RegDbOpenKey() finished
20150429211753:VERBOSE:lwreg: Registry::sqldb.c SqliteGetValueAttributes_Internal() finished
20150429211753:VERBOSE:lwreg: Registry::sqldb.c RegDbOpenKey() finished
20150429211753:VERBOSE:lwreg: Registry::sqldb.c SqliteGetValueAttributes_Internal() finished
20150429211753:INFO:netlogon: Determining the current time for domain 'XXX.YYY.COM'
20150429211753:INFO:netlogon: Looking for a DC in domain 'XXX.YYY.COM', site '<null>' with flags 10
20150429211753:INFO:netlogon: Looking for a DC in domain 'XXX.YYY.COM', site '<null>' with flags 1001
20150429211753:INFO:netlogon: Filtering list of 9 servers with list of 0 black listed servers
20150429211753:INFO:netlogon: Filtering list of 5 servers with list of 0 black listed servers
20150429211753:VERBOSE:lwio: GSS-API error calling gss_init_sec_context: 1 (The routine must be called again to complete its function)
20150429211755:VERBOSE:lsass: Permission granted for (uid = 0, gid = 0, pid = 38118) to open LsaIpcServer
20150429211755:VERBOSE:lsass-ipc: (session:e8153487dc6055af-a9956bd6be374f20) Accepted association 0x3d410b40
20150429211755:VERBOSE:lwreg: Registry::sqldb.c RegDbOpenKey() finished
20150429211755:VERBOSE:lwreg: Registry::sqldb.c SqliteGetValueAttributes_Internal() finished
20150429211755:ERROR:lsass: Failed to run provider specific request (request code = 12, provider = 'lsa-activedirectory-provider') -> error = 2692, symbol = NERR_SetupNotJoined, client pid = 38118
20150429211755:VERBOSE:lsass-ipc: (assoc:0x3d410b40) Dropping: Connection closed by peer
20150429211758:ERROR:lsass: Failed to run provider specific request (request code = 8, provider = 'lsa-activedirectory-provider') -> error = 40286, symbol = LW_ERROR_LDAP_SERVER_DOWN, client pid = 34581
20150429211758:VERBOSE:lsass-ipc: (assoc:0x3d412618) Dropping: Connection closed by peer
and finally at the bottom level I see
20150429212849:DEBUG:lsass:KtLdapQuery():ktldap.c:149: Ldap error code: 4294967295
20150429212849:DEBUG:lsass:KtLdapGetBaseDnA():ktldap.c:258: Error code: 40286 (symbol: LW_ERROR_LDAP_SERVER_DOWN)
20150429212849:DEBUG:lsass:KtLdapGetBaseDnW():ktldap.c:295: Error code: 40286 (symbol: LW_ERROR_LDAP_SERVER_DOWN)
20150429212849:DEBUG:lsass:LsaSaveMachinePassword():join.c:2043: Error code: 40286 (symbol: LW_ERROR_LDAP_SERVER_DOWN)
20150429212849:DEBUG:lsass:LsaJoinDomainInternal():join.c:778: Error code: 40286 (symbol: LW_ERROR_LDAP_SERVER_DOWN)
I also tried the domainjoin-cli command, and it returns "The DC closed an LDAP connection in the middle of a query" and LW_ERROR_LDAP_CONSTRAINT_VIOLATION [code 0x00009d7b]
So I'm opening a ticket with our domain admins to see if they maybe have the object permissions messed up or to see if something is coming up on the back end.
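With this much output it can help to count which error symbols actually occur before reading line by line. A small sketch using only `sed`, `sort`, `uniq` and `awk`; `summarize_symbols` is a hypothetical helper name, and it assumes the two line shapes seen above (`symbol = NAME` and `(symbol: NAME)`):

```shell
# Hypothetical helper: count Likewise error symbols in log text.
# Reads the files given as arguments, or stdin when none are given.
# Prints one "SYMBOL COUNT" pair per line, sorted by symbol name.
summarize_symbols() {
    sed -n 's/.*symbol[ =:]*\([A-Za-z0-9_]*\).*/\1/p' "$@" |
        sort | uniq -c | awk '{ print $2, $1 }'
}
```

Fed the four ERROR lines from the first excerpt above, it prints `LW_ERROR_LDAP_SERVER_DOWN 1` and `NERR_SetupNotJoined 3`. Pass it whichever file your Likewise logging is configured to write to.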
So here's what may be the problem. VMware support asked for the Likewise and ESXi logs again, so I went back to the KB article that discusses how to set up logging for the Likewise agent (1026554) to jog my memory on getting that configured. What I found was that the article had been updated this week (6/2/15). Under the ESXi 6.0 section, there is now a new step that says, "Start the lwsmd service by running this command":
/etc/init.d/lwsmd start
I don't recall ever taking that action before when I went through this process, but lo and behold, after starting that daemon, I was able to join all the domains that had failed previously. (Judging from the messages I got when I ran the command, the daemon had not been started.) So this service condition might be a known issue that has to be corrected in a future patch, where the service should either be set to run on startup or be started and kept running whenever a domain join is requested.
Furthermore, there's another article that says that you need to set that service to start automatically using
chkconfig lwsmd on
which may have been the root cause of why it wouldn't start whenever I rebooted the host. Since that's a very low-level service, I wouldn't have had any idea whether it was running or not.
Now this whole thing might be completely off the mark and may or may not work for anyone else, but I can definitely say that the host joined after I manually ran that service startup script on the host, after not being able to join via many other troubleshooting actions.
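The manual start described above can be wrapped so the daemon is only started when it is not already up. This is a sketch under assumptions, not a tested fix: it assumes the init script accepts a `status` argument and returns a non-zero exit code when the daemon is stopped, and `ensure_lwsmd` and `LWSMD_INIT` are hypothetical names:

```shell
# Path of the init script named in the KB step; override for testing.
LWSMD_INIT="${LWSMD_INIT:-/etc/init.d/lwsmd}"

# Hypothetical helper: start lwsmd only if "status" reports it is not running.
# Assumption: "status" exits non-zero when the daemon is stopped.
ensure_lwsmd() {
    if "$LWSMD_INIT" status >/dev/null 2>&1; then
        echo "lwsmd already running"
    else
        echo "lwsmd not running; starting it"
        "$LWSMD_INIT" start
    fi
}
```

After `ensure_lwsmd` reports the daemon as running, retry the domain join; the `chkconfig lwsmd on` step is still what makes the service start at boot.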
I am currently having this same issue and am unable to resolve it using the method provided. I would gladly welcome some assistance to get this working in my nested lab setup.
We are having similar problems. I have just found this: VMware KB: Joining an ESXi 6.0 host to Active Directory fails with error: 40286 (symbol: LW_ERROR_LD...
Hi, please enter your username as "username@yrdomain.com" when joining the domain on the ESXi server. Good luck.
Hi, for anyone still having this problem (i.e. can't join an ESXi host or the VCSA to AD), this post solved it for me: "Error joining vCenter Server Appliance to Active Directory" (VCDX56).
"You need to enable SMB version 1 in Windows Server 2012/2012 R2 from the registry".
Hope this helps anyone. Was a real pain to find.
Our site attempted joining the domain after enabling SMBv2 and the process continued to fail. After reversing the process and setting SMB2Enabled to '0', the lsass service fails to start. Can anyone verify that simply changing the "1" to "0" turns it off?
Folks,
Regarding turning on SMBv1 on an Active Directory domain controller...
This is a BAD idea.
It is also not something that we recommend. The reason for it is as simple as EternalBlue.
Malware is known to use SMBv1 and Microsoft has officially recommended that it be turned off.
If you have an issue where you cannot join an ESXi host or vCenter to AD without SMBv1, please contact VMware GSS.