VMware Cloud Community
Mark_Herring
Contributor

ESXi 6 hangs when joining Active Directory domain

I have just installed ESXi 6.0.0 on an HP ProLiant DL630p Gen8 server. The install went fine until I tried to join the host to our domain. Under the Configuration tab I selected Authentication Services and then Properties. In the Directory Services Configuration window I selected Active Directory from the drop-down, entered our domain under Domain Settings, clicked Join Domain, and entered my username and password.

At this point the Recent Tasks pane shows a Join Windows Domain entry with the status "In Progress". The task just sits in this status, and the Directory Services Configuration window reverts to Local Authentication with all of the values I entered gone. The server then becomes less and less responsive until we have to reboot it.

As a test I installed ESXi 5.5 on the server, wiping out version 6. When I attempt to join this to the domain it works first time. We have checked firewall ports and everything required seems to be open. Has something changed in 6, and has anyone else come across this? Any ideas on how to solve it?

** update **

Just checked the syslog.log file and it is filling up with the following line:-

lwsmd: [lsass] Failed to run provider specific request (request code = 12, provider = 'lsa-activedirectory-provider') -> error = 2692 = NERR_SetupNotJoined, Client pid = 41380
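A quick way to see how often that line appears is to count it. The following is only a minimal sketch: the function name count_not_joined is made up for illustration, and the default path /var/log/syslog.log is an assumption about where the host keeps its syslog.

```shell
# count_not_joined [LOG_FILE]
# Prints how many lines in the log mention NERR_SetupNotJoined.
# The default log path is an assumption; pass the real path if it differs.
count_not_joined() {
    log_file="${1:-/var/log/syslog.log}"
    # grep -c prints the number of matching lines
    grep -c 'NERR_SetupNotJoined' "$log_file"
}
```

Running `count_not_joined` twice, a minute or so apart, should give a rough idea of how quickly the errors are accumulating.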

13 Replies
brunofernandez1

Have you already tried to resolve the domain from the ESXi command line?

I have also joined an ESXi 6.0 server to the domain without any problems. There wasn't anything I had to configure beforehand.

------------------------------------------------------------------------------- If you found this or any other answer helpful, please consider to award points. (use Correct or Helpful buttons) Regards from Switzerland, B. Fernandez http://vpxa.info/
Mark_Herring
Contributor

Yes, the domain name resolves from the ESXi command line. I am able to ping all of our domain controllers, and DNS is working as well.

Tredecillionair
Contributor

I am having a similar issue, except that I receive the "Errors in Active Directory Operations" message within a second of initiating the join from the vSphere 6 client (not the Web Client). I am finding odd messages in both syslog.log and hostd.log, both seeming to indicate that the join request was terminated by the domain controller. Adding to the oddity: the account for my ESXi 6 server is created in AD (if not pre-created) while the ESXi server reverts to Local Authentication. Pre-creating a computer account for the host in AD does not change the ESXi behavior. Interestingly, I am able to join ESXi 5.5 servers patched to the latest rev without any issue. Moreover, if I upgrade an ESXi 5.5 host that was joined to the domain to ESXi 6, it reverts to Local Authentication, but then I can join the upgraded server without any errors. The ESXi 5.5 servers were fully hardened to the VMware ESXi security guidance standards.

What are you seeing in your hostd.log and syslog.log files?

shahed_hasib
Enthusiast

The same issue was observed with ESXi 6.0 on HP ProLiant DL580 G8 (4-socket) servers.

marcinkloc
Contributor

Hi,

I have the same issue: I can't add an ESXi 6.0 host to Active Directory.

When I tried to restart the Active Directory service I got a connection error (the remote server took too long to respond). I checked the lwsmd process and tried to restart it from the CLI, and got other errors such as "failed to release memory reservation". There are 29 lwsmd processes (once when I checked, there were over 60).
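For anyone who wants to watch that process count over time, something like the sketch below might do. The function name count_lwsmd is hypothetical, and it simply counts lines of a process listing that mention lwsmd, so it assumes the listing prints one line per process.

```shell
# count_lwsmd
# Reads a process listing on stdin and prints how many lines mention lwsmd.
# The [l] bracket keeps a grep command line from matching its own pattern.
count_lwsmd() {
    grep -c '[l]wsmd'
}
```

On the host this would be used as `ps | count_lwsmd`; a number that keeps climbing between runs would match the behaviour described above.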

syslog.log

2015-07-19T13:01:32Z watchdog-lwsmd: [36423] Begin '/usr/lib/vmware/likewise/sbin/lwsmd ++group=likewise --syslog', min-uptime = 60, max-quick-failures = 5, max-total-failures = 1000000, bg_pid_file = '', reboot-flag = '0'
2015-07-19T13:01:33Z watchdog-lwsmd: Executing '/usr/lib/vmware/likewise/sbin/lwsmd ++group=likewise --syslog'
2015-07-19T13:01:33Z lwsmd: Logging started
2015-07-19T13:01:33Z lwsmd: Likewise Service Manager starting up
2015-07-19T13:01:33Z lwsmd: Starting service: lwreg
2015-07-19T13:01:33Z lwsmd: [lwreg-ipc] Listening on endpoint /etc/likewise/lib/.regsd
2015-07-19T13:01:33Z lwsmd: [lwreg-ipc] Listener started
2015-07-19T13:01:33Z lwsmd: [lwsm-ipc] Listening on endpoint /etc/likewise/lib/.lwsm
2015-07-19T13:01:33Z lwsmd: [lwsm-ipc] Listener started
2015-07-19T13:01:33Z lwsmd: Likewise Service Manager startup complete
2015-07-19T13:01:35Z lwsmd: Starting service: netlogon
2015-07-19T13:01:35Z lwsmd: [netlogon-ipc] Listening on endpoint /etc/likewise/lib/.netlogond
2015-07-19T13:01:35Z lwsmd: [netlogon-ipc] Listener started
2015-07-19T13:01:35Z lwsmd: Starting service: lwio
2015-07-19T13:01:35Z lwsmd: [lwio-ipc] Listening on endpoint /etc/likewise/lib/.lwiod
2015-07-19T13:01:35Z lwsmd: [lwio-ipc] Listener started
2015-07-19T13:01:35Z lwsmd: Starting service: rdr
2015-07-19T13:01:35Z lwsmd: Starting service: lsass
2015-07-19T13:01:35Z lwsmd: [lsass-ipc] Listening on endpoint /etc/likewise/lib/.ntlmd
2015-07-19T13:01:35Z lwsmd: [lsass-ipc] Listener started
2015-07-19T13:01:35Z lwsmd: [lsass] Failed to open auth provider at path '/usr/lib/vmware/likewise/lib/liblsass_auth_provider_local.so'
2015-07-19T13:01:35Z lwsmd: [lsass] /usr/lib/vmware/likewise/lib/liblsass_auth_provider_local.so: cannot open shared object file: No such file or directory
2015-07-19T13:01:35Z lwsmd: [lsass] Failed to load provider 'lsa-local-provider' from '/usr/lib/vmware/likewise/lib/liblsass_auth_provider_local.so' - error 40040 (LW_ERROR_INVALID_AUTH_PROVIDER)
2015-07-19T13:01:35Z lwsmd: [lsass-ipc] Listening on endpoint /etc/likewise/lib/.lsassd
2015-07-19T13:01:35Z lwsmd: [lsass-ipc] Listener started
2015-07-19T13:02:35Z lwsmd: [lsass] Failed to run provider specific request (request code = 12, provider = 'lsa-activedirectory-provider') -> error = 2692, symbol = NERR_SetupNotJoined, client pid = 36599

When I try to add the host to the domain with domainjoin-cli, it hangs or throws a random error:

[root@esx03:~] /usr/lib/vmware/likewise/bin/domainjoin-cli join --ou vSphere domain.wew administrator Domain.Pass
Joining to AD Domain:   domain.wew
With Computer DNS Name: esx03.domain.wew

Error: ERROR_FILE_NOT_FOUND [code 0x00000002]

Has anybody found a solution, or can anyone give me some direction for diagnosing the problem?

Craig_Baltzer
Expert

We have an open support case on this and VMware has repro'd this issue on higher socket and core count servers (we have failures on 4 socket/24 core and 4 socket/60 core servers).

It would be helpful for anyone experiencing the issue that has a support contract to open a support case with VMware on it. You can reference our support case # (15687374706) to keep these tied together in the VMware system. The more folks having the problem that are known to VMware, the higher the priority to get a fix developed.

jagdish_rana
Enthusiast

Hi There,

Could you please open a PuTTY (SSH) session to the ESXi host, run nslookup against the DNS server, and telnet to port 389 on the DC?

Thanks
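The two checks suggested here could be wrapped up roughly as follows. This is only a sketch under assumptions: check_dc is a made-up helper name, nc is used in place of telnet on the assumption that it is available in the host's shell and accepts -z and -w, and 389 is the LDAP port mentioned above.

```shell
# check_dc DC_HOST [PORT]
# Returns 0 if the name resolves and the port accepts a TCP connection,
# 1 if the DNS lookup fails, 2 if the port cannot be reached.
check_dc() {
    dc_host="$1"
    port="${2:-389}"    # LDAP port from the post; override if needed
    if ! nslookup "$dc_host" >/dev/null 2>&1; then
        echo "DNS lookup failed for $dc_host"
        return 1
    fi
    # -z: connect without sending data; -w 3: give up after 3 seconds
    # (both flags are assumed to be supported by the host's nc)
    if ! nc -z -w 3 "$dc_host" "$port" >/dev/null 2>&1; then
        echo "Cannot reach $dc_host on TCP port $port"
        return 2
    fi
    echo "$dc_host resolves and accepts connections on TCP port $port"
}
```

It would be called with the name of a domain controller, for example `check_dc dc01.domain.wew`, where dc01 is a placeholder hostname.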

cos11
Contributor

Try

/etc/init.d/lwsmd start

and then try joining again.

You might also want to enable verbose logging:

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=102655...

bheggem
Contributor

We have been experiencing this problem. As a workaround, here is what we've done, and it has worked consistently thus far:

/usr/lib/vmware/likewise/bin/lwsm set-log file /var/log/likewise.log
/usr/lib/vmware/likewise/bin/lwsm set-log-level info
/etc/init.d/lwsmd stop
/etc/init.d/lwsmd start
/usr/lib/vmware/likewise/bin/domainjoin-cli join domain administrator password

VIR2AL3X
Enthusiast

I am currently having this same issue and am unable to resolve it using the method provided. I would gladly welcome some assistance to get this working in my nested lab setup.

DavidPeyton
Contributor

One other thing to validate is your /etc/hosts file. If your FQDN is not there this could also be causing an issue.

We were getting this error when joining to the Domain:

Error: LW_ERROR_INVALID_MESSAGE [code 0x00009c46]

The Inter Process message is invalid


-----


The reason was that the FQDN was not in the /etc/hosts file:

# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1            localhost.localdomain      localhost
::1                  localhost.localdomain      localhost
172.0.0.127          <name.domain.local>        <name>
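To check for this before attempting a join, a small helper along these lines could be used. It is a sketch only: hosts_has_fqdn is a hypothetical name, and it assumes awk is available on the host.

```shell
# hosts_has_fqdn FQDN [HOSTS_FILE]
# Succeeds (exit 0) if FQDN appears as a hostname field on a
# non-comment line of the hosts file, and fails (exit 1) otherwise.
hosts_has_fqdn() {
    fqdn="$1"
    hosts_file="${2:-/etc/hosts}"
    awk -v name="$fqdn" '
        /^[[:space:]]*#/ { next }    # skip comment lines
        # field 1 is the address; compare every following field
        { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit !found }
    ' "$hosts_file"
}
```

For example, `hosts_has_fqdn name.domain.local || echo "FQDN missing from /etc/hosts"`, with name.domain.local replaced by the host's real FQDN.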

Shamsher0
Contributor

Hi, please enter your AD username in the form "username@yrdomain.com" when joining the ESXi server to the domain. Good luck!

OmVM
Contributor

This has resolved my issue 
