VMware Cloud Community
jrhaakenson
Enthusiast

VCSA 7.0.2 Can't Join Active Directory (AD) Domain

I recently upgraded to VCSA 7.0.2-17694817 and my Active Directory (AD) communication stopped working.  Below are the troubleshooting steps I've taken:

- I've verified the AD DNS servers are configured properly on my VCSA.  The VCSA is resolving domain entries correctly using nslookup and I can ping both my DC and domain name.

- I've followed all the guides for leaving/joining AD via the VCSA CLI.  I've left the domain using /opt/likewise/bin/domainjoin-cli leave.

- I've removed the VCSA AD instance in my Windows Server 2016 DC.  I've verified I'm using a domain administrator account with the correct password and the account is unlocked.

- When I attempt to join the AD from the VCSA CLI using /opt/likewise/bin/domainjoin-cli join domainname username password I receive ERROR_GEN_FAILURE [code 0x0000001f] which suggests SMBv2 must be enabled. 

- I've followed the guides to enable SMBv2 on VCSA using /opt/likewise/bin/lwregshell add_value '[HKEY_THIS_MACHINE\Services\lwio\Parameters\Drivers\rdr]' Smb2Enabled REG_DWORD 1

- I've verified SMBv1 is disabled on my Windows 2016 DC and SMBv2 is enabled.  Registry location HKLM\SYSTEM\CurrentControlSet\Services\LanmanServer\Parameters\

- I've verified there are no issues with my DC Windows Firewall or McAfee Host Intrusion Prevention Firewall blocking the connection.  No threat events reported in any Antivirus applications.  Furthermore these devices are on the same subnet so there is no network firewall or IDS to consider either.

I'm now at a loss as to why my VCSA will not join my AD.  I continue to receive ERROR_GEN_FAILURE [code 0x0000001f] when attempting to join via the CLI, and Idm client exception: Error trying to join AD, error code [31] when attempting to join via the vSphere web client.  Please help.
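
Note that the two error codes above are the same value in different notations: hexadecimal 0x1f is decimal 31, which is the Win32 code for ERROR_GEN_FAILURE. A minimal sketch to confirm the conversion (hex_to_dec is a hypothetical helper name, not part of any Likewise or VMware tooling):

```shell
# Hypothetical helper: print a hex error code as a decimal number.
hex_to_dec() {
    printf '%d\n' "$1"
}

# The CLI reports 0x0000001f; the web client reports 31.
hex_to_dec 0x0000001f   # prints 31
```

So the CLI and the web client are most likely reporting one failure, not two separate ones.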

HotHotPocket
Contributor

Same issue for a brand new installation of 7.0.2.  I've tried all of these same steps following all of the guides.  Only setup difference is Server 2019 instead of 2016; however, the forest / domain tree levels are 2012.

Still looking for answers, let me know if you find a resolution.

 

Sanooj_aj
VMware Employee

Check if this helps 

https://kb.vmware.com/s/article/77531?lang=en_US

Also, IWA (Integrated Windows Authentication) identity sources are deprecated in vCenter 7.0 (though still available) and will be discontinued in a future release. You can read about it here - https://kb.vmware.com/s/article/78506

 

Let me know if the problem was related to port 445 connectivity

HotHotPocket
Contributor

Thanks for the update regarding IWA.

Port 445 reports "Connected" according to the guide in article 77531.

jrhaakenson
Enthusiast

Likewise if you find an answer.  I still haven't found one yet.  Thanks.

Sanooj_aj
VMware Employee

Thank you for testing that out 

The next thing I would recommend is to enable verbose Likewise logging for the domainjoin-cli action, like this:

/opt/likewise/bin/domainjoin-cli --loglevel verbose --logfile /var/log/domain.log join <domain name> <user>

Once the failure happens, take a look at log file /var/log/domain.log - see if it gives any specific details around the failure. 
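
If the log is long, a small filter can pull out just the error lines. This is only a sketch: show_join_errors is a hypothetical helper name, and the sample lines in the demo are illustrative stand-ins for whatever your own log contains.

```shell
# Hypothetical helper: print every line containing ERROR, with its line number.
# Defaults to the log path used in the command above.
show_join_errors() {
    logfile=${1:-/var/log/domain.log}
    grep -n 'ERROR' "$logfile"
}

# Demo on a throwaway sample file rather than the real log.
sample=$(mktemp)
printf 'INFO:Finishing krb5.conf configuration\nERROR:ERROR_GEN_FAILURE [ERROR_GEN_FAILURE]\n' > "$sample"
show_join_errors "$sample"
rm -f "$sample"
```

On the appliance itself you would call show_join_errors with no argument after a failed join attempt.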

jrhaakenson
Enthusiast

In my /var/log/domain.log file the last few entries are as follows:

INFO:Writing krb5 file /tmp/likewisetmpSZxuYO/etc/krb5.conf

INFO:File /tmp/likewisetmpSZxuYO/etc/krb5.conf modified

INFO:Finishing krb5.conf configuration

ERROR:ERROR_GEN_FAILURE [ERROR_GEN_FAILURE]

Stack Trace:

../domainjoin/domainjoin-cli/src/main.c:962

../domainjoin/domainjoin-cli/src/main.c:511

../domainjoin/libdomainjoin/src/djmodule.c:344

../domainjoin/libdomainjoin/src/djauthinfo.c:721

../domainjoin/libdomainjoin/src/djauthinfo.c:1227

My /etc/krb5.conf file looks correct for the record.  What is this ERROR_GEN_FAILURE error the logs are reporting?

HotHotPocket
Contributor

Nothing obvious I saw in the logs either.  However, I found the source of my issue!

My setup is on a non-internet-facing private network with a public time source.  This is a brand new system and the DC (which is virtualized) had not had its time properly set.  The time on the vCSA was close to actual time.  The difference in time between these two machines was over 10 hours.  After pointing the DC to a local NTP source (so that the time difference between the DC and vCSA was small), the vCSA was able to join the domain immediately.

I've run into this issue in the past when joining hosts to domains with huge differences in time; I'm upset I didn't catch it earlier.

I'm hoping this is your problem too.

jrhaakenson
Enthusiast

Yes. I was able to fix my AD joining issue by synchronizing the time correctly across the board.  Since I don't have a valid NTP server to use for the ESXi host, I had the ESXi host using the Domain Controllers as an NTP server.  Syncing a host with a VM running on that host is generally not a best practice.  As a result, the ESXi host's time was wrong, and VMs were pulling time from the host rather than from the Domain Controllers.  This included my VCSA.  It's not an issue with the Windows VMs, because they sync time correctly with the Domain Controllers via their Group Policy settings.  But the VCSA wasn't set to synchronize with the Domain Controllers, so it was pulling the incorrect time from the ESXi host.  After changing the host to manual time, I set the VCSA to synchronize with the Domain Controllers.  Once the VCSA's time was synchronized with the Domain Controllers, I was able to join AD, restart, and log in with my AD accounts once again.

I think what sent me in the wrong direction in the first place was that my VCSA time was close (maybe about 15-30 minutes off), not 10 hours off as you experienced, so I didn't suspect time issues initially.  At any rate, time synchronization between the Domain Controllers and the VCSA was the cause of this issue.  Thanks for your contribution.
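
For anyone checking the same thing: Kerberos, which AD uses for authentication, tolerates only about 5 minutes of clock difference by default, so even a 15-30 minute offset is enough to break a join. Below is a minimal sketch of that comparison. clock_skew_ok is a hypothetical helper name, the 300-second limit is the usual default and may be set differently in your domain, and the two epoch values in the example are made up. How you obtain the DC's time is up to you; on the VCSA side, `date -u +%s` gives the local epoch seconds.

```shell
# Hypothetical helper: compare two epoch timestamps (in seconds).
# Prints the absolute skew and returns success only if it is within the limit.
# Usage: clock_skew_ok LOCAL_EPOCH REMOTE_EPOCH [MAX_SKEW_SECONDS]
clock_skew_ok() {
    max_skew=${3:-300}   # assumed default: 5 minutes
    diff=$(( $1 - $2 ))
    if [ "$diff" -lt 0 ]; then
        diff=$(( 0 - diff ))
    fi
    echo "skew=${diff}s"
    [ "$diff" -le "$max_skew" ]
}

# Example with made-up values that are 20 minutes apart.
clock_skew_ok 1620000000 1620001200 || echo "skew too large for Kerberos"
```

With the example values this reports a skew of 1200 seconds and flags it as too large, which matches the kind of offset that caused the join failure here.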

Safarinow
Contributor

This resolved the AD join issue we were experiencing. It's best to either use your AD as the NTP server or utilize the same NTP endpoint as your AD servers. Another important point is that if you're running a test environment and those hosts don't have internet access, then it's advisable to use AD as your NTP server or whichever server you're using to set the time on your network. Big Thanks 
