My environment consists of three PSCs (Windows 2012 R2) in three physical sites. All three PSCs were recently upgraded from “6.0.0b” to “6.0U1b” without an issue.
This week I attempted to add a new PSC into the environment. The installation of the fourth PSC failed and I was left with the same stale PSC entry.
Attempting to manually remove the failed PSC using cmsso-util and vdcleavefed resulted in the same error as described in the following post: https://communities.vmware.com/thread/520219?start=0&tstart=0
I was finally able to install the fourth PSC by using a previous version of the vCenter installer, specifically installer build number 2800571 / “vCenter Server 6.0.0b”.
Hope this helps.
Also, I found it's possible to remove this stale PSC entry by using an LDAP editor, e.g. JXplorer, pointed at a working PSC.
When you open the LDAP tree, you can remove the PSC under the Domain Controllers section, and it will disappear from the Web Client view. But this is NOT recommended!
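For anyone who only wants to look at that part of the tree in an LDAP browser (not delete anything), the location can be worked out from the SSO domain name. A minimal Python sketch follows. It assumes the default SSO domain vsphere.local and assumes the PSC entries live in an OU called "Domain Controllers" directly under the domain base DN; both function names are made up for illustration, so verify the layout against your own directory.

```python
def domain_to_base_dn(domain: str) -> str:
    """Turn an SSO domain such as 'vsphere.local' into 'dc=vsphere,dc=local'."""
    return ",".join(f"dc={label}" for label in domain.split("."))


def domain_controllers_dn(domain: str) -> str:
    """Build the DN of the Domain Controllers container for an SSO domain."""
    # Assumption: the container is an OU named "Domain Controllers"
    # sitting directly under the domain's base DN.
    return f"ou=Domain Controllers,{domain_to_base_dn(domain)}"


if __name__ == "__main__":
    # 'vsphere.local' is the default SSO domain; replace it with yours.
    print(domain_controllers_dn("vsphere.local"))
```

This only builds a string to paste into the browser's search base; it does not connect to or change anything.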
I found that when I run the vdcleavefed command, the hostname is case sensitive: SERVERNAME.domain.local needs to be typed instead of servername.domain.local.
Hope this helps someone out.
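If the hostname is being fed in from a script, the casing described above can be produced with a small helper. This is only a sketch in Python: the function name is hypothetical, and it assumes that only the host label needs uppercasing while the domain part stays as typed, which matches the example above but may not match how the node is registered in every environment.

```python
def uppercase_host_label(fqdn: str) -> str:
    """Return the FQDN with only the first label (the host name) uppercased."""
    # partition() splits at the first dot: 'servername', '.', 'domain.local'
    host, sep, domain = fqdn.partition(".")
    return host.upper() + sep + domain


if __name__ == "__main__":
    print(uppercase_host_label("servername.domain.local"))
```

If neither casing works, check how the name appears in the Web Client node list and type it exactly that way.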
Here is the scenario in which I encountered this error, and how it was resolved. We have a two-site domain, vsphere.local, and each site has a separate Platform Services Controller (PSC) and vCenter server. At the time, all were at version 6.0 Update 1b.
1. I attempted to deploy a third site using the virtual appliance version, joining the existing domain. I chose to deploy the new appliance as an integrated PSC and vCenter. During the deployment, the configuration failed during the vdcpromo operation, and I got the error "Firstboot Script Execution Error - Failed to run vdcpromo".
2. After a lot of searching, I attempted to deploy the VCSA again as a vCenter server only, using my Windows-based PSC (PSC#1) as an external PSC. This failed with the error "TOO_MANY_NAMES", which basically meant that the failed registration of this name was still in the domain.
3. Searched again and found the VMware KB article "Using the cmsso command to unregister vCenter Server from Single Sign-On" (2106736). Tried to use this KB to remove the object, but it kept giving me another error: "leave federation cleanup failed. error - confidentiality required". I was unable to use cmsso-util to clean up the object.
4. Used JXplorer to check the LDAP domain, and found that under Domain Controllers there was an object for the failed installation. I deleted the object from Domain Controllers manually, and the object disappeared from the Web Client under Administration->Configuration->Nodes. This didn't help beyond removing it from the list.
5. Next, I attempted the installation using Windows-based machines. I stood up a third external PSC, joining the domain against the first PSC (PSC#1).
6. Then I set up a new Windows machine as a vCenter server only. I pointed this new installation at PSC#3, but the installation failed again with the error "error 1 join vmdir failed".
(The ghost of the failed first VCSA install is still haunting me I think.)
7. I verified that the object was still not listed under Configuration->Nodes, so I attempted to install on Windows again as an integrated PSC and vCenter server, joining the domain as a new site. This too failed with the same "error 1 join vmdir failed".
8. Stopped here, and upgraded all four existing servers from Update 1b to Update 2. These were PSC#1, PSC#2, and both vCenter servers. I should also mention that at no time did I use or touch my second site (PSC#2 and vCenter #2).
At this point, I had seen several online posts mention that basically reinstalling the PSC and vCenter exactly as it had been installed, before manual removal/failure, allowed it to be cleanly uninstalled.
9. With this in mind, my next installation attempt was to install a vCenter and PSC combo, just as my first installation had been using the VCSA. I chose the same PSC to join the domain with (PSC#1), used the same site name and server name as the appliance had, and got a different error. Okay, at least now I was making some progress. The new error was "Error while configuring vSphere Auto Deploy Waiter: Auto Deploy register Exception".
Since 6.0 now has Auto Deploy bundled with the vCenter server for Windows, and cannot be installed separately, I figured I would try again, this time just installing a PSC with no vCenter.
10. At this point I installed just the Platform Services Controller role on the server that would eventually be vCenter only, pointing it at PSC#1 again, and it joined fine with the installation completing. The failed appliance was now a functional PSC on Windows. I could then remove the Platform Services Controller from the server by re-running the "autorun" application from the ISO, and it uninstalled cleanly.
11. Re-ran the installer, choosing only vCenter, pointed at PSC#3 for my new (3rd) site, and it installed just fine.
OK, so this was long-winded, but I had attempted everything I found online, and none of it seemed to work. I had several gotchas that were different from what others had seen. What I think happened is that the original VCSA deployment (embedded PSC and vCenter) failed to join the Windows-based domain, so the promotion of the additional PSC failed to complete. It left the object behind and wouldn't allow a new installation. Rebuilding just a PSC with that name on Windows, joined to the same original PSC, allowed the install to go through and finally finish the vdcpromo.
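For quick reference, the errors hit along the way can be collected into a small lookup, sketched here in Python. The notes are paraphrased from the steps above and are one person's experience, not VMware documentation; the dictionary and function names are hypothetical.

```python
from typing import List

# Error fragment (lowercase) -> what it turned out to mean in the steps above.
KNOWN_ERRORS = {
    "failed to run vdcpromo":
        "Step 1: embedded VCSA deployment failed during vdcpromo and left a stale object.",
    "too_many_names":
        "Step 2: the name from the failed registration was still in the domain.",
    "confidentiality required":
        "Step 3: cmsso-util could not clean up the stale object.",
    "join vmdir failed":
        "Steps 6-7: new Windows installs still failed to join; suspected leftover from the first install.",
    "auto deploy register exception":
        "Step 9: seen when reinstalling PSC + vCenter with the original names; a PSC-only install then succeeded.",
}


def triage(message: str) -> List[str]:
    """Return the notes for every known error fragment found in the message."""
    lowered = message.lower()
    return [note for fragment, note in KNOWN_ERRORS.items() if fragment in lowered]


if __name__ == "__main__":
    for note in triage("Installation failed: error 1 join vmdir failed"):
        print(note)
```

Matching is a plain case-insensitive substring check, so it can be pointed at a line copied from an installer log or dialog.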
If there is a cleaner way to do this, I didn't find anything else to be helpful. Here were the links of basically ALL the stuff I used to try to wade through this brown river....
I hope all of this can help someone else so it is less painful....