We're running an ESX 3.0.2/VC 2.0.1 environment, and experienced what will be politely termed an 'event' with our storage that resulted in 3 of the ESX servers being hard shut down. We have since repaired the storage and cleanly booted the ESX servers.
However, since the crash, we have not been able to manage the ESX servers via VC. I can telnet to port 902 on the ESX servers from the VC by DNS name, and the ESX servers appear to be resolving the VC properly. I have cleanly rebooted one of the ESX servers, and have restarted the vmware-vpxa and mgmt-vmware services, but it hasn't improved matters. We can still manage each ESX server directly using the Infrastructure Client.
When we try to add the host into VC either by name or by IP, the Add Host wizard comes up, but returns the error "Request Timed Out" after we enter the address of the ESX server and our credentials.
Here is a snippet from the logs on the VC:
Authd error: 510 Could not create lock for vmware-vpxa
Failed to connect to host 10.10.19.50:902. Check that authd is running correctly (lib/connect error 11)
-- FINISH task-internal-1035 -- datacenter-2 -- vim.Datacenter.queryConnectionInfo
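If the VC log is long, it can help to pull out just the authd and connect failures. The following is a minimal Python sketch, assuming the log lines look like the three quoted above; the patterns and the function name `summarize_connect_failures` are made up for illustration and are not part of any VMware tool.

```python
import re

# Patterns are modeled only on the log lines quoted in this post;
# other VirtualCenter log formats may differ (assumption).
AUTHD_ERROR = re.compile(r"Authd error: (\d+) (.+)")
CONNECT_FAIL = re.compile(
    r"Failed to connect to host ([\d.]+):(\d+)\..*\(lib/connect error (\d+)\)"
)


def summarize_connect_failures(lines):
    """Return a list of dicts, one per authd or connect failure line found."""
    findings = []
    for line in lines:
        match = AUTHD_ERROR.search(line)
        if match:
            findings.append({
                "kind": "authd",
                "code": int(match.group(1)),
                "detail": match.group(2).strip(),
            })
            continue
        match = CONNECT_FAIL.search(line)
        if match:
            findings.append({
                "kind": "connect",
                "host": match.group(1),
                "port": int(match.group(2)),
                "code": int(match.group(3)),
            })
    return findings


if __name__ == "__main__":
    sample = [
        "Authd error: 510 Could not create lock for vmware-vpxa",
        "Failed to connect to host 10.10.19.50:902. Check that authd is "
        "running correctly (lib/connect error 11)",
    ]
    for finding in summarize_connect_failures(sample):
        print(finding)
```

Grouping the results by host would show whether all three crashed servers report the same authd error.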
Here is telnetting to port 902 on that same server:
220 VMware Authentication Daemon Version 1.10: SSL Required, ServerDaemonProtocol:SOAP, MKSDisplayProtocol:VNC
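The same check can be scripted so it is easy to repeat against each host. This is a rough Python sketch, assuming only that the daemon sends a greeting like the one above as soon as the TCP connection opens; `read_banner` and `looks_like_authd` are hypothetical helper names. Note that it only proves the port is reachable and authd answers, not that the later SSL and login steps succeed.

```python
import socket


def read_banner(host, port=902, timeout=5.0):
    """Connect to host:port and return the first data the server sends
    (up to 1024 bytes, decoded and stripped), or None on any failure."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            data = sock.recv(1024)
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return None
    text = data.decode("ascii", errors="replace").strip()
    return text or None


def looks_like_authd(banner):
    """True if the banner starts like the authd greeting quoted in this post."""
    return banner is not None and banner.startswith(
        "220 VMware Authentication Daemon"
    )


if __name__ == "__main__":
    # 10.10.19.50 is the host address from the log snippet above.
    banner = read_banner("10.10.19.50", 902)
    print(banner, looks_like_authd(banner))
```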
Any ideas?
Thanks,
-matt
See if restarting the VirtualCenter service on the VC server helps.
Check the log /var/log/vmware/vpx/vpxa.log; my guess is that there is already a vpxuser account present (from before 'the event').
On the ESX servers that crashed, you might try deleting the vpxuser account. When you then add the ESX servers to the VirtualCenter, the vpxuser will be recreated.
Make sure the /tmp/vmware-root directory still exists on the host. A reboot might have cleared the /tmp directory.
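A quick way to verify this on the service console is a small script like the one below. It is only a sketch: it assumes a Python interpreter is available on the host, and `check_runtime_dir` is a made-up helper name. It reports the state of the path and does not try to recreate the directory, since the thread does not state the expected ownership or permissions.

```python
import os


def check_runtime_dir(path="/tmp/vmware-root"):
    """Return 'ok', 'missing', or 'not-a-directory' for the given path."""
    if not os.path.lexists(path):
        return "missing"
    if not os.path.isdir(path):
        return "not-a-directory"
    return "ok"


if __name__ == "__main__":
    print("/tmp/vmware-root:", check_runtime_dir())
```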
No luck. Two odd observations though:
If I enter incorrect credentials in the Add Host Wizard, it comes back right away and says that the password was no good.
Also, if I directly manage the ESX host using the client, and try to console into one of the virtual servers, it will time out and crash the client.
-matt
It almost sounds as though there is a firewall issue.
The inability to open a remote console at least indicates that port 903 is not open.
You mentioned that you were able to telnet to port 902, but this still sounds like a firewall issue (especially since, if I understand correctly, there are 3 ESX servers all experiencing the same problem).
I assume you have removed the hosts from your VC inventory first, haven't you?
Following up to this:
I have deleted the vpxuser - no luck
I have restarted the VC - no love
I can telnet to port 903 on the ESX host, and the client doesn't actually give me an error message when I try to console into a server; it just hangs and crashes.
Man, you have yourself a dilly-of-a-pickle.
Yeah, I'm not real thrilled. I'm tempted to re-install ESX on one of the servers, and see if that resolves the issue. It will just be a major hassle to rebuild all the VMs, and because I don't have VMotion available there will be considerable downtime.
First off, I hope my last comment didn't sound insensitive to your situation (if it did, I apologize and I'll edit it shortly).
Rebuilding one of the ESX servers might be your quickest option at this point, although if someone else knows an easy fix I'll be very happy to be wrong.
You shouldn't have to rebuild any of the VMs; if you re-install ESX, just make sure you "check" the box for the option to leave VMFS volumes and VMs intact.
If you have sufficient storage, you can also cold-migrate the VMs to another ESX host before you re-install ESX.
"Rebuilding one of the ESX servers might be your quickest option at this point, although if someone else knows an easy fix I'll be very happy to be wrong."
Agreed.
virtualdud3 - nope, I wasn't irritated at you, just the situation.
I was looking for information on how to manually un-install the management agent from the ESX server, such that VC needs to re-deploy it. I'll update further if I find out anything.
I think jayolsen might be onto something there...
Man, first sentence of the original question!!!
Update to this - I upgraded VC to 2.0.2, and still experience the same problem. I'm also still experiencing the same problem with the console access when I'm using the 2.0.2 client.
Might be time to open a support case.
I agree with jayolsen...
doh, good catch.