VMware Cloud Community
Allsopp
Contributor

Emergency - All hosts lost connection after VC rebuild

I had a hardware failure on the VC server. The database had just been backed up, so I built a new VC server on new hardware, imported the database, and installed VC 2.0.1 using the existing database.

(I will be upgrading to 2.5 as soon as this works.) The client comes up but shows all three of my hosts as disconnected. I have gone into one host via the console and restarted the mgmt-vmware and vmware-vpxa services, with no effect. All services on the VC server are running.
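For anyone repeating the restart step described above, here is a minimal sketch of it in Python. The two service names come from this post; the use of the `service` wrapper on the host console and the helper names `restart_commands` and `restart_all` are assumptions for illustration only. It defaults to a dry run so nothing is restarted by accident.

```python
import subprocess

# Service names are taken from the post; the `service` wrapper is
# assumed to be available on the host's console.
SERVICES = ["mgmt-vmware", "vmware-vpxa"]


def restart_commands(services=SERVICES):
    """Build one `service <name> restart` argument list per service."""
    return [["service", name, "restart"] for name in services]


def restart_all(services=SERVICES, dry_run=True):
    """Return the restart commands; run them in order unless dry_run is set."""
    commands = restart_commands(services)
    if not dry_run:
        for argv in commands:
            # check_call raises CalledProcessError if a restart fails
            subprocess.check_call(argv)
    return commands
```

Calling `restart_all()` only returns the command lists; pass `dry_run=False` on the host itself to actually run them.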

Short of disconnecting and reconnecting the hosts through the client, is there anything else I can try?

1 Solution

Accepted Solutions
HyperSprite
Enthusiast

Without the SSL certs from your old VC, you will need to disconnect and reconnect your hosts.

See http://www.vmware.com/resources/techresources/1025

4 Replies
jhanekom
Virtuoso

I don't believe VC 2.0.x performs SSL cert checking...?

I do know that each ESX host keeps a record of the IP address of the server that manages it, however, so if that has changed you need to amend /etc/vmware/vpxa.cfg on the host.

There is also an instance ID on the VC host (under one of the Administration menus, if memory serves), but I cannot seem to spot anything obvious on the ESX side that needs to match that.
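To illustrate the vpxa.cfg change mentioned above, here is a rough Python sketch that swaps the management server address in the file's text. It assumes the address lives in a single `<serverIp>` element, which is an assumption about the file layout that you should check against your own vpxa.cfg; the helper name `replace_server_ip` is hypothetical. Work on a backup copy first.

```python
import re


def replace_server_ip(cfg_text, new_ip, tag="serverIp"):
    """Return cfg_text with the contents of <tag>...</tag> set to new_ip.

    The default tag name "serverIp" is an assumption about the vpxa.cfg
    layout; verify it in your own file before relying on this.
    """
    escaped = re.escape(tag)
    pattern = re.compile(r"(<%s>)([^<]*)(</%s>)" % (escaped, escaped))
    # Use a function as the replacement so new_ip is inserted literally.
    new_text, count = pattern.subn(
        lambda m: m.group(1) + new_ip + m.group(3), cfg_text
    )
    if count != 1:
        raise ValueError(
            "expected exactly one <%s> element, found %d" % (tag, count)
        )
    return new_text
```

The function refuses to guess: if the element is missing or appears more than once, it raises an error instead of writing a partial change. The agent services on the host would presumably need a restart afterwards to pick up the new value.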

HyperSprite
Enthusiast

The SSL certs are needed to decrypt host passwords in the database.

If you have the SSL certs and change vpxa.cfg (or use the old IP on the new server), it will all work like it did before without any other intervention.

Allsopp
Contributor

I just reconnected the hosts. All configuration information was preserved, which was what I was concerned about.

Lots of other errors happened during this upgrade. Here is a summary for those who might be doing the same thing.

- Disable HA before the upgrade. This will prevent a lot of errors and service restarts.

- I had one host where the error "DIsable firewall failed: vmodl.fault.SystemError" came up, and soon after the host dropped its connection entirely. No ping, nothing. I couldn't find this anywhere in the forums, so I took a chance and restarted the firewall service from /etc/rc.d/rc3.d. Once that was started, a simple reconnect worked and I was able to reconfigure HA without issues.

My next step is to upgrade each host from 3.0.1 to 3.5. I'm not anticipating any problems here since all virtual machines will be migrated off. If it bombs and I need to do a complete re-install, it will be a pain but not world-ending.
