Ryan_D
Contributor

vCenter loses connection with host consistently.

Has anyone had an issue with vCenter consistently losing connection to their hosts? Here is my setup:

vCenter Server on a physical machine, managing the following:

3 vSphere ESX 4.1 servers locally (on the same subnet)

1 VMware ESX 3.5 server at a remote location (different subnet)

1 VMware ESX 4.1 server in a datacenter (public IP, also a different subnet)

The 3 local servers and the one 3.5 server work just fine. The problem is the server in the datacenter: I can connect vCenter to that host, but it only stays connected for 30-45 seconds, 60 seconds tops. One time it stayed connected long enough for me to add it to the cluster and upload the license, and 10 seconds later it dropped.

From the same vCenter server I can connect directly to the host in the datacenter with the standard VI Client, and it will stay connected until I disconnect it (days at a time).

Just for testing, I created a VM on the host in the datacenter and installed vCenter on that VM. It stays connected as it should, which leads me to believe there is some kind of issue with vCenter staying connected to a host over the internet. (That doesn't make sense, though, because the ESX 3.5 host works just fine over the internet and has not lost connection since the beginning.) Maybe a firewall issue? But if it were a firewall issue, wouldn't vCenter never be able to connect at all?

This is not just a host issue or a problem with 4.1: I had ESX 3.0 installed in the datacenter until last weekend, when we upgraded, and the 3.0 host had the exact same issue for over a year.

Any assistance anyone can provide would be greatly appreciated.

3 Replies
logiboy123
Expert

I'm rather convinced this is a firewall issue.

There are a lot of ports that must be open in both directions for full connectivity. It could be that the management ports 902 and 903 are open TO the ESX box, but not from the ESX box back to vCenter. I had issues similar to what you describe, and it turned out to be inadequate network communication: the correct ports were not open, so the heartbeat was not working. The box could be added to vCenter, but soon afterwards it would disconnect.

One way to test this before making changes to your networking and firewall rules would be to modify your heartbeat value and see whether this impacts the environment. Also, if you have a firewall enabled on the vCenter server, disable it for testing purposes.
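
For a quick first look at which direction is blocked, a small probe along these lines might help. This is only a minimal Python sketch, not a VMware tool: `probe_tcp_ports` is a name I made up, and the host name in the usage comment is a placeholder. It only tests outbound TCP connections from whatever machine runs it, so it says nothing about UDP traffic or about connections initiated from the ESX box back to vCenter.

```python
import socket


def probe_tcp_ports(host, ports, timeout=3.0):
    """Try a TCP connection to each port; return {port: True/False}."""
    results = {}
    for port in ports:
        try:
            # create_connection resolves the host and attempts a full TCP handshake.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # Covers refused connections, timeouts, and name resolution failures.
            results[port] = False
    return results


# Hypothetical usage, run from the vCenter server
# ("esx-colo.example.com" is a placeholder, not a real host):
#   for port, ok in probe_tcp_ports("esx-colo.example.com", [902, 903]).items():
#       print(port, "reachable" if ok else "blocked or closed")
```

If both ports show as reachable from the vCenter side and the host still drops, that would point more towards the return path from the host.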

Further reading:

Ryan_D
Contributor

What you are saying makes sense, but just to test I added an any/any rule to my subnet here at the office from any IP in the block at the colo, and it doesn't appear that anything even tried to connect, let alone got blocked. So I am not sure the firewall theory makes sense unless I can see some kind of traffic at least trying to hit my side...

Any other ideas?

logiboy123
Expert

This may help:

When you say this host is in a datacenter, what do you actually mean? Is this an offsite hosted solution?

Is this host in a cluster or standalone configuration?

If the virtual machines are working just fine but you can't manage the box, that points to a networking issue at the management layer, which can take the form of:

1) DNS issues.

2) Loopback incorrectly set. For example, the hostname is host2.domain.com, but in some of the config files on the console it is set to something else, like host3.domain.com.

3) Firewalls. Turn off the firewall on the vCenter server and allow an any/any rule for host-to-vCenter communications. If your box is at a hosting facility, chances are you will need to ask the facility to add ports 80, 443, 902 and 903 to your allowed communications.
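
Points 1 and 2 can be sanity-checked from the vCenter server with a forward and reverse lookup. The following is a rough Python sketch under my own assumptions: `names_match` and `check_name_consistency` are hypothetical helper names, and it only sees what the resolver on the machine running it returns. It does not look at the config files on the ESX console.

```python
import socket


def names_match(expected, actual):
    """Compare two host names, ignoring case and a trailing dot."""
    return expected.rstrip(".").lower() == actual.rstrip(".").lower()


def check_name_consistency(hostname):
    """Resolve hostname to an address, reverse-resolve it, and compare names."""
    result = {"hostname": hostname, "address": None,
              "reverse": None, "consistent": False}
    try:
        result["address"] = socket.gethostbyname(hostname)
        # gethostbyaddr returns (primary_name, aliases, addresses).
        result["reverse"] = socket.gethostbyaddr(result["address"])[0]
    except OSError as exc:
        # Forward or reverse lookup failed; record why and stop here.
        result["error"] = str(exc)
        return result
    result["consistent"] = names_match(hostname, result["reverse"])
    return result


# Hypothetical usage (host2.domain.com is the example name from point 2):
#   print(check_name_consistency("host2.domain.com"))
```

If the reverse name comes back as something other than what you added the host as, that mismatch is worth fixing before digging further into the firewall.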

Regards,

Paul Kelly
