VMware Cloud Community
ehaniffa
Contributor

Failed to connect to host

Hello all

We've just implemented our production VMware ESX environment. We have two HP c7000 blade chassis with 8 BL465 blades in each, plus 32 GB of RAM per blade. On the back end we have a NetApp 3020 cluster as our SAN, connected to the c7000s via FC. All 16 blades are in one cluster, and we are in the process of cloning the servers from our proof-of-concept ESX server to the new cluster.

We managed to move about 5 VMs to it, but now every time I try to clone another one to the ESX cluster we get the "failed to connect to host" error. Tech support has been troubleshooting but has not found anything yet. I've also sent them the logs from some of our ESX servers for review. Has anyone experienced this? Any help would be greatly appreciated.

thanks

mike_laspina
Champion

Hello,

The key is that "failed to connect to host" affects all hosts in your case.

Logically, the root cause must be centered on one component that can affect all hosts.

I would start by ruling out two major areas.

1) VC <--> host management plane - Use the client to connect directly to a host; this will eliminate VC.

2) If connecting directly does not work, then from a console issue a vmkping between two host IPs on the Service Console network - this will eliminate network configuration issues, e.g. VLAN, pSwitch, vSwitch ...
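
Step 2 could be looped over several hosts with something like the sketch below, assuming a POSIX shell on the console. The function name check_hosts and the PING_CMD override are hypothetical names introduced here for illustration, and the addresses in the usage comment are placeholders, not values from this thread.

```shell
#!/bin/sh
# Sketch: send one ping to each host and report which ones answer.
# PING_CMD is a hypothetical override (defaults to plain ping) so that
# another ping-like tool taking "-c <count>" can be substituted.
check_hosts() {
    ping_cmd="${PING_CMD:-ping}"
    failed=0
    for host in "$@"; do
        if $ping_cmd -c 1 "$host" >/dev/null 2>&1; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done
    # Non-zero exit status if at least one host did not answer.
    return "$failed"
}

# Usage with placeholder addresses (not run here):
# check_hosts 192.0.2.11 192.0.2.12
```

If every host answers, the network path between the consoles is probably fine and attention can move to VC.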

http://blog.laspina.ca/ vExpert 2009
ehaniffa
Contributor

Hi Mike,

Thanks for your response. I've been able to connect to all the hosts in my cluster directly from the VI client. I've also done the vmkping, and all hosts can ping each other. I can also log into one host and move data between the LUNs.

mike_laspina
Champion

Well that's good (sort of) 😮 lol

The root cause is very likely in VC, or in DNS resolution from VC, and possibly the license service.

Have a look in the VC system logs:

C:\Documents and Settings\All Users\Application Data\VMware\VMware VirtualCenter\Logs

This should reveal what action needs to happen next.

If you're not sure what you're seeing, you can post it.

http://blog.laspina.ca/ vExpert 2009
ehaniffa
Contributor

Yep, not sure what I'm seeing, so if you can help me read the log it would be great.

mike_laspina
Champion

Wow, a few things are happening.

Could you post the output of

esxcfg-info -n

I need to see the network configuration to figure this out.

It looks like you have two IPs per ESX host, and the server is responding on the IP that is not part of the SSL cert the host originally generated.
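
One way to see what name the host cert was actually issued for is to print its subject with openssl. This is only a sketch: the helper names cert_subject and cert_mentions are made up for illustration, and the path /etc/vmware/ssl/rui.crt in the usage comment is an assumption based on the rui.* files mentioned in this thread.

```shell
#!/bin/sh
# Sketch: print the subject line of a PEM certificate.
cert_subject() {
    openssl x509 -in "$1" -noout -subject
}

# cert_mentions NAME FILE: succeeds if NAME appears in the cert subject.
cert_mentions() {
    cert_subject "$2" | grep -F -q -- "$1"
}

# Usage on the host (assumed path, placeholder name, not run here):
# cert_subject /etc/vmware/ssl/rui.crt
# cert_mentions esx01.example.com /etc/vmware/ssl/rui.crt && echo "name is in the cert"
```

If the name or address VC uses to reach the host does not show up in the subject, that would be consistent with the cert errors in the log.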

http://blog.laspina.ca/ vExpert 2009
ehaniffa
Contributor

I was looking at the logs and that was going to be my next question. Should I just have one console? Right now I have two. Also, if the cert was tied to the original IP of the ESX host, how do I reset it to the new IP? Log attached.

mike_laspina
Champion
Accepted solution

Two consoles are a good thing for HA, DRS, etc. Having them on the same segment is not optimal, e.g. a routing/gateway failure takes both down. But let's not get into that on this thread.

Make sure the VC management IP has not changed from the original setting before doing the next steps.

For now, let's work with just the primary console IP that is resolved by DNS, and clean up the cert errors first.

Are you using PKI or just the default SSL certs that ESX generates?

If you're not using PKI, then do this.

Back up your existing cert:

mkdir /etc/vmware/ssl/bk

mv /etc/vmware/ssl/rui.* /etc/vmware/ssl/bk/

Then restart the management service, which should generate a fresh default cert:

service mgmt-vmware restart

On the VC, remove the host and add it back again, and see if you can manage it after that.
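
The backup step above could be wrapped in a small function so it can be tried against a scratch directory first. This is a sketch only: backup_certs is a hypothetical helper name, and the restart line is left as a comment because it should only be run as root on the ESX host itself.

```shell
#!/bin/sh
# Sketch: move rui.* out of an SSL directory into a bk/ subdirectory.
# The directory defaults to the path used in this thread.
backup_certs() {
    ssl_dir="${1:-/etc/vmware/ssl}"
    mkdir -p "$ssl_dir/bk" || return 1
    for f in "$ssl_dir"/rui.*; do
        [ -e "$f" ] || continue      # the glob matched nothing
        mv "$f" "$ssl_dir/bk/" || return 1
    done
}

# On the host, as root (not run here):
# backup_certs /etc/vmware/ssl && service mgmt-vmware restart
```

Keeping the old files in bk/ means they can be moved back if the new cert causes trouble.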

http://blog.laspina.ca/ vExpert 2009
mike_laspina
Champion

This could be a gateway problem. What is the current primary gateway for the VC?

http://blog.laspina.ca/ vExpert 2009
ehaniffa
Contributor

Hey man,

Fixed the issue, you pointed me in the right direction. Before I got your last post I started to disconnect my hosts from the VCMS, and when I tried to reconnect them they gave me a cert error. I ended up doing a repair on the VCMS, i.e. through Add/Remove Programs on the server. Once I did that I tried reconnecting the hosts again and still had a cert error. I then removed them from inventory and added them again, and boom, everything was working OK.

Now I have a new problem: when I try to install the Converter plug-in I get a "remote server reported (404 error)" or something like that.

mike_laspina
Champion

The plugin is downloaded from the VMware web site to the client PC and then installed. The 404 means the page was not found.

Do you have a proxy server?

http://blog.laspina.ca/ vExpert 2009
ehaniffa
Contributor

Forgot to mention that you need to go to the VCMS options from the Administration menu, click on SSL, and then select "Check Host Certificates". When I did this, it would not let me add hosts with cert issues. This was done before I did the repair on the VCMS; it was after I got those errors that I decided to repair the VCMS. No more "Failed to connect to host" errors.
