Hello,
I have a cluster of ESX hosts (3x ESX 3.5u4 and 1x ESX 4.0) connected by fibre to 2 storage arrays and managed via vCenter.
I have a VM on the ESX 4.0 host. When I try to use Storage vMotion to relocate a powered-on VM (with vmtools) to another LUN, I get an error:
"Cannot connect to host". Very informative...
I am only trying to relocate the VM to another LUN, not to another host (standard vMotion works fine).
Google has not helped at all.
Any hints?
I'd start by checking with these two KB articles:
vCenter displays error: Failed to connect to host (1010837)
Migrating/Clone fails from a local disk to a NFS share (1004586)
Rick Blythe
Social Media Specialist
VMware Inc.
Go to the console of one of those hosts and type vmkping -servername- (-servername- being the actual IP or hostname of your OTHER ESX host). What is the response?
Edit the network configuration for BOTH hosts: the vmkernel port should have vMotion enabled and an IP address assigned. Make SURE that both hosts can reach each other AND each other's vmkernel IP address.
Then we can go from there.
Thank you for your reply.
I checked this KB and it looks like my problem. After step 3 I had the right IP in vpxa.cfg.
But this did not help at all - I still get a "Cannot connect to host" error.
"BOTH hosts" - which is the second host?
Storage vMotion seems to work like this: VC -> ESXNODE -> DAS. As I understand it, it does not connect to any other ESX host.
BTW: vmkping works fine.
I checked the logs and found this (I have replaced the wrong IP with "1.1.1.1"):
Cannot connect to server 1.1.1.1:902: Connection refused
[2009-08-14 06:18:10.129 0xf7bd5b90 warning 'Libs'] NfcNewAuthdConnectionEx: Failed to connect to peer (numRetries=0). Error: Failed to connect to server 1.1.1.1:902
Cannot connect to server 1.1.1.1:902: Connection refused
[2009-08-14 06:18:10.130 0xf7bd5b90 warning 'Libs'] NfcNewAuthdConnectionEx: Failed to connect to peer (numRetries=1). Error: Failed to connect to server 1.1.1.1:902
Cannot connect to server 1.1.1.1:902: Connection refused
[2009-08-14 06:18:10.130 0xf7bd5b90 warning 'Libs'] NfcNewAuthdConnectionEx: Failed to connect to peer (numRetries=2). Error: Failed to connect to server 1.1.1.1:902
NfcNewAuthdConnectionEx: Failed to connect to peer. Error: Failed to connect to server 1.1.1.1:902
Unable to connect to NFC server: Failed to connect to server 1.1.1.1:902
BUT:
# grep -i 'hostip' /etc/opt/vmware/vpxa/vpxa.cfg
<hostIp>172.16.1.204</hostIp>
In our infrastructure, 1.1.1.1 is the NAT IP address of the gateway for all ESX hosts and VMs.
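The comparison between the address in the log and the hostIp value in vpxa.cfg can also be done mechanically. The following is only a rough Python sketch, assuming the log excerpt and the config contents have been copied into strings; the function names (failed_peers, configured_host_ip, mismatched_peers) are made up for illustration and are not part of any VMware tool.

```python
import re

# Sample text copied from the post; in practice it would be read from the
# host log and from /etc/opt/vmware/vpxa/vpxa.cfg.
LOG_TEXT = """NfcNewAuthdConnectionEx: Failed to connect to peer (numRetries=0).
Error: Failed to connect to server 1.1.1.1:902
Cannot connect to server 1.1.1.1:902: Connection refused
"""
CFG_TEXT = "<hostIp>172.16.1.204</hostIp>"


def failed_peers(log_text):
    """Return the set of (ip, port) pairs from 'Failed to connect to server' messages."""
    pattern = re.compile(r"Failed to connect to server (\d+\.\d+\.\d+\.\d+):(\d+)")
    return {(ip, int(port)) for ip, port in pattern.findall(log_text)}


def configured_host_ip(cfg_text):
    """Return the value of the <hostIp> element, or None if it is absent."""
    match = re.search(r"<hostIp>\s*([^<\s]+)\s*</hostIp>", cfg_text)
    return match.group(1) if match else None


def mismatched_peers(log_text, cfg_text):
    """Return the failed peers whose IP differs from the configured hostIp."""
    host_ip = configured_host_ip(cfg_text)
    return sorted(peer for peer in failed_peers(log_text) if peer[0] != host_ip)


if __name__ == "__main__":
    for ip, port in mismatched_peers(LOG_TEXT, CFG_TEXT):
        print(f"log shows {ip}:{port}, vpxa.cfg says {configured_host_ip(CFG_TEXT)}")
```

Any address this reports is one that the connection attempt used but that the host does not have configured, which is exactly the situation described here.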
So here is what we have:
1) After fixing the IP address in vpxa.cfg, the system still tries to connect to the wrong IP.
2) For unknown reasons the "wrong IP" is the NAT IP of the gateway. That seems very strange to me.
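Since vmkping only tests ping (ICMP) reachability, it may be worth checking separately whether TCP port 902 answers on a given address. A minimal Python sketch using only the standard library; the helper name tcp_port_open is hypothetical, and it has to be run from a machine on the same network path as the side that reports the error for the result to mean anything.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds, False otherwise."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and unreachable hosts.
        return False


# Example use (addresses taken from the posts above):
#   tcp_port_open("172.16.1.204", 902)   # the address in vpxa.cfg
#   tcp_port_open("1.1.1.1", 902)        # the address seen in the log
```

If the real host address accepts the connection and the NAT address refuses it, that matches the "Connection refused" lines in the log.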
I think I have tracked the problem down.
I found this in the vCenter DB:
select * from dbo.VPX_HOST
DNS_NAME                     IP_ADDRESS
esxdell.esx.domain.org       1.1.1.1
esxnode-01.esx.domain.org    1.1.1.1
esxnode-02.esx.domain.org    1.1.1.1
esxnode-03.esx.domain.org    1.1.1.1
esxnode-04.esx.domain.org    1.1.1.1
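Every host sharing one IP_ADDRESS is the telltale sign here. As an illustration only, a small Python sketch that flags such rows, assuming the two columns have been exported into a list of (DNS_NAME, IP_ADDRESS) tuples; the helper name hosts_sharing_ip is made up.

```python
from collections import defaultdict

# Rows as shown in the query result above.
ROWS = [
    ("esxdell.esx.domain.org", "1.1.1.1"),
    ("esxnode-01.esx.domain.org", "1.1.1.1"),
    ("esxnode-02.esx.domain.org", "1.1.1.1"),
    ("esxnode-03.esx.domain.org", "1.1.1.1"),
    ("esxnode-04.esx.domain.org", "1.1.1.1"),
]


def hosts_sharing_ip(rows):
    """Map each IP that is recorded for more than one host to those host names."""
    by_ip = defaultdict(list)
    for dns_name, ip in rows:
        by_ip[ip].append(dns_name)
    return {ip: names for ip, names in by_ip.items() if len(names) > 1}


if __name__ == "__main__":
    for ip, names in hosts_sharing_ip(ROWS).items():
        print(f"{ip} is recorded for {len(names)} hosts: {', '.join(names)}")
```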
When I disconnect and remove any node and then connect it again, it gets the right IP in this table, but after 10-15 seconds it changes to 1.1.1.1.
I searched the text files on the nodes for 1.1.1.1 - nothing.
But it is getting it from somewhere.
Then I changed the default gateway of one vmkernel to an unused IP, so it lost its internet connection.
I removed and re-added the node and... the table had the right IP and it did not change. But after 10-20 seconds the node went to "not responding".
SO:
The node learns this IP by asking a remote host "what is my IP?", and the answer is of course the NAT IP.
So the problem seems clear now, but I still don't know what to do about it.
Any hints?
Storage vMotion seems to work like this: VC -> ESXNODE -> DAS. As I understand it, it does not connect to any other ESX host.
OK, sorry, yeah, Storage vMotion. I remember what the problem is now; I have had this a few times.
Disconnect the host from vCenter (right-click the host and choose Disconnect).
Remove the vmkernel port from the host.
Go to the console of the host and run 'service mgmt-vmware restart'.
Re-add the vmkernel port with the same IP as before.
Right-click the host and click Connect.
That should fix it.
Thanks for the reply.
This host is ESXi and doesn't have a "service mgmt-vmware restart" command.
So re-adding the vmkernel on a disconnected host should fix it?
Yes, but you can still restart management services from the ESX console. You just have to disconnect from vCenter first, then remove vmkernel, then restart services, then reconnect, then add vmkernel.
Then see if that works.
I disconnected and removed the node from VC.
I deleted the vmkernel, added a vmkernel with the same IP, and connected to VC again.
It did not help.
I resolved it.
The problem was the vCenter management IP. It was a "white" (public) IP, and vpxa.cfg on the nodes had this white IP as the VC address.
So the nodes connected to vCenter via the internet: they went out through the gateway, got the NAT IP, and reached VC that way. VC therefore saw one IP for all nodes - the NAT IP.
Summary:
Configure everything with "grey" (private) IPs or everything with white IPs and all will be well.
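The summary rule can be checked with Python's standard ipaddress module. This is only a sketch: 172.16.1.204 comes from the posts above, while 192.168.0.10 and 8.8.8.8 are placeholders standing in for a private and a public vCenter address, and the function names are made up.

```python
import ipaddress


def address_kinds(addresses):
    """Label each address as 'private' (grey) or 'public' (white)."""
    return {
        addr: "private" if ipaddress.ip_address(addr).is_private else "public"
        for addr in addresses
    }


def is_consistent(addresses):
    """True if all addresses are private or all are public."""
    return len(set(address_kinds(addresses).values())) <= 1


if __name__ == "__main__":
    # Host on a private address, vCenter on a public one (placeholder): the broken setup.
    mixed = ["172.16.1.204", "8.8.8.8"]
    # Host and vCenter both on private addresses (placeholder for vCenter).
    all_private = ["172.16.1.204", "192.168.0.10"]
    print(address_kinds(mixed), is_consistent(mixed))
    print(address_kinds(all_private), is_consistent(all_private))
```

A mixed result points at the kind of setup that caused the problem in this thread, where traffic between hosts and vCenter crosses the NAT gateway.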
Thanks, all, for the help.
