VMware Cloud Community
olor
Contributor

Storage vMotion doesn't work

Hello,

I have a cluster of ESX hosts (3x ESX 3.5 U4 and 1x ESX 4.0) connected by fiber to 2 storage arrays and managed via vCenter.

I have a VM on the ESX 4.0 host. When I try to use Storage vMotion to relocate the powered-on VM (with VMware Tools installed) to another LUN, I get an error:

"Cannot connect to host". Very informative...

I am trying to relocate the VM to another LUN only, not to another host (standard vMotion works fine).

Google does not help me at all.

Any hints?

12 Replies
admin
Immortal

I'd start by checking with these two KB articles:

vCenter displays error: Failed to connect to host (1010837)

Migrating/Clone fails from a local disk to a NFS share (1004586)

Rick Blythe

Social Media Specialist

VMware Inc.

depping
Leadership

Check the log files on the host; they might give more info.

Duncan

VMware Communities User Moderator | VCP | VCDX


RParker
Immortal

Google does not help me at all.

Go to the console of one of those hosts and type vmkping -servername- (where -servername- is the actual IP or hostname of your OTHER ESX host). What is the response?

Check the network configuration on BOTH hosts: the VMkernel port should have vMotion enabled and an IP address assigned. Make SURE that BOTH hosts can reach each other AND each other's VMkernel IP address.

Then we can go from there.

olor
Contributor

Thank you for your reply.

I checked this KB and it seems to describe my problem. After step 3 I have the right IP in vpxa.cfg.

But this did not help at all: I still get the "Cannot connect to host" error.

olor
Contributor

"BOTH hosts" - which one is the second host?

Storage vMotion seems to work like this: VC -> ESXNODE -> DAS. As I understand it, it does not connect to any other ESX host.

BTW: vmkping works, all pings are OK.

olor
Contributor

I checked the logs and found the following.

I replaced the wrong IP with "1.1.1.1".

Cannot connect to server 1.1.1.1:902: Connection refused
[2009-08-14 06:18:10.129 0xf7bd5b90 warning 'Libs'] NfcNewAuthdConnectionEx: Failed to connect to peer (numRetries=0). Error: Failed to connect to server 1.1.1.1:902
Cannot connect to server 1.1.1.1:902: Connection refused
[2009-08-14 06:18:10.130 0xf7bd5b90 warning 'Libs'] NfcNewAuthdConnectionEx: Failed to connect to peer (numRetries=1). Error: Failed to connect to server 1.1.1.1:902
Cannot connect to server 1.1.1.1:902: Connection refused
[2009-08-14 06:18:10.130 0xf7bd5b90 warning 'Libs'] NfcNewAuthdConnectionEx: Failed to connect to peer (numRetries=2). Error: Failed to connect to server 1.1.1.1:902
NfcNewAuthdConnectionEx: Failed to connect to peer. Error: Failed to connect to server 1.1.1.1:902
Unable to connect to NFC server: Failed to connect to server 1.1.1.1:902

BUT:

# grep -i 'hostip' /etc/opt/vmware/vpxa/vpxa.cfg

<hostIp>172.16.1.204</hostIp>

#

In our infrastructure, 1.1.1.1 is the NAT IP address of the gateway for all ESX hosts and VMs.

So here is what we have:

1) After fixing the IP address in vpxa.cfg, the system still tries to connect to the wrong IP.

2) For unknown reasons the "wrong IP" is the NAT IP of the gateway. That seems very strange to me.
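
The mismatch described in this post can be checked mechanically. Below is a minimal Python sketch, assuming the log and vpxa.cfg text look like the excerpts quoted in this post. The function names (nfc_targets, configured_host_ip, mismatched_targets) are hypothetical helpers made up for illustration; they are not part of any VMware tool.

```python
import re

# Matches the "Failed to connect to server <ip>:<port>" fragments from the log excerpt.
_TARGET_RE = re.compile(r"Failed to connect to server (\d{1,3}(?:\.\d{1,3}){3}):(\d+)")
# Matches the <hostIp> element as shown in the grep output from vpxa.cfg.
_HOSTIP_RE = re.compile(r"<hostIp>\s*([^<\s]+)\s*</hostIp>")


def nfc_targets(log_text):
    """Return the set of (ip, port) pairs the log reports as unreachable."""
    return {(ip, int(port)) for ip, port in _TARGET_RE.findall(log_text)}


def configured_host_ip(cfg_text):
    """Return the <hostIp> value from vpxa.cfg text, or None if it is absent."""
    match = _HOSTIP_RE.search(cfg_text)
    return match.group(1) if match else None


def mismatched_targets(log_text, cfg_text):
    """Return the targets whose IP differs from the configured host IP.

    If no <hostIp> is found, every target counts as a mismatch.
    """
    host_ip = configured_host_ip(cfg_text)
    return sorted(t for t in nfc_targets(log_text) if t[0] != host_ip)


if __name__ == "__main__":
    # Sample text copied from the excerpts in this post.
    sample_log = (
        "Cannot connect to server 1.1.1.1:902: Connection refused\n"
        "NfcNewAuthdConnectionEx: Failed to connect to peer (numRetries=0). "
        "Error: Failed to connect to server 1.1.1.1:902\n"
    )
    sample_cfg = "<hostIp>172.16.1.204</hostIp>"
    print(mismatched_targets(sample_log, sample_cfg))
```

A non-empty result means the connection attempt targets an address other than the one in vpxa.cfg, which is the symptom described here.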

olor
Contributor

I found the cause of the problem.

I found this in the vCenter DB:

select * from dbo.VPX_HOST

DNS_NAME                     IP_ADDRESS
esxdell.esx.domain.org       1.1.1.1
esxnode-01.esx.domain.org    1.1.1.1
esxnode-02.esx.domain.org    1.1.1.1
esxnode-03.esx.domain.org    1.1.1.1
esxnode-04.esx.domain.org    1.1.1.1

When I disconnect and remove any node and then connect it again, it gets the right IP in this table, but after 10-15 seconds it changes back to 1.1.1.1.

I searched the text files on the nodes for 1.1.1.1 and found nothing.

But it is getting it from somewhere.

Then I changed the default gateway of the VMkernel on one node to an unused IP, so it lost its internet connection.

I removed and re-added the node and... the table had the right IP and it did not change. But after 10-20 seconds the node went to "not responding".

SO:

The node gets this IP by asking a remote host "what is my IP?", and the answer is of course the NAT IP.

So the problem seems clear now, but I still don't know what to do about it.

Any hints?
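
The symptom in the VPX_HOST output above (every host recorded with the same IP) is easy to detect once the rows are available. Here is a minimal Python sketch, assuming the rows have already been fetched as (DNS_NAME, IP_ADDRESS) pairs; how they are fetched from the vCenter database is left out, and hosts_sharing_ip is a hypothetical helper name.

```python
from collections import defaultdict


def hosts_sharing_ip(rows):
    """Group host names by IP and keep only IPs recorded for more than one host."""
    by_ip = defaultdict(list)
    for dns_name, ip_address in rows:
        by_ip[ip_address].append(dns_name)
    return {ip: names for ip, names in by_ip.items() if len(names) > 1}


if __name__ == "__main__":
    # Rows copied from the query output in this post.
    rows = [
        ("esxdell.esx.domain.org", "1.1.1.1"),
        ("esxnode-01.esx.domain.org", "1.1.1.1"),
        ("esxnode-02.esx.domain.org", "1.1.1.1"),
        ("esxnode-03.esx.domain.org", "1.1.1.1"),
        ("esxnode-04.esx.domain.org", "1.1.1.1"),
    ]
    for ip, names in hosts_sharing_ip(rows).items():
        print(f"{ip} is recorded for {len(names)} hosts: {', '.join(names)}")
```

Any IP returned by this check is shared by several hosts, which suggests that vCenter sees them through a translated address rather than their own.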

RParker
Immortal

"Storage vMotion seems to work like this: VC -> ESXNODE -> DAS. As I understand it, it does not connect to any other ESX host."

OK, sorry, yeah, Storage vMotion. I remember what the problem is now. I have had this a few times.

Disconnect the host from vCenter (right-click the host and choose Disconnect).

Remove the VMkernel port from the host.

Go to the console of the host and run 'service mgmt-vmware restart'.

Re-add the VMkernel port with the same IP as before.

Right-click the host and click Connect.

That should fix it.

olor
Contributor

Thanks for the reply.

This host is ESXi and doesn't have the "service mgmt-vmware restart" command.

So re-adding the VMkernel port on the disconnected host should fix it?

RParker
Immortal

"So re-adding the VMkernel port on the disconnected host should fix it?"

Yes, but you can still restart management services from the ESX console. You just have to disconnect from vCenter first, then remove the VMkernel port, then restart services, then reconnect, then add the VMkernel port.

Then see if that works.

olor
Contributor

I disconnected and removed the node from VC.

I deleted the VMkernel port, added it back with the same IP, and connected the node to VC again.

It did not help.

olor
Contributor

I resolved it.

The problem was the vCenter management IP. It was a "white" (public) IP, and in vpxa.cfg on the nodes the VC IP was this white IP.

So the nodes connected to vCenter via the internet: they went out through the gateway, got the NAT IP, and then reached VC, so VC saw a single IP for all nodes, the NAT IP.

Summary:

Configure everything with "grey" (private) IPs or everything with "white" (public) IPs, and all will be good.
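
The rule in this summary can be sketched as a quick check with Python's standard ipaddress module. This is a minimal sketch under stated assumptions: address_kinds and is_mixed are hypothetical helper names, and 8.8.8.8 is only a stand-in for a public vCenter address, since the real one is not given in this thread.

```python
import ipaddress


def address_kinds(addresses):
    """Map each address to 'private' or 'public' using the stdlib classification.

    Note: is_private also covers other non-routable ranges (loopback,
    link-local, and so on), not only the RFC 1918 blocks.
    """
    kinds = {}
    for addr in addresses:
        ip = ipaddress.ip_address(addr)
        kinds[addr] = "private" if ip.is_private else "public"
    return kinds


def is_mixed(addresses):
    """Return True when the list contains both private and public addresses."""
    return len(set(address_kinds(addresses).values())) > 1


if __name__ == "__main__":
    host_ip = "172.16.1.204"   # host IP from vpxa.cfg, as quoted earlier in the thread
    vcenter_ip = "8.8.8.8"     # hypothetical placeholder for a public vCenter IP
    if is_mixed([host_ip, vcenter_ip]):
        print("Private and public addresses are mixed; traffic may pass through NAT.")
```

A mixed result does not prove that NAT is involved; it only flags the combination that caused the trouble here.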

Thanks, everyone, for the help.
