VMware Cloud Community
Eternal_Snow
Contributor

vMotion failed at 10%. Not so simple.

Hi.

Here is my environment:

vCenter: 4.1.0.345043

Hypervisor: 4 ESXi servers, 4.1.0.348481

The vMotion VMkernel ports of all servers are connected to a separate physical switch. No VLAN is set. These 4 servers are the only devices connected to this switch.

The vMotion VMkernel ports are managed by a vNetwork Distributed Switch.

The IP addresses of the vMotion VMkernel ports are A: 192.168.7.101/24, B: 192.168.7.102/24, C: 192.168.7.103/24 and D: 192.168.7.104/24. This subnet is not the same as the one that carries management traffic. The default gateway cannot be reached from this switch.

All settings on these servers are the same. They are compliant with one host profile.

I can:

Migrate (vMotion) VM between server A and server B;

Migrate (vMotion) VM between server C and server D;

I attached my notebook to the vMotion switch and set its IP to 192.168.7.200/24. From there I can ping all 4 servers at the IP addresses listed above.

I cannot:

Migrate (vMotion) VM from server A or B to server C or D;

Migrate (vMotion) VM from server C or D to server A or B.
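
One way to read the results above is to group the hosts into sets that can migrate among themselves. The sketch below is only an illustration of that reasoning: the host names and the working pairs come from this post, while the function name `migration_groups` is made up for the example.

```python
# Group hosts into "islands" that can vMotion among themselves,
# using the results reported in the post (A<->B and C<->D work).

def migration_groups(hosts, working_pairs):
    """Return sorted groups of hosts connected by working vMotion pairs."""
    parent = {h: h for h in hosts}

    def find(h):
        # Follow parent links up to the representative of h's group.
        while parent[h] != h:
            parent[h] = parent[parent[h]]  # path halving
            h = parent[h]
        return h

    for a, b in working_pairs:
        # Merge the two groups joined by a working migration.
        parent[find(a)] = find(b)

    groups = {}
    for h in hosts:
        groups.setdefault(find(h), []).append(h)
    return sorted(sorted(g) for g in groups.values())


hosts = ["A", "B", "C", "D"]
working = [("A", "B"), ("C", "D")]  # every A/B <-> C/D attempt fails
print(migration_groups(hosts, working))  # [['A', 'B'], ['C', 'D']]
```

Two separate groups on one shared switch suggest looking for something that differs between the {A, B} pair and the {C, D} pair, rather than at a single faulty host or cable.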

Thanks, and waiting for your advice.

Update:

Every server gets vmkping responses from all the other servers. The network seems to be OK, but the vMotion problem still exists.

I've tried deleting the Distributed Switch and re-creating an original one to carry vMotion; still no luck.

I've replaced the switch hardware; the problem still exists.

VMware EVC Mode: Intel Xeon Core i7

Hosts: Dell PowerEdge R710 (2 x E5504, 72 GB), all identical

16 Replies
idle-jam
Immortal

What if you connect A and C back to back with a network cable and see whether they can vMotion? Also make sure the subnet and IP range are the same. This should work fine, and from there you can isolate whether it's the switch, the network cable, the network settings, etc.
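
The subnet check can also be done offline. This is a minimal sketch using Python's standard `ipaddress` module. The addresses are the ones given in the original post; the function names are made up for the example.

```python
import ipaddress

# vMotion VMkernel addresses from the original post, plus the test notebook.
interfaces = {
    "A": "192.168.7.101/24",
    "B": "192.168.7.102/24",
    "C": "192.168.7.103/24",
    "D": "192.168.7.104/24",
    "notebook": "192.168.7.200/24",
}


def networks_by_host(interfaces):
    """Map each name to the network its address/prefix belongs to."""
    return {
        name: ipaddress.ip_interface(addr).network
        for name, addr in interfaces.items()
    }


def same_subnet(interfaces):
    """True if every interface falls into one and the same network."""
    return len(set(networks_by_host(interfaces).values())) == 1


print(same_subnet(interfaces))  # True
```

A pass here only rules out an addressing or netmask mismatch. It says nothing about whether the vMotion service itself works between the hosts.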

ats0401
Enthusiast

Are all four servers seeing the exact same LUNs/datastores?

Do you have the vMA appliance set up?

If not, you can enable remote Tech Support Mode on your ESXi box, SSH to it, and run the vmkping command from servers A and B to servers C and D.

Enabling remote Tech Support Mode on ESXi:

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=101791...
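
To make sure no direction is missed, the vmkping checks can be listed out for every ordered pair of hosts. The sketch below only prints the commands to type on each host; it does not run anything on ESXi. The IP addresses are the vMotion addresses from the original post, and `vmkping_plan` is a name made up for the example.

```python
from itertools import permutations

# vMotion VMkernel addresses from the original post.
vmotion_ips = {
    "A": "192.168.7.101",
    "B": "192.168.7.102",
    "C": "192.168.7.103",
    "D": "192.168.7.104",
}


def vmkping_plan(ips):
    """Return (source_host, command) for every ordered pair of distinct hosts."""
    return [(src, f"vmkping {ips[dst]}") for src, dst in permutations(ips, 2)]


for src, cmd in vmkping_plan(vmotion_ips):
    print(f"on host {src}: {cmd}")
```

With four hosts this gives 12 checks, so both A to C and C to A are covered.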

Eternal_Snow
Contributor

All VMs can be powered on on any server. The datastores work fine.

I don't know about vMA.

I found that it's impossible to log in over SSH, either as root or with any other account.

I cannot go into the server room because it's night here.

I'll check it from the console tomorrow. Thanks.

a_p_
Leadership

One reason for not being able to log in to the host as root might be that Lockdown Mode is enabled. You may need to temporarily disable Lockdown Mode on the hosts.

For troubleshooting the issue, you may also want to take a look at the VMware KB article "Diagnosing VMware vMotion failure at 10%".

André

Eternal_Snow
Contributor

Test result: every server gets vmkping responses from all the other servers. The network seems to be OK, but the vMotion problem still exists.

I've tried deleting the Distributed Switch and re-creating an original one to carry vMotion; still no luck.

I've replaced the switch hardware; the problem still exists.

mcowger
Immortal

Have you checked the vmware.log file for an affected VM?  Post the log file - it will contain a ton more information as to the cause.

--Matt VCDX #52 blog.cowger.us
Eternal_Snow
Contributor

I found nothing relevant in the log.

All log entries within this time period:

Jun 03 03:31:31.929: vmx| TOOLS received request in VMX to set option 'synctime' -> '1'
Jun 03 03:31:31.947: vmx| VMXVmdb_LoadRawConfig: Loading raw config
Jun 03 03:31:31.958: vmx| VMXVmdb_LoadRawConfig: Loading raw config
Jun 03 03:31:31.974: vmx| VMXVmdb_LoadRawConfig: Loading raw config
Jun 03 03:32:32.054: vmx| GuestRpcSendTimedOut: message to toolbox-dnd timed out.
Jun 03 03:35:01.902: vmx| TOOLS received request in VMX to set option 'synctime' -> '1'
Jun 03 03:35:01.906: vmx| VMXVmdb_LoadRawConfig: Loading raw config
Jun 03 03:35:01.917: vmx| VMXVmdb_LoadRawConfig: Loading raw config
Jun 03 03:35:01.932: vmx| VMXVmdb_LoadRawConfig: Loading raw config
Jun 03 03:36:02.011: vmx| GuestRpcSendTimedOut: message to toolbox-dnd timed out.
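
For a longer vmware.log, the search for migration-related entries can be scripted. This is a rough sketch that assumes every line has the `timestamp: source| message` shape seen in the excerpt above. The keyword list is a guess for illustration only, not an official list of vMotion log messages.

```python
import re

# Matches lines shaped like: "Jun 03 03:31:31.929: vmx| some message"
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3} \d{2} \d{2}:\d{2}:\d{2}\.\d{3}): "
    r"(?P<source>[^|]+)\| (?P<message>.*)$"
)

# Assumed keywords; adjust to whatever your logs actually contain.
KEYWORDS = ("migrat", "vmotion")


def parse_line(line):
    """Split one log line into stamp/source/message, or return None."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def migration_entries(lines, keywords=KEYWORDS):
    """Return parsed entries whose message mentions any keyword."""
    hits = []
    for line in lines:
        entry = parse_line(line)
        if entry and any(k in entry["message"].lower() for k in keywords):
            hits.append(entry)
    return hits


# Two lines copied from the excerpt above.
sample = [
    "Jun 03 03:31:31.929: vmx| TOOLS received request in VMX to set option 'synctime' -> '1'",
    "Jun 03 03:32:32.054: vmx| GuestRpcSendTimedOut: message to toolbox-dnd timed out.",
]
print(migration_entries(sample))  # [] : nothing migration-related here
```

An empty result for the failure window, as here, suggests the relevant messages are in a different log than the VM's own vmware.log.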
mcowger
Immortal

Can you post /var/log/messages and /var/log/vmkernel?

Also, try disabling vMotion on the vmknic, then re-enabling it.

Eternal_Snow
Contributor

Hi, Matt.

How do I get the files you mentioned?

I cannot log in over SSH with any account. Access is denied.

mcowger
Immortal

You can get them a couple of ways.

You can either get them from the host directly if you enable SSH/TSM,

or you can use vSphere's 'export diagnostic logs': http://pubs.vmware.com/vsp40/admin/wwhelp/wwhimpl/common/html/wwhelp.htm#href=t_export_diagnostic_da...

Eternal_Snow
Contributor

I cannot find the files you specified in the package generated by the vSphere Client.

In var/log there are many .gz files and folders, but none of them looks like the ones you specified.

Eternal_Snow
Contributor

Here is the log from my server 1.

Migration of the VM "KMS Server" failed.

ats0401
Enthusiast

Can each host ping and lookup all the hosts by DNS hostname?
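
A quick lookup check could be scripted along these lines. This is only a sketch: it checks name resolution from whatever machine runs it, so it matches the hosts' view only if that machine uses the same DNS servers. The function names are made up for the example.

```python
import socket


def resolve_all(hostnames, resolver=socket.gethostbyname):
    """Map each hostname to its resolved IPv4 address, or None on failure."""
    results = {}
    for name in hostnames:
        try:
            results[name] = resolver(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            results[name] = None
    return results


def unresolved(results):
    """Return the names that did not resolve, sorted."""
    return sorted(name for name, addr in results.items() if addr is None)
```

Call `resolve_all` with the real names of your four ESXi hosts and pass the result to `unresolved`. An empty list means every name resolved.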

Eternal_Snow
Contributor

Yes, DNS works great.

krishna_v78
Enthusiast

Hi,

You can probably check for vMotion errors under the Maps tab.

Regards,

Balu

Eternal_Snow
Contributor

OK. I finally fixed it,

by reinstalling servers C and D.

It's a long story. I'm writing it down here so that others can avoid falling into the same trap.

In the beginning, I wanted to build a vSphere system with only IPv6 addresses, without IPv4. I installed a vCenter system and ESXi on servers C and D, with no IPv4 addresses entered. But I found that this is impossible because HA/FT does not support IPv6 yet. I had to add an IPv4 address to the VMkernel port that carries management traffic in order to enable HA.

Then I installed ESXi on servers A and B. Both IPv4 and IPv6 were specified at the beginning of the configuration.

The two pairs of servers look the same in the vSphere Client, but they are not.

Now, after reinstalling servers C and D with both IPv4 and IPv6, vMotion works again.

Thanks all.
