VMware Cloud Community
mhost
Enthusiast

vMotion with crossover cable

Hello,

I have a lab setup with two ESXi hosts, recently upgraded to 5.1.

Until now, both hosts have used a single physical NIC (vmnic2) for management, vMotion and VM traffic, with management and vMotion running on a single VMkernel port (192.168.1.0/24).

Now I have added an extra NIC (vmnic0), which I plan to use for vMotion and replication.

The extra NIC has been added to a new vSwitch, where I have created a VMkernel port with vMotion and replication enabled.

The physical NICs are connected directly with a crossover cable, and the VMkernel ports have been assigned addresses from an otherwise unused IP range (192.168.4.0/24).

Connection is up at 1Gbps/full duplex and I can vmkping between the new VMkernel ports.

Udklip.JPG

I have then disabled vMotion and Replication on the old VMkernel and restarted both hosts.

However, when I try to vMotion or start replication, the traffic still goes over the old vmnic2.

What am I missing here?

Best regards

Martin Holst

12 Replies
schepp
Leadership

Hi,

I guess you forgot to untick the vMotion checkbox in the VMkernel port settings on the old vSwitch, where vmnic2 is connected.

Regards

mhost
Enthusiast

Hi Tim,

That was my first guess as well. But as stated in the post:

"I have then disabled vMotion and Replication on the old VMkernel and restarted both hosts."

But thanks for the suggestion.

When I initiate a vMotion, validation takes a long time.

So I suspect that some kind of check fails and it then falls back to the other NIC?

Best regards

Martin

chriswahl
Virtuoso

From the ESXi shell, verify that the hosts can reach each other by running vmkping against the other host's vMotion IP.

VCDX #104 (DCV, NV) ஃ WahlNetwork.com ஃ @ChrisWahl ஃ Author, Networking for VMware Administrators
mhost
Enthusiast

Hello Chris,

As mentioned in the original post, this has already been tried:
"Connection is up at 1Gbps/full duplex and I can vmkping between the new VMkernel ports."

But thank you for the suggestion.

Best regards

Martin

chriswahl
Virtuoso

My bad.

Have you tried adjusting the MTU back to 1500?

mhost
Enthusiast

Hi Chris,

Actually, I was unsure how much I had fiddled with the settings :)

So I have just tried recreating the vSwitches from scratch, with default MTU settings - but without luck.

The strange thing is that on the first migration, it takes forever to validate the destination host once it has been selected.

It seems that vSphere runs some kind of check and then discards the designated vMotion network?

So I am wondering if I have missed a basic networking prerequisite?

As far as I know there should be no routing or DNS involved on the vMotion network?

I just looked at the vMotion Performance/Best Practice paper, and they use a crossover network for vMotion in the lab.

Best regards

Martin

chriswahl
Virtuoso

Check the host logs to see if you find errors related to a vMotion. I'd check vpxagent and hostd. Also, check the output of esxcfg-route -l to ensure that it lists the new VMkernel interface as the interface for your vMotion subnet.

mhost
Enthusiast

Output from esxcfg-route -l is as follows:

Network          Netmask          Gateway          Interface
192.168.1.0      255.255.255.0    Local Subnet     vmk0
192.168.4.0      255.255.255.0    Local Subnet     vmk1
default          0.0.0.0          192.168.1.1      vmk0

It seems that the new VMkernel port (vmk1) is correctly bound to the vMotion subnet.
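
For anyone following along, here is a minimal Python sketch of how a routing table like the one above maps a destination address to a VMkernel interface, using a longest-prefix match. The peer addresses 192.168.4.2 and 192.168.1.50 are hypothetical (the thread never gives the hosts' actual IPs), and the function names are made up for illustration. It only shows which vmk would carry traffic to a given IP; it says nothing about which destination IP a migration ends up targeting.

```python
import ipaddress

# Output of esxcfg-route -l as posted above.
ROUTE_OUTPUT = """\
Network          Netmask          Gateway          Interface
192.168.1.0      255.255.255.0    Local Subnet     vmk0
192.168.4.0      255.255.255.0    Local Subnet     vmk1
default          0.0.0.0          192.168.1.1      vmk0
"""


def parse_routes(text):
    """Turn the table into a list of (network, gateway, interface) tuples."""
    routes = []
    for line in text.strip().splitlines()[1:]:  # skip the header row
        parts = line.split()
        network, netmask, interface = parts[0], parts[1], parts[-1]
        gateway = " ".join(parts[2:-1])  # "Local Subnet" contains a space
        if network == "default":
            net = ipaddress.ip_network("0.0.0.0/0")
        else:
            net = ipaddress.ip_network(f"{network}/{netmask}")
        routes.append((net, gateway, interface))
    return routes


def interface_for(dest, routes):
    """Return the interface of the most specific route containing dest."""
    addr = ipaddress.ip_address(dest)
    matches = [r for r in routes if addr in r[0]]
    return max(matches, key=lambda r: r[0].prefixlen)[2]


routes = parse_routes(ROUTE_OUTPUT)
print(interface_for("192.168.4.2", routes))   # hypothetical vMotion peer
print(interface_for("192.168.1.50", routes))  # hypothetical management peer
```

With this table, anything addressed to 192.168.4.0/24 leaves via vmk1, and everything else (including the default route) leaves via vmk0. So if a migration targets the other host's 192.168.1.x address, it would go out over vmk0 regardless of how vmk1 is configured.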

I've tried looking through the logs, searching for "vmk1", "vmnic0", "vmotion" and "XP_SP3" (the VM used for test-vMotion).

But I haven't found anything useful.

Any ideas of something more specific to search for?

Thanks for your help so far.

Martin

mhost
Enthusiast

Just tried a bunch of things to rule out faulty NICs and cables:

  • Switched NICs around between vSwitches and verified that the vmnic0s worked OK for management and VM-traffic.
  • Also tried using vmnic2 for the vMotion/Replication vSwitch, but it made no difference.
  • Tried inserting a switch (instead of the crossover cable), changed cables and switched speed/duplex on the NICs.

No matter what, I can always vmkping between the VMkernel NICs (vmk1) dedicated to vMotion.

- But vMotion still goes through the management vmkernel NICs (vmk0) instead.

I tried a capture with tcpdump-uw on vmk1 when doing vMotion - no traffic whatsoever.

(When doing vmkping, traffic is captured nicely)

/Martin

kurtd
Enthusiast

I'm having a similar issue. We were on ESXi 4.1, and vMotion with crossover cables was working fine. After upgrading to 5.1, vMotion no longer works and fails at 14% every time.

The vMotion migrations failed because the ESX hosts were not able to connect over the vMotion network. Check the vMotion network settings and physical network configuration.
vMotion migration [168430091:1350944171434890] failed to create a connection with remote host <10.10.10.10>: The ESX hosts failed to connect over the VMotion network
Migration [168430091:1350944171434890] failed to connect to remote host <10.10.10.10> from host <10.10.10.11>: Timeout
The vMotion failed because the destination host did not receive data from the source host on the vMotion network. Please check your vMotion network settings and physical network configuration and ensure they are correct.
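
A side note on the migration ID in the log above: the first number, 168430091, equals the 32-bit integer form of 10.10.10.11 (the source host named in the same log), and the second looks like a Unix timestamp in microseconds. That reading is an assumption based on these values, not something stated in the thread. A small Python sketch to decode it (the function name is hypothetical):

```python
import ipaddress
from datetime import datetime, timedelta, timezone


def decode_migration_id(migration_id):
    """Split 'IP_INT:TIMESTAMP_US' into an IPv4 string and a UTC datetime.

    Assumes (not confirmed in the thread) that the first field is an IPv4
    address stored as a 32-bit integer and the second field is the number
    of microseconds since the Unix epoch.
    """
    ip_field, ts_field = migration_id.split(":")
    # Mask to 32 bits so a negative (signed) value would also decode.
    ip = ipaddress.IPv4Address(int(ip_field) & 0xFFFFFFFF)
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    started = epoch + timedelta(microseconds=int(ts_field))
    return str(ip), started


ip, started = decode_migration_id("168430091:1350944171434890")
print(ip)                   # 10.10.10.11
print(started.isoformat())  # 2012-10-22T22:16:11.434890+00:00
```

If that reading holds, the decoded address is a quick way to see which VMkernel IP a migration actually started from.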

If I remove my vMotion vSwitch and recreate it, vMotion will work again. But then when I put one host into maintenance mode and bring it back out, vMotion fails again.

I do notice that the VMkernel gateway is set the same as the gateway for our production network and I don't see a way to change it. Could that be the issue?

kurtd
Enthusiast

I had two active vMotion adapters. I moved vmnic3 to standby on both servers and vMotion started working again. Why was it working fine on 4.1 with two active adapters but not on 5.1?

10-23-2012 2-25-44 PM.jpg

chriswahl
Virtuoso

I do notice that the VMkernel gateway is set the same as the gateway for our production network and I don't see a way to change it. Could that be the issue?

Each host can only have one default gateway.
