We have 6 VM hosts. Four are configured with multiple trunked 1G uplinks for the VM GUEST network and 2x 1G VMKERNEL for management and vMotion. The other two hosts are configured with 1x 10G for the VM GUEST network and 2x 1G VMKERNEL for management and vMotion.
Here lies the problem. vMotion between the four 1G hosts has no issues; however, when we vMotion a guest from any one of those hosts to one of the 10G hosts, the guest doesn't connect to the network for 1 to 10 minutes. It eventually connects, but it takes forever. When we vMotion from a 10G host to a 1G host, the problem does not occur (the guest connects immediately, sometimes without dropping any packets).
All of the 1G hosts connect to the same switch. The 10G hosts have their 2x 1G VMKERNEL network on that same switch, but their 1x 10G card connects directly to our core switch.
I have a feeling that it's the MAC address timeout setting between the two switches.
Has anyone experienced this problem and if so, how did you resolve it?
Also check your load balancing algorithms and beacon probing. Check your network switch port logs too.
I've seen cases where Cisco switches get annoyed, think they detect MAC flapping, and disable learning on the port for 180 seconds.
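If you want to check the switch logs for this quickly, here is a rough Python sketch that scans log lines for MAC-flap notifications and counts them per MAC address. It assumes the IOS-style `%SW_MATM-4-MACFLAP_NOTIF` message; the exact wording can differ by platform and software release, so adjust the pattern to whatever your switch actually logs. The sample lines, MAC addresses, and port names are made up for illustration.

```python
import re
from collections import Counter

# Assumed message format (IOS-style MAC-flap notification). Wording may vary
# by platform and release, so treat this pattern as a starting point.
MACFLAP_RE = re.compile(
    r"%SW_MATM-4-MACFLAP_NOTIF: Host (?P<mac>[0-9a-fA-F.]+) in vlan (?P<vlan>\d+) "
    r"is flapping between port (?P<port_a>\S+) and port (?P<port_b>\S+)"
)


def find_mac_flaps(log_lines):
    """Return one dict (mac, vlan, port_a, port_b) per MAC-flap line found."""
    flaps = []
    for line in log_lines:
        match = MACFLAP_RE.search(line)
        if match:
            flaps.append(match.groupdict())
    return flaps


def flaps_per_mac(flaps):
    """Count flap notifications per MAC address, ignoring letter case."""
    return Counter(flap["mac"].lower() for flap in flaps)


# Hypothetical log excerpt, for illustration only.
SAMPLE_LOG = [
    "Mar  1 10:15:02: %SW_MATM-4-MACFLAP_NOTIF: Host 0050.56aa.bb01 in vlan 20 "
    "is flapping between port Gi1/0/3 and port Te1/1/1",
    "Mar  1 10:15:05: %LINK-3-UPDOWN: Interface GigabitEthernet1/0/7, "
    "changed state to up",
    "Mar  1 10:15:09: %SW_MATM-4-MACFLAP_NOTIF: Host 0050.56AA.BB01 in vlan 20 "
    "is flapping between port Te1/1/1 and port Gi1/0/3",
]

if __name__ == "__main__":
    found = find_mac_flaps(SAMPLE_LOG)
    for mac, count in flaps_per_mac(found).most_common():
        print(mac, count)
```

If the guest's MAC shows up flapping between the access switch port and the core uplink right around the time of the vMotion, that would point at the switches rather than at the hosts.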
To help troubleshoot, see if you can turn off beacon probing and configure the port groups for an active/failover configuration and see what happens.
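To compare "what happens" before and after a change like that, it helps to put a number on the outage. Below is a minimal Python sketch, assuming you already have a time-ordered list of ping results for the guest as (timestamp in seconds, reply received) pairs. How you collect those samples is up to you; the function names and the sample data are hypothetical.

```python
def outage_windows(samples):
    """Find outage windows in time-ordered (timestamp, reply_received) samples.

    Each window is (start, end): start is the time of the first lost probe,
    end is the time of the first reply after it. An outage that is still
    open when the samples run out is reported with end=None.
    """
    windows = []
    start = None
    for timestamp, reply_received in samples:
        if not reply_received and start is None:
            start = timestamp  # first lost probe opens a window
        elif reply_received and start is not None:
            windows.append((start, timestamp))  # first reply closes it
            start = None
    if start is not None:
        windows.append((start, None))
    return windows


def longest_outage(samples):
    """Return the longest closed outage in seconds, or 0 if there is none."""
    durations = [
        end - start
        for start, end in outage_windows(samples)
        if end is not None
    ]
    return max(durations, default=0)


# Made-up samples: one probe per second, two short outages.
SAMPLE_PINGS = [
    (0, True), (1, True), (2, False), (3, False), (4, False),
    (5, True), (6, True), (7, False), (8, True),
]

if __name__ == "__main__":
    print(outage_windows(SAMPLE_PINGS))
    print(longest_outage(SAMPLE_PINGS))
```

Running the same vMotion with beacon probing on and then off, and comparing the longest outage each time, gives you something more concrete than "it takes forever".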