VMware Cloud Community
Rocl_LI
Contributor

VM network issue after migration

Hello VMware, hello world!!

I used VC to migrate a VM between two ESX 3.0.1 hosts and found that if I ping the VM during the migration, the pings start timing out at about 94% of the migration and, in one case, kept timing out until a minute after VMotion completed.

After some digging I found that the cause is in the LAN switch forwarding: the VM's Ethernet address stays the same before and after migration, and because no physical link goes down, the switch CAM table is not updated, so my external pings to the VM are lost in the process. But if I keep the VM busy with network activity (e.g. ping -t to a physical host) during the migration, I only get one dropped packet.
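To illustrate what I think is happening, here is a toy model of a learning switch. It is only a sketch: the class, the MAC address, the port numbers and the timings are all made up for illustration, and a real switch is more involved. It shows that a quiet VM stays mapped to its old port until it either sends a frame or its entry ages out.

```python
class LearningSwitch:
    """Toy model of a switch CAM table (MAC -> port) with aging."""

    def __init__(self, aging_seconds=300):
        self.aging_seconds = aging_seconds
        self.cam = {}  # mac -> (port, time the entry was last refreshed)

    def learn(self, src_mac, port, now):
        # A switch learns only from the source address of frames it receives.
        self.cam[src_mac] = (port, now)

    def lookup(self, dst_mac, now):
        # Return the learned port, or None if unknown/expired (frame is flooded).
        entry = self.cam.get(dst_mac)
        if entry is None or now - entry[1] >= self.aging_seconds:
            self.cam.pop(dst_mac, None)
            return None
        return entry[0]


VM_MAC = "00:50:56:00:00:01"  # hypothetical VM MAC address

switch = LearningSwitch(aging_seconds=300)
switch.learn(VM_MAC, port=1, now=0)    # VM runs on host A behind port 1

# The VM moves to host B (port 2) at t=10 but sends no frames afterwards.
stale = switch.lookup(VM_MAC, now=20)  # still port 1: pings go to the old host

# Case 1: the VM sends any frame, so the switch relearns the MAC on port 2.
switch.learn(VM_MAC, port=2, now=30)
fresh = switch.lookup(VM_MAC, now=31)

# Case 2: a VM that stays quiet is only reachable again once the entry expires.
quiet = LearningSwitch(aging_seconds=300)
quiet.learn(VM_MAC, port=1, now=0)
aged_out = quiet.lookup(VM_MAC, now=300)  # None: frame is flooded to all ports

print(stale, fresh, aged_out)
```

In this model the busy VM (case 1) recovers as soon as it transmits, which matches the single dropped packet I saw, while the quiet VM (case 2) can stay unreachable for up to the full aging time.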

Has anyone had a similar issue? Is there any config/setting on either the VM or the networking side that can wake up this network-quiet VM after migration? (I use e1000 quad-port NICs, but see no TX hang in the vmkernel log; I have separated the VMotion and VM networks, applied all related patches, and am not aware of any error/fault.)

Question to VMware: Could VMware Tools be (or has it already been) coded to smooth this issue out?

5 Replies
oreeh
Immortal

Check your switch MAC / ARP caching timeouts and adjust them accordingly.

Many switches use far too high defaults.

Rocl_LI
Contributor

Most MAC aging times are set to 5 mins. Not sure about the ARP table, any suggestion for both?

oreeh
Immortal

Some switch manufacturers call it MAC aging, some call it ARP caching...

any suggestion for both?

Setting it too high cuts off network access after vMotion until the entry expires; setting it too low increases network traffic (ARP requests).

5 mins is definitely too high. Set it to 1 min at most or (even better) lower.

I prefer 30 seconds - but this depends on your environment.

Rocl_LI
Contributor

Hi oreeh, thanks for the reply,

The Cisco default ARP table timeout is 4 hours (IP <-> MAC),

The CAM table aging timeout is 5 mins (MAC <-> switch port).

My problem seems to be related only to CAM, not to ARP.

Changing the ARP cache time in the problem VM could possibly help, but as you mentioned it would cause too much ARP flooding later. I certainly would not change it on the network router that keeps the ARP table, as that would make the network much noisier.

Do you think the migration could be scripted to trigger an ARP announcement, a few ICMP packets, or something similar after migration, to avoid having to change a network device configuration that is working for all other systems?
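To make the "ARP announcement" idea concrete, this is roughly the frame I have in mind: a gratuitous ARP request broadcast from inside the guest. It is an untested sketch, not something I know VMotion can trigger. The MAC, IP, interface name and both function names are assumptions for illustration, and the send helper only works on Linux with root privileges.

```python
import socket
import struct


def build_gratuitous_arp(mac: str, ip: str) -> bytes:
    """Build a broadcast Ethernet frame carrying a gratuitous ARP request."""
    mac_bytes = bytes.fromhex(mac.replace(":", ""))
    ip_bytes = socket.inet_aton(ip)
    broadcast = b"\xff" * 6

    # Ethernet header: destination, source, EtherType 0x0806 (ARP).
    eth_header = broadcast + mac_bytes + struct.pack("!H", 0x0806)

    # ARP header: Ethernet (1), IPv4 (0x0800), hlen 6, plen 4, opcode 1 (request).
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    arp += mac_bytes + ip_bytes      # sender hardware / protocol address
    arp += b"\x00" * 6 + ip_bytes    # target: sender's own IP marks it gratuitous
    return eth_header + arp


def send_frame(interface: str, frame: bytes) -> None:
    """Send a raw frame (Linux only, needs root). Not called in this sketch."""
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW) as sock:
        sock.bind((interface, 0))
        sock.send(frame)


# Hypothetical VM addresses, for illustration only.
frame = build_gratuitous_arp("00:50:56:00:00:01", "192.0.2.10")
print(len(frame), frame.hex())
# To actually send it from a Linux guest: send_frame("eth0", frame)
```

Because the frame is a broadcast with the VM's MAC as its source, every switch on the VLAN that sees it should relearn which port the VM sits behind, which is the same effect I get from keeping the VM busy with ping -t.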

oreeh
Immortal

Do you think the migration could be scripted to trigger an ARP announcement, a few ICMP packets, or something similar after migration, to avoid having to change a network device configuration that is working for all other systems?

AFAIK you can't.

Since your switch has separate tables for MAC-to-port relations (CAM), try lowering those values.

If you are lucky, you can adjust this on a per-port basis.
