VMware Cloud Community
Roodie
Contributor

vMotion & svMotion over 10Gb NICs extremely slow

Hi All,

In my new role I've taken over two ESXi servers (7.x) that were previously set up by someone else.

Both hosts have identical hardware, except that four drives in one server are a different brand with the same specs.

The primary NIC is a Dell Intel X540-T2 dual-port 10Gb RJ-45, which connects to a TP-Link 10G unmanaged switch, which in turn connects to another unmanaged switch and then to our gateway.

All the IPs are on the same range and jumbo frames are set the same on both hosts. It took me 30 minutes to transfer a VM of just under 100GB (hot).
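As a rough back-of-the-envelope check (using the approximate figures above, ~100 GB in ~30 minutes):

```shell
# Effective throughput of the transfer, in MiB/s
size_mib=$((100 * 1024))   # ~100 GB expressed in MiB
elapsed=$((30 * 60))       # ~30 minutes in seconds
echo "$((size_mib / elapsed)) MiB/s"
```

That works out to roughly 56 MiB/s, i.e. under 0.5 Gbit/s, a small fraction of what a 10GbE link can carry (~1,100 MiB/s at line rate).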

I'd be grateful for any advice and guidance. If any log files are needed, please let me know where they are located and I'll happily grab them.

6 Replies
RajeevVCP4
Expert

Hi

What is the jumbo frame size on the physical switch?

Ping from source to destination (with this command, assuming a jumbo frame size of 9000):

vmkping -I vmkX -s 8972 x.x.x.x

 

Rajeev Chauhan
VCIX-DCV6.5/VSAN/VXRAIL
Please mark as helpful or correct if my answer is useful for you.
Roodie
Contributor

So I ran the command from both nodes and both came back with the output below.

[root@****:~] vmkping -I vmk0 -s 8972 172.16.0.130
PING 172.16.0.130 (172.16.0.130): 8972 data bytes
8980 bytes from 172.16.0.130: icmp_seq=0 ttl=64 time=0.499 ms
8980 bytes from 172.16.0.130: icmp_seq=1 ttl=64 time=0.570 ms
8980 bytes from 172.16.0.130: icmp_seq=2 ttl=64 time=0.607 ms

--- 172.16.0.130 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.499/0.559/0.607 ms


It's an unmanaged switch: a TP-Link TL-SX105 5-Port 10G Desktop Switch.
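Since jumbo frames have to match end to end, here is a sketch of how the configured MTU can be checked on the host side (standard esxcli namespaces; which vmk carries vMotion is an assumption):

```shell
# MTU configured on each VMkernel interface (vMotion traffic uses one of these)
esxcli network ip interface list | grep -E '^vmk|MTU'

# MTU configured on the standard vSwitch itself
esxcli network vswitch standard list | grep -E '^vSwitch|MTU'
```

Both the vSwitch and the VMkernel port need MTU 9000 for jumbo frames to be in effect on the host.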

IRIX201110141
Champion

The suggested command is missing an important parameter:

vmkping -d -s 8972

The -d flag prevents fragmentation, and it's the only way to tell whether jumbo frames actually work.

About svMotions... that is most of the time related to disk rather than networking. I upgraded a customer last week from multiple 1G links to 2x25G, and we are achieving 25G saturation with large VMs. The customer has 768GB of memory in their ESXi hosts.
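If disk is the suspect on the svMotion side, one way to watch device latency during a transfer is esxtop, which is built into ESXi (the threshold below is a rough rule of thumb, not an official number):

```shell
# Interactive: press 'u' for the disk-device view, then watch
#   DAVG/cmd (device latency) and KAVG/cmd (kernel latency).
# Sustained DAVG above ~20 ms usually points at the storage side.
esxtop
```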

If you watch vmkernel.log you can see vMotion statistics.
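For example, over SSH on either host (log path is the ESXi default):

```shell
# Follow vMotion progress and bandwidth lines as a migration runs
tail -f /var/log/vmkernel.log | grep -i vmotion
```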

Regards,
Joerg

Roodie
Contributor

Output:

[root@****:~] vmkping -I vmk0 -d -s 8972 172.16.0.130
PING 172.16.0.130 (172.16.0.130): 8972 data bytes
8980 bytes from 172.16.0.130: icmp_seq=0 ttl=64 time=0.615 ms
8980 bytes from 172.16.0.130: icmp_seq=1 ttl=64 time=0.603 ms
8980 bytes from 172.16.0.130: icmp_seq=2 ttl=64 time=0.492 ms

--- 172.16.0.130 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.492/0.570/0.615 ms

 

The hosts are currently running 144GB of RAM each. Not massive, but the most RAM in use is probably 40GB. I've also watched disk performance and I don't see huge spikes, but I will tail that log and give you the output.

 

IRIX201110141
Champion

Yes, your ping times are not the best... but you should still achieve a couple of Gbit/s for a vMotion.

2021-09-10T20:17:16.569Z cpu0:2162298)VMotion: 5277: 4898884007115602731 S: Stopping pre-copy: only 24514 pages left to send, which can be sent within the switchover time goal of 0.500 seconds (network bandwidth ~1900.131 MB/s, 92872% t2d)
2021-09-10T20:17:16.694Z cpu40:4948093)VMotionSend: 5095: 4898884007115602731 S: Sent all modified pages to destination (network bandwidth ~2365.814 MB/s)

and

[root@esx-node-02:~] vmkping -d -s 8972 172.22.149.3 -I vmk1
PING 172.22.149.3 (172.22.149.3): 8972 data bytes
8980 bytes from 172.22.149.3: icmp_seq=0 ttl=64 time=0.215 ms
8980 bytes from 172.22.149.3: icmp_seq=1 ttl=64 time=0.210 ms
8980 bytes from 172.22.149.3: icmp_seq=2 ttl=64 time=0.196 ms
--- 172.22.149.3 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.196/0.207/0.215 ms

 

Roodie
Contributor

I'm wondering why the pings aren't the best, seeing as they are on 10Gb NICs.

I'm new to ESXi. Where can I find this log on the ESXi nodes or in vSphere?
