VMware Cloud Community
fsckit
Enthusiast

Network fast going to the ESXi host, but slow going to the VMs

I am trying to determine why network throughput is so slow going to all the Linux virtual machines running on an ESXi host, but fast to the ESXi host itself. For instance, if I scp a large file to a VM, it takes about 10x longer than scp-ing the file directly to the ESXi host.  (And it is definitely a network issue, not disk I/O.)
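One way to back up the "not disk I/O" claim is to compare raw TCP throughput against an scp that discards its data on the receiving side. This is only a sketch: it assumes iperf3 is installed on both ends, and the hostname vm01 and filename bigfile.bin are placeholders.

```shell
# On the Linux VM (placeholder hostname vm01): start an iperf3 server
iperf3 -s

# From the workstation: measure raw TCP throughput to the VM (no disk involved)
iperf3 -c vm01 -t 30

# scp the file to /dev/null on the VM so the receiving disk is out of the picture;
# if this is still ~10x slower than scp to the ESXi host, the path is the problem
scp bigfile.bin user@vm01:/dev/null
```

If the iperf3 number is also low, the bottleneck is somewhere on the network path to the VM rather than in the guest's storage stack.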

Where would you all begin to debug a problem like this?

Thanks in advance.

4 Replies
tomtom901
Commander

Which network adapter do you run on the VM?

Are the VMware tools installed?

Does the management network use the same vmnic as the VM network to which the Linux VMs are connected?

fsckit
Enthusiast

We have two VMXNET3 virtual NICs configured on most of the VMs.  Yes, we have VMware Tools installed on each guest. I suppose I could configure an E1000 NIC on one of the guests if anyone thinks the problem might be with the VMXNET3 NIC or the driver provided by VMware Tools...
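Before swapping adapters, it may be worth confirming inside the guest that the vmxnet3 driver is actually in use and not reporting errors. A sketch, assuming a modern Linux guest with ethtool; the interface name ens192 is just an example and may differ on your VMs:

```shell
# Inside the Linux guest: show which kernel driver backs the interface
# (expect "driver: vmxnet3" for a VMXNET3 virtual NIC)
ethtool -i ens192

# Per-queue/ring statistics; nonzero drop or error counters here
# would point at the virtual NIC rather than the physical network
ethtool -S ens192 | grep -Ei 'drop|err'
```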

The management network uses a standard vSwitch and two physical NICs on the host.  The VMs are on a distributed switch that uses a different set of physical NICs on the host.

There is another distributed switch for that second virtual NIC on the VMs.  (Let's call this the backup network.)  I just tested, and throughput to this network is fine.  So just our "primary" network is suffering.

tomtom901
Commander

And is this a new deployment, or has this always worked fine until some point? Could you configure a VM on the management network and do the same test to determine whether it's slow or fast there?

Also, perhaps you could check the duplex/speed settings on the physical NICs (vmnics). Via SSH, run esxcli network nic list, which will give you some info on that. Also, are all hosts and NICs using the same physical switch? If not, have you ruled out any issues on the physical side of the network?
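For reference, the checks above look roughly like this from an SSH session on the ESXi host (vmnic0 below is just an example uplink name):

```shell
# List every vmnic with its link state, speed, and duplex in one table;
# a 100 Mb or half-duplex entry here would explain a slow port group
esxcli network nic list

# Drill into a single uplink (example: vmnic0) for driver and link details
esxcli network nic get -n vmnic0
```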

Judging by your problem description, I don't think this has anything to do with the VMXNET3 adapter, so swapping it for an E1000(E) is not necessary right now. We have other stuff to check first. :)

fsckit
Enthusiast

All our NIC settings are consistent.

There are four active uplinks on this ESXi host, each using two physical NICs, each on its own virtual switch and network:

1.) Primary

2.) Backup

3.) Management

4.) Special

Looking at the network performance chart for this ESXi host, we see that Special's heavy network traffic is pretty consistent.  I can't do anything about Special.

The peak in each line is me copying a large file to a VM on this host or, in the case of the Management network, copying the file directly to the host.  Note how low the Primary peak is.  It took about 15 minutes to copy the file. Over the Backup network, it took only about 4 minutes. The Management network took about 1.5 minutes.

Where is the bottleneck? What is preventing the Primary network from performing at its peak?
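One way to hunt for the bottleneck on the host itself is esxtop's network view, which shows per-port throughput and drop counters for each VM and vmnic. A sketch of the workflow, run over SSH on the ESXi host:

```shell
# Interactive esxtop; press 'n' for the network view.
# Watch the %DRPTX / %DRPRX columns for dropped packets, and the
# TEAM-PNIC column to see which physical uplink each VM port is using
esxtop

# Or capture two batch-mode samples to a CSV for offline review
esxtop -b -n 2 > esxtop.csv
```

If a single vmnic on the Primary distributed switch is carrying all the VM traffic (or showing drops) while its teammate sits idle, that would point at the teaming/failover policy or that uplink's physical path.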

a_tale_of_4_uplinks.png
