I'm troubleshooting quite low VM-to-VM Ethernet speeds on our Enterprise Plus hosts. The topology is 3x blade servers connected via 16G FC to the same storage, with a distributed switch as the network backbone.
I'm trying to move a 10 GB file between VM1 and VM2, which are on the same subnet and on the same "host" within the cluster. Both machines have one VMXNET3 adapter, run Windows Server 2019, and both report a "10.0 Gbps" Ethernet connection in their status.
The maximum speed I get when transferring data is about 1.3 Gbps, which is quite low.
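To put that gap in perspective (rough arithmetic, not measured values from this setup): a 10 GB file is 80 Gbit, so the observed rate turns a sub-ten-second copy into roughly a minute:

```shell
# Transfer time for an 80 Gbit payload at the observed vs. the advertised rate
awk 'BEGIN {
  size_gbit = 10 * 8
  printf "at 1.3 Gbit/s: ~%.0f s\n", size_gbit / 1.3
  printf "at 10 Gbit/s:  ~%.0f s\n", size_gbit / 10
}'
```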
Any ideas where to even start looking for hiccups? Thank you.
PS: Not sure whether this belongs under vSphere or NSX, but since my guess is that this is a networking issue, I'm posting it here. Feel free to correct me if I'm wrong.
You could try using 'iperf'. You can run it in server and client mode, so you can test the network in isolation.
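A minimal pair of invocations, assuming iperf3 is installed in both guests (the address is a placeholder for VM2):

```shell
# On VM2 (server side): listen for test traffic
iperf3 -s

# On VM1 (client side): run a 10-second TCP test against VM2
iperf3 -c 10.0.0.2 -t 10
```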
But between VMs on the same host you should get high speeds. Maybe storage is the issue here.
SSH into the host and use esxtop to check utilisation.
In esxtop, press n for the networking view, or see the link below for checking disk utilisation.
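A quick sketch of that esxtop workflow (counter names from memory, so treat them as approximate):

```shell
# On the ESXi host (via SSH):
esxtop          # interactive performance monitor
# then press:
#   n  - network view: watch MbTX/s, MbRX/s and the %DRPTX/%DRPRX drop counters
#   d  - disk adapter view: watch DAVG (device) and KAVG (kernel) latency
```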
Does the result stay the same when you try more sessions?
iperf3 -P 10, for example (if I remember right). This will use 10 sessions instead of 1.
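For example (10.0.0.2 stands in for the other VM's address):

```shell
# 10 parallel TCP streams instead of 1, for a 10-second test
iperf3 -c 10.0.0.2 -P 10 -t 10
```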
Do you also have the issue with a normal port group, or only with an NSX segment?
Did you check all the settings on the physical switches? No issues there? Maybe some QoS or something on the blades?
I'm trying to move 10GB file between VM1 and VM2 which are on the same subnet, on the same "host"
Keeping the host-to-host bandwidth issue aside, VM-to-VM on the same host and subnet should give you good bandwidth, since that traffic is switched in memory and never touches the physical NIC.
1. Is the issue specific to these Windows 2019 machines?
2. Have you tested with any other OS?
3. How are you moving the files from VM to VM? Any specific software, or a plain copy-and-paste?
Have you checked the disk utilization using esxtop (press d)? You should see something similar to the image below.
[Image: disk usage view from esxtop]
p0wertje: I don't know whether there's a difference between an NSX segment and a normal port group. We run vSphere 7 Enterprise Plus in an HA cluster with vMotion across 3 Dell R640 blades in a Dell FX2 chassis, with 10 GbE IOMs connected to a distributed switch. Both machines are currently on the same blade.
The reason I ran into this in the first place is that I noticed low transfer speeds from this blade setup to another site over a 10 Gbit network during backups, and I started digging into where the problem might be.
So I performed a simple copy test from one vhost to another, and this issue surfaced.
Then I ran an iperf3 test to rule out possible storage bottlenecks (the storage is an all-SSD ME4024 connected over 16 Gb Fibre Channel, and even that shouldn't be a problem: CrystalDiskMark on the host shows very nice numbers, 1450 MB/s read and 1580 MB/s write, near the technical caps of our storage).
I'm not aware of any QoS enabled on the DSwitch, the IOMs, or the network. How can I check?
We have only default MTUs on our DSwitch, no jumbo frames or any other non-default values. But that shouldn't be a problem in this case...
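For double-checking from the ESXi side, these commands list the switch MTU and NIC settings (a sketch; exact output columns vary by build):

```shell
# MTU and uplinks of standard and distributed switches on this host
esxcfg-vswitch -l

# Distributed switch details as seen by this host
esxcli network vswitch dvs vmware list

# Physical NIC link state, speed and MTU
esxcli network nic list
```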
Like @Sreec mentions: when the VMs are on the same host, you should get high numbers because of the in-memory switching.
If you do not get high numbers from VM to VM on the same host, please check the reply from @Sreec.
Maybe you should open a support case with VMware, so they can look into the issue?
What I mean by the difference between a 'normal port group' and an 'NSX port group':
Normal: it does not leverage an NSX segment (VLAN-backed or Geneve-backed), but is a regular port group you create from within vCenter (and is VLAN-based).
OK, tried CentOS to CentOS:
[SUM] 0.00-10.00 sec 10.9 GBytes 9.34 Gbits/sec 5614 sender
[SUM] 0.00-10.00 sec 10.9 GBytes 9.34 Gbits/sec receiver
Now this looks more like it. It means the problem probably lies within Windows.