VMware Cloud Community
carterfields
Contributor

Slow Network Performance

I'm about 90% done with a new vSphere deployment, and we've just noticed that our IIS web application is extremely slow, with occasional timeouts.  Our QA team says a particular process they run typically completes in 1-5 s on our previous, aging physical servers, but now takes 45 s to 1 min on the VMs.

Everything else seems fine, and normal server operations are performing well.  CPU, memory, and disk utilization are all on the low side, even during the QA process.

The servers in question have 10 vCPUs and 16 GB of memory, with storage on a NetApp SAN over 10 Gb iSCSI.  The physical hosts are Cisco UCS B200 M3 servers and are ridiculously underutilized.

We're running a vDS, but the same problem occurs with a standard switch.

Development is, of course, saying it's infrastructure/networking.

The only indication of an issue that I can actually replicate is that if I ping from one VM to another, the first few pings drop, and then it eventually settles into a solid, continuous ping.
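For what it's worth, a run of leading ping losses like that often points at ARP resolution or switch MAC/port learning rather than the guest itself. As a quick way to quantify the symptom, here is a small sketch; the Windows-style `ping` output format is an assumption, so adjust the patterns for your guest OS:

```python
import re

def leading_losses(ping_output):
    """Count how many initial echo requests got no reply.

    Parses Windows-style `ping -n` output, where a lost packet shows
    up as a 'Request timed out.' line.  (The exact line formats are
    an assumption; adapt them to your guest OS.)
    """
    lost = 0
    for line in ping_output.splitlines():
        line = line.strip()
        if line.startswith("Request timed out"):
            lost += 1
        elif re.match(r"Reply from ", line):
            break  # first successful reply ends the leading-loss run
    return lost

# Example transcript resembling the symptom described above.
sample = """\
Pinging 10.0.0.12 with 32 bytes of data:
Request timed out.
Request timed out.
Reply from 10.0.0.12: bytes=32 time=1ms TTL=128
Reply from 10.0.0.12: bytes=32 time<1ms TTL=128
"""
print(leading_losses(sample))  # → 2
```

Running this against captures taken at different times of day would show whether the leading losses are consistent (e.g. always exactly two, suggesting ARP) or variable.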

If anyone has seen anything like this, please let me know.

Thanks!

11 Replies
dhanarajramesh

I suspect two areas: one is the Cisco service profile vNIC config, and the second is the NIC teaming policies.  Can you explain more about your setup?  If possible, can you attach a screenshot of the NIC teaming policies?

vfk
Expert

I would check esxtop during testing and monitor the vmnics and vms to see if there are any packet drops.
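To do this non-interactively, esxtop can also be run in batch mode (`esxtop -b > out.csv`) and the resulting CSV scanned for the drop counters. A rough sketch, assuming the drop columns can be identified by a `%DRPTX`/`%DRPRX` substring; check your own CSV header first, since batch-mode counter names can differ from the interactive panel:

```python
import csv
import io

def dropped_packet_columns(esxtop_csv, patterns=("%DRPTX", "%DRPRX")):
    """Scan esxtop batch-mode CSV for nonzero network drop counters.

    esxtop -b emits one wide CSV row per sample interval.  Any column
    whose header matches one of `patterns` and holds a nonzero value
    is returned.  (The exact counter names in your CSV header may
    differ; adjust `patterns` to match.)
    """
    reader = csv.reader(io.StringIO(esxtop_csv))
    header = next(reader)
    drops = {}
    for row in reader:
        for name, value in zip(header, row):
            if any(p in name for p in patterns) and float(value) > 0:
                drops[name] = float(value)
    return drops

# Tiny illustrative sample; a real esxtop CSV has thousands of columns.
sample = (
    '"Time","\\\\host\\Network Port(vSwitch0:vmnic0)\\%DRPTX",'
    '"\\\\host\\Network Port(vSwitch0:vmnic0)\\%DRPRX"\n'
    '"10:00:00","0.00","1.25"\n'
)
print(dropped_packet_columns(sample))  # only the %DRPRX column is nonzero
```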

--- If you found this or any other answer helpful, please consider the use of the Helpful or Correct buttons to award points. vfk Systems Manager / Technical Architect VCP5-DCV, VCAP5-DCA, vExpert, ITILv3, CCNA, MCP
NealeC
Hot Shot

Hi Carterfields,

Just to clarify, have you assigned 10 vCPUs to all your VMs, or were you saying your physical hosts have 10 cores/threads in them available to the VMs?


If you have assigned 10 vCPUs to multiple VMs, the likelihood is that ESXi is struggling to schedule them efficiently, and you will notice high CPU ready time in esxtop (%RDY, which indicates the amount of time the VM was ready to do work but was waiting on the hypervisor to give it a core, or 10 cores, to work with).
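As a rule of thumb, the vCenter "CPU Ready" summation value (reported in ms over a realtime sample interval) converts to esxtop-style %RDY like this. A minimal sketch; the 20 s interval is the realtime-chart default and is an assumption for other chart intervals:

```python
def cpu_ready_percent(ready_ms, interval_s=20.0):
    """Convert a vCenter 'CPU Ready' summation value (ms) to a percentage.

    vCenter realtime charts sample every 20 s by default; the result
    is comparable to esxtop's %RDY.  E.g. 2000 ms of ready time in a
    20 s interval is 10% ready, which is generally considered a problem.
    """
    return ready_ms * 100.0 / (interval_s * 1000.0)

print(cpu_ready_percent(2000))  # → 10.0
print(cpu_ready_percent(120))   # → 0.6
```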

Another problem if you have 10 core physicals and high vCPU VMs is how you choose to present those vCPUs, either as sockets or cores.
KB Reference: http://kb.vmware.com/kb/1010184&src=vmw_so_vex_cneal_850

It’s often been said that this change of processor presentation does not affect performance, but it may impact performance by influencing the sizing and presentation of virtual NUMA to the guest operating system.

Reference Performance Best Practices for VMware vSphere 5.5 (page 44): http://www.vmware.com/pdf/Perf_Best_Practices_vSphere5.5.pdf#src=vmw_so_vex_cneal_850
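For reference, the socket/core split is controlled per VM by two .vmx entries (also exposed in the vSphere client CPU settings). A minimal illustrative fragment; the values are examples, not a recommendation:

```
numvcpus = "10"
cpuid.coresPerSocket = "10"
```

With `coresPerSocket` equal to the vCPU count, the VM is presented as one 10-core socket, which can keep the vNUMA topology aligned with a host that has at least 10 cores per physical socket.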

-------------- If you found this or any other answer useful please consider the use of the Helpful or Correct buttons to award points. Chris Neale VCIX6-NV;vExpert2014-17;VCP6-NV;VCP5-DCV;VCP4;VCA-NV;VCA-DCV;VTSP2015;VTSP5;VTSP4 http://www.chrisneale.org http://www.twitter.com/mrcneale
carterfields
Contributor

esxtop shows no dropped packets :(

carterfields
Contributor

Thank you for your reply NealeC,

Only four VMs have 10 vCPUs.  I was trying to emphasize that the assigned resources of the troubled VMs were not an apparent issue.  The problem also presented itself when those four VMs had only two vCPUs.  I'm seeing %RDY at about 0.60 on average for each of the VMs.

vfk
Expert

Does this happen only when the VM is on a particular ESXi host, or does the problem persist regardless of where the VM is?  Is this the only application experiencing performance issues?  How is the rest of the environment performing?  Have you changed the MTU size?

Process of elimination:

  1. Put the application VM and a test VM on the same ESXi host, run your test, and monitor performance.  If the problem persists, you possibly have storage latency issues.  If local storage is available, move the VMs to local storage and test again.
  2. If there were no problems in stage 1, separate the two VMs but keep them on the same port group (VLAN) and run the test again.  If you are dropping packets while on the same VLAN but different physical ESXi hosts, it is possibly a network issue; speak with your network team.  If everything is OK on the same VLAN, try moving the test VM to a different VLAN (port group).
  3. If there were no problems in stage 2, try running the test from your desktop machine, or another desktop, and monitor performance.  If you observe problems, it is possibly a network issue.  At this stage, you have proved everything works when the test runs inside the virtual environment.
  4. If no issues are found, escalate and log a call with VMware.
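Just to make the branching explicit, the four stages above boil down to something like this (an illustrative sketch of the decision logic, not a tool):

```python
def next_step(same_host_ok, same_vlan_ok, from_desktop_ok):
    """Map the elimination-test results to the likely culprit.

    Each flag records whether the QA test ran cleanly in that
    placement; this just encodes the branching of the four stages.
    """
    if not same_host_ok:
        return "suspect storage latency; retest on local storage"
    if not same_vlan_ok:
        return "suspect physical network; involve the network team"
    if not from_desktop_ok:
        return "suspect network path outside the cluster"
    return "virtual environment exonerated; escalate to VMware support"

print(next_step(True, False, True))
# → suspect physical network; involve the network team
```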

Let me know how you get on.

carterfields
Contributor

Thanks dhanarajramesh,

Here are a few screenshots.  I can provide more if you have any ideas.

[Attached screenshots: VM1.PNG, VM2.PNG, VM3.PNG, VM4.PNG]

carterfields
Contributor

PROD dvPG1 is our production network, carrying regular data traffic.  The other two are iSCSI port groups to the NetApp.

dhanarajramesh

I would suggest not mixing the iSCSI adapters with production: on dvPG1, set only two adapters (Uplink3/4) as active.  In the iSCSI team, set failback to No, because when a failure happens it will flip-flop the connection and cause storage issues.  Also, on the UCS vNIC side, please pin to the proper uplink (do proper pinning).

carterfields
Contributor

Thank you, dhanarajramesh,

Production and iSCSI are not using the same port groups, nor are they using the same uplinks.  I'll look into changing the failback setting; however, it won't help in this case.

carterfields
Contributor

Maybe this will help too.

Note that dvUplink1 & 2 under dvsProd are not the same dvUplinks as those under dvsISCSI.

[Attached screenshots: Capture1.PNG, Capture2.PNG]
