Folks,
I'm having a serious application performance problem in my vSphere environment. I have deployed four standalone SQL Server database VMs that are used by four standalone application server VMs running Oracle Forms (Java). The users connect from their own workstations in various locations across Asia Pacific; the data center is in Australia.
Users (150 Windows 7 workstations) --- RDP TCP/3389 ---> Terminal Server pool VMs (one of 45 available VMs) --- Oracle Forms HTTP app TCP/80 ---> Application Server --- SQL Server TCP/1433 ---> SQL Server VMs.
The Network Operations team reports that TCP retransmitted packets occur consistently between the application servers and the databases, and also between the terminal servers.
This happens in every scenario, whether the VMs are on the same ESXi host or on different ESXi hosts.
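To put a number on the retransmissions from inside a Windows guest, one option is to take two snapshots of `netstat -s -p tcp` and compare the "Segments Sent" and "Segments Retransmitted" counters. The following is only a minimal sketch: the function names are made up for illustration, and the sample counter values are invented, not measurements from this environment. Treat the result as a rough indicator rather than an exact loss rate.

```python
import re


def parse_tcp_stats(netstat_output: str) -> dict:
    """Collect 'Name = value' counters from `netstat -s -p tcp` text."""
    stats = {}
    for line in netstat_output.splitlines():
        match = re.match(r"\s*(.+?)\s*=\s*(\d+)\s*$", line)
        if match:
            stats[match.group(1)] = int(match.group(2))
    return stats


def retransmit_percent(before: dict, after: dict) -> float:
    """Retransmitted segments as a percentage of segments sent between two snapshots."""
    sent = after["Segments Sent"] - before["Segments Sent"]
    retransmitted = after["Segments Retransmitted"] - before["Segments Retransmitted"]
    if sent <= 0:
        return 0.0
    return 100.0 * retransmitted / sent


# Invented sample snapshots, for illustration only.
SAMPLE_BEFORE = """
TCP Statistics for IPv4

  Segments Received                   = 90000
  Segments Sent                       = 100000
  Segments Retransmitted              = 500
"""

SAMPLE_AFTER = """
TCP Statistics for IPv4

  Segments Received                   = 140000
  Segments Sent                       = 150000
  Segments Retransmitted              = 1500
"""

if __name__ == "__main__":
    before = parse_tcp_stats(SAMPLE_BEFORE)
    after = parse_tcp_stats(SAMPLE_AFTER)
    print(f"Retransmit rate: {retransmit_percent(before, after):.2f}%")
```

Running the same comparison before and after each change (driver update, ring buffer tuning, and so on) gives a consistent way to tell whether a change actually helped.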
Hardware:
HP BL 465c G7/G8 Blades on c7000 enclosure
HP Virtual Connect modules
Software:
vSphere 5.1 U1 on all HP blades (ESXi) and on the vCenter Server.
All terminal servers (RDSH) and guest OSes: Windows Server 2008 R2 Standard SP1
Database: SQL Server 2005 and 2008 R2
What could cause TCP retransmissions in a virtual environment?
Update: Browsing through the ESXi 5.1 Update 3 release notes, I found the following statement:
Burst of data packets sent by applications might be dropped due to limited queue size on a vDS or on a standard vSwitch
On a vNetwork Distributed Switch (vDS) or on a standard vSwitch where the traffic shaping is enabled, burst of data packets sent by applications might drop due to limited queue size.
This issue is resolved in this release.
However, when I look through the vDS and standard vSwitch settings, I cannot find anything about traffic shaping on either of them.
Hi there,
unfortunately the possible causes of this issue are very broad, but for starters, did you try increasing the TCP buffer size and turning on Receive Side Scaling in the in-guest NIC configuration on all the problematic servers? You can do it live, with the impact being only one or two lost pings, but it is still better to do it outside business hours.
What about the performance of the VMs? How heavily are they utilized, and are they CPU starved? A cabling check could be in order as well, just in case.
Thanks for the suggestion, Alistar.
Yes, I have applied the settings from VMware KB: Large packet loss at the guest OS level on the VMXNET3 vNIC in ESXi 5.x / 4.x, but the packet loss is still happening.
I have checked with a vCPU ready % script, and the VMs that have the packet loss problem are not CPU constrained.
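For anyone who wants to repeat the CPU ready check by hand, the usual conversion from the vCenter "CPU Ready" summation value (in milliseconds) to a percentage is summation / (chart interval in seconds * 1000) * 100, with 20 seconds being the real-time chart interval. Here is a minimal sketch of that arithmetic; the function name is hypothetical, and dividing by the vCPU count to get a per-vCPU figure is a common convention I am assuming here, not something taken from the script mentioned above.

```python
def cpu_ready_percent(ready_ms: float, interval_s: float = 20.0, vcpus: int = 1) -> float:
    """Convert a CPU Ready summation value (ms) into a per-vCPU percentage.

    interval_s defaults to 20, the real-time chart sampling interval.
    vcpus spreads a VM-level summation across its vCPUs (assumed convention).
    """
    if interval_s <= 0 or vcpus <= 0:
        raise ValueError("interval_s and vcpus must be positive")
    return ready_ms * 100.0 / (interval_s * 1000.0 * vcpus)


if __name__ == "__main__":
    # Illustrative value: 1000 ms of ready time in a 20 s real-time sample.
    print(f"{cpu_ready_percent(1000):.1f}% ready")
```

Remember to change `interval_s` when reading historical charts, because their sampling intervals are much longer than the 20-second real-time one.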
Did you ever get this fixed? We are seeing similar issues.
wreedMH, what sort of issue are you facing in your HP blade server environment?
What I did was:
1. Performed a firmware update on the HP blades and CNAs, and updated the ESXi drivers to the latest (October 2015) using HP SPP.
2. Upgraded to the latest VMware ESXi 5.1 Update 3b, build number 2323236.
3. Upgraded VMware Tools to v10.0.5 after the two steps above had been completed.
4. Followed VMware KB: Large packet loss at the guest OS level on the VMXNET3 vNIC in ESXi, setting Small Rx Buffers = 8192 and Rx Ring #1 Size = 4096.
Hope that helps.