Hello, I'm new to the forums and somewhat of a VMware beginner, so please excuse any stupid questions. I expect this request for help will prompt questions from the community about our environment in order to narrow down the cause of the issue, so I will do my best to answer promptly. It will be a bit of a discovery exercise for me as well (I'm new to this organization and their VM environment).
One of our DB admins came to me recently with a request to investigate latency issues with a SQL query. Basically he has an application (Forefront Identity Manager) that is querying a SQL database, and his DB monitoring shows latency issues. Both the application server and SQL server are VMs (ESX 5), and there are in fact numerous instances of both application and SQL server (production, test, QA environments). The latency issue affects all instances and seems to be network related. I read some articles on the subject and looked at our environment, at both the VM and guest OS level, and I can't figure out what the problem is. It is worth noting that when the application and SQL database exist on the same guest machine, there is no problem. After looking at the disk, memory, and CPU performance metrics, I don't believe any of these is the cause of the problem, but perhaps I am missing something.
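One way to reason about why the problem disappears when both roles share a guest is a back-of-the-envelope model of a "chatty" workload. This is only a sketch under assumptions: it supposes the application issues many small request/response exchanges one after another, and the payload size and round-trip times below are hypothetical values, not measurements from our environment. The function name `serialized_throughput_mbps` is made up for illustration.

```python
def serialized_throughput_mbps(payload_bytes, rtt_ms, processing_ms=0.0):
    """Upper bound in Mbps when each exchange must complete before the next begins.

    All arguments are hypothetical inputs for illustration, not measurements.
    """
    seconds_per_exchange = (rtt_ms + processing_ms) / 1000.0
    if seconds_per_exchange <= 0:
        raise ValueError("rtt_ms + processing_ms must be positive")
    bits_per_exchange = payload_bytes * 8
    # Mbps here means 10^6 bits per second.
    return bits_per_exchange / seconds_per_exchange / 1_000_000


if __name__ == "__main__":
    # Assumed round-trip times; replace with values measured between the two VMs.
    for label, rtt_ms in [("separate VMs (assumed 1 ms RTT)", 1.0),
                          ("same guest (assumed 0.05 ms RTT)", 0.05)]:
        mbps = serialized_throughput_mbps(payload_bytes=1000, rtt_ms=rtt_ms)
        print(f"{label}: about {mbps:.1f} Mbps")
```

If the workload really is serialized like this, throughput depends on round-trip time and per-exchange processing time rather than on link bandwidth, which would be consistent with a low ceiling between VMs and no visible problem on a single guest. Whether that is what is happening here still needs to be confirmed by measurement.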
Below is the email from our DB admin with details:
"The issue I was mentioning earlier is network performance between the FIM Synchronization service and the FIM database server:
1. With DPA (Database Performance Analyzer), I can see that the predominant waits are ASYNC_NETWORK_IO.
Here is an example:
2. Resource Monitor shows total network I/O peaking at 8 – 10 Mbps during the same time (when this FIM Sync process runs).
3. I can easily reproduce this behavior in any FIM environment
4. We have many FIM instances (FIM, FIM R2, QA, Prod etc.)
5. Using other methods (file transfer, iperf), I see that the potential network throughput is much higher.
6. The issue seems to be specific to the connection between the FIM Synchronization service and the FIM database server.
7. My conclusion, from all the above observations (especially #5 and 6), is that:
Disk is not the bottleneck in this scenario. DPA does not show much disk IO pressure.
The total network I/O being “capped”, as mentioned earlier, at 8 – 10 Mbps (just about 1 MB/s) while the FIM Sync process runs seems to limit the entire FIM Sync process before the disks come under any stress.
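As a quick check of the "8 – 10 Mbps is about 1 MB/s" figure, the conversion can be sketched as below. The 1 Gbps link speed used for the utilisation figure is an assumption for illustration only; the actual vNIC and uplink speeds have not been stated. The helper names are hypothetical.

```python
def mbps_to_mbytes_per_s(mbps):
    """Convert megabits per second to megabytes per second (8 bits per byte)."""
    return mbps / 8.0


def link_utilisation(observed_mbps, link_mbps):
    """Fraction of the link capacity actually used."""
    return observed_mbps / link_mbps


if __name__ == "__main__":
    ASSUMED_LINK_MBPS = 1000  # assumption: a 1 Gbps link, not confirmed for this environment
    for observed in (8, 10):
        mbytes = mbps_to_mbytes_per_s(observed)
        used = link_utilisation(observed, ASSUMED_LINK_MBPS)
        print(f"{observed} Mbps = {mbytes:.2f} MB/s, {used:.1%} of the assumed link")
```

So the observed rate works out to 1.00 to 1.25 MB/s, which would be a very small share of a gigabit link if that assumption holds.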
Update:
More test results from our DB administrator:
FYI
I’ve tried the following
All the above tests exhibited much higher network throughput (> 300 Mbps), over 30x higher than the ~10 Mbps I get from any FIM instance.
I’d say that this:
FIM Sync is not “returning” the acknowledgment to SQL for the initial queries until it completes processing the individual RBAR (row-by-agonizing-row) updates.
Here it is for a [Full Import and Full Sync]: