VMware Performance Community
vdiiomark
Enthusiast

VMmark 2.5 Mailserver error (loadgen)

First, let me state that I have read through the forums.  I have all parts of VMmark 2.5 working correctly except for the Mailserver workload.  It does start up, but then exits with an error.  From reading other questions, it seems related to answers given about the mailserver responding too slowly.

However, here is the issue.  I have no other workloads running (as configured in the VMMARK2.CONFIG file).  I am also monitoring my storage via esxtop to see the reported latencies and transfer.  During the first 5 minutes (when loadgen is running), the highest read or write latency found was less than 5ms.

The DAVG and GAVG are showing nearly identical results in esxtop, all less than 5ms along with the QAVG/cmd column showing 0.01 or less at all times.  Historical info for the datastore in vCenter shows similar results.  So, by all accounts nearly no storage queueing or latency.  With no other jobs running, the CPU and memory on the system are also very low.

So my question is: why does the Mailserver workload fail?  Any pointers are much appreciated.  Attached is a zip file of the "Mailserver" folder output from the client's "c:\vclient\mailserver" directory.

jpschnee
VMware Employee

Hi,

The answer can be found in your mailserverclient_error-0.txt file.

Starting simulation...
ERROR -- Caught exception SwordfishExceptionTooMuchLoad:
Number of unfinished tasks including those in the queue and those are being executed is greater than 1.5 times the number of users.
Engine State: Started
Total Test Runtime: 00:12:07
Task Interval: 32
Total Users: 1000
Active Users: 1000
Task Q Length (shared): 0 (busy workers: 0 of 0)
Task Q Length (WIN-GNJ9G2CS97U): 1423 (busy workers: 80 of 80)
   at Microsoft.Exchange.Swordfish.TaskEngine.dispatchExchUsers()
   at Microsoft.Exchange.Swordfish.TaskEngine.dispatchExchTasks()
   at Microsoft.Exchange.Swordfish.TaskEngine.<dispatchTasks>b__18()
   at Microsoft.Exchange.Swordfish.IL.ILUtil.DoTryFilterCatch(ThreadStart tryClause, Predicate`1 filterClause, Action`1 catchClause)
   at Microsoft.Exchange.Swordfish.TaskEngine.dispatchTasks()
   at Microsoft.Exchange.Swordfish.TaskEngine.Start()
   at Microsoft.Exchange.Swordfish.Cmd.SwordfishCmd.Main(String[] args)
Simulation has failed.

Your mailserver was unable to process the queue fast enough, and the simulation died.  When I look at the latencies reported within your Mailserver0.wrf file, I see latencies over 2000ms (95%).  Make sure the latencies you're reviewing in esxtop correspond to the data disk for your mailserver.

What is the storage like behind your Mailserver0?

-Joshua
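
The threshold in the exception message can be checked by hand against the numbers in the log above. A minimal Python sketch, assuming that "unfinished tasks" means the queued tasks plus the busy workers reported in the log; the function name is hypothetical and not part of LoadGen:

```python
def exceeds_load_limit(queued_tasks: int, busy_workers: int, users: int,
                       factor: float = 1.5) -> bool:
    """Return True when unfinished tasks exceed factor * users.

    Assumption: unfinished = queued tasks + tasks currently executing,
    following the wording of the SwordfishExceptionTooMuchLoad message.
    """
    unfinished = queued_tasks + busy_workers
    return unfinished > factor * users


# Values taken from the error log above:
# Task Q Length 1423, busy workers 80 of 80, Total Users 1000.
print(exceeds_load_limit(queued_tasks=1423, busy_workers=80, users=1000))
```

With 1423 queued tasks and 80 busy workers, that is 1503 unfinished tasks against a limit of 1500 for 1000 users, so the check returns True.
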
RebeccaG
VMware Employee

Your mailserver is working fine for about the first 3 minutes of data collection (which begins after the 5-minute LoadGen initialization is finished). The failure occurs at minute 4, which is when the Mailserver is not completing requests quickly enough. You said you were monitoring latencies during the first 5 minutes, but make sure you're also monitoring during the run after the LoadGen initialization, because that's when the failure occurs.

vdiiomark
Enthusiast

Thank you for the quick reply. 

I just migrated my Mailserver VM onto another datastore.  Previously I was on an iSCSI-attached hybrid storage system with SSD caching.

I just re-ran the test with similar results.  For this run, I am on a local HBA RAID controller with read and write caching in front of four 15k rpm disks.  Minimal, but enough for a single workload.

Just to confirm that my latencies are minimal, I am attaching the vCenter performance log of the results during the run.

Mailserver setup was running from 12:12 until 12:16.  Loadgen started at approximately 12:18 and ran about 10 minutes before dying again. 

The traces show Write Rate, Read Latency, and Write Latency.  I have attached a PNG capture of these values.

So, again, what I am seeing does not correlate with the error reported.  With both read and write latencies always below 5 ms, the failure makes no sense to me.

jpschnee
VMware Employee

You need to focus on the latencies reported by LoadGen.  The LoadGen actions performed are likely adding to the latencies.  In my opinion, 4 disks are just not enough; try doubling that and see if the issue goes away.

-Joshua
vdiiomark
Enthusiast

I have migrated my VM to another system and tried different storage devices as datastores.  In all cases, I get a failure several minutes in.

The only potential bottleneck appears to be CPU related, not disk related.  When configured with 4 vCPUs as directed, the CPU load average hovers somewhere around 75%.  That is the only variable that seems high, with memory and disk usage appearing fairly normal.

Are there any documented guidelines for recreating the workload using the Load Generator directly? 

I believe the parameters are as follows:

User Count = 1,000

Tasks / User / Day = 278

Length of 'Simulation Day' = 2h, 30m, 0s

Scheduled Run Length = 3h, 30m, 0s

Can anyone confirm if these are the correct Load Gen settings to replicate the run created by VMmark?  I would like to run an accurate mail load directly, to eliminate other variables.
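
As a rough sanity check, the parameters listed above imply an aggregate task rate that can be worked out directly. This is a minimal Python sketch using only the values from this post, which are unconfirmed; it assumes tasks are spread evenly across the simulated day, and the function name is hypothetical:

```python
from datetime import timedelta


def implied_task_rate(users: int, tasks_per_user_per_day: int,
                      simulation_day: timedelta) -> float:
    """Average tasks per second across all users.

    Assumption: every user's daily tasks are spread evenly over
    the length of one simulated day.
    """
    total_tasks = users * tasks_per_user_per_day
    return total_tasks / simulation_day.total_seconds()


# Values as listed in the post (not confirmed as the VMmark settings).
rate = implied_task_rate(1000, 278, timedelta(hours=2, minutes=30))
print(f"{rate:.1f} tasks/s")  # 278000 tasks / 9000 s -> 30.9 tasks/s
```
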

RebeccaG
VMware Employee

Yes, you can manually run the Exchange workload using LoadGen directly.

I believe "Scheduled Run Length" should be 3h.

I'm not sure where you got the Tasks / User / Day parameter from, but I can confirm the rest of the simulation parameters are correct.

To start the Exchange workload using LoadGen, follow the steps under "Configure LoadGen", which should be on page 100 of the 2.5 Benchmarking Guide. Instead of using vmmark2initializationtemplate.xml, use C:\vclient\mailserver\vmmark2-mailserver0.xml. This is the same file that is used by the VMmark benchmark harness, so you can be sure the manual LoadGen run is identical to a VMmark LoadGen run. You should not have to edit any fields. Then select "Start initialization followed by simulation" to start the run.

Exception logs will be generated in C:\Program Files\Exchange Load Generator. Look in the log files LoadGenInit* and LoadGenSim*.
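
Scanning those log files can be scripted. The following is a minimal Python sketch, not a LoadGen tool: the directory and the LoadGenInit*/LoadGenSim* patterns come from the post above, while the function name, the "exception" keyword, and the default text encoding are assumptions for illustration.

```python
from pathlib import Path


def find_exception_lines(log_dir, patterns=("LoadGenInit*", "LoadGenSim*"),
                         keyword="exception"):
    """Return (file name, line number, text) for each line containing keyword.

    Assumptions: the logs are plain text readable with the platform's
    default encoding, and the keyword match is case-insensitive.
    """
    hits = []
    for pattern in patterns:
        for path in sorted(Path(log_dir).glob(pattern)):
            if not path.is_file():
                continue
            # errors="replace" keeps the scan going past undecodable bytes.
            with path.open(errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if keyword.lower() in line.lower():
                        hits.append((path.name, number, line.rstrip()))
    return hits
```

On the client, it could be called as find_exception_lines(r"C:\Program Files\Exchange Load Generator") and the returned tuples printed one per line.
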

vdiiomark
Enthusiast

Thanks for all the great input.  Last question:

In the "VMmark 2.5 Benchmarking" document, on page 100, under the section entitled "Configure LoadGen", there are 10 steps listed.

Step #10 is "Start the Initialization phase".

So, the question is: if I already performed this step during setup, shouldn't initialization be skipped when running LoadGen independently?  There are several options for running LoadGen: one is to initialize only (as above), another is to initialize and run the workload (as you suggested above), and the last is to run the workload only.  It seems to me that the correct option would be "Run the Workload only", but perhaps I am wrong.

Thanks.

RebeccaG
VMware Employee

Hi, you do need to run the initialization phase if you are running LoadGen independently. This is because you're not completing the 'Mailserver restore' phase, which is normally part of a VMmark run. The Mailserver restore returns the mailserver database to a state in which initialization has just been completed, so running initialization is your way of doing this manually. It's also a good way to catch any exceptions that may be occurring during initialization.
