VMware Performance Community
vs_ang
Contributor

Weathervane exceptions..

Hello VMmark team,

I am trying to benchmark a vSAN cluster, starting with 4 tiles. The run is being marked as non-compliant, and the Weathervane application reports quite a few exceptions. I have two VMware clusters for the SUT and the client systems, and an external SAN storage system with SSDs that provides the shared iSCSI datastores for infrastructure operations. All systems are connected via two 10 Gbps Ethernet switches. The vSAN traffic has two dedicated 10 Gbps ports on the SUT, and another two 10 Gbps ports are dedicated to vMotion and the iSCSI datastores. I do not see any dropped packets, retransmits, etc. on the switches, so I am not sure whether these exceptions are due to the network. Can you please advise what else could be causing these issues? Thanks.

Warnings Messages::

  p0 : WeathervaneAuction0 Exceptions : 5

  p0 : WeathervaneElastic0 Exceptions : 1

  p1 : WeathervaneAuction0 Exceptions : 2

  p1 : WeathervaneElastic0 Exceptions : 231

  p2 : WeathervaneAuction0 Exceptions : 5

  p2 : WeathervaneElastic0 Exceptions : 446

  rampdown : WeathervaneAuction0 Exceptions : 4

  rampdown : WeathervaneElastic0 Exceptions : 377

  p0 : WeathervaneAuction1 Exceptions : 3

  p0 : WeathervaneElastic1 Exceptions : 1

  p1 : WeathervaneAuction1 Exceptions : 1

  p1 : WeathervaneElastic1 Exceptions : 1

  p2 : WeathervaneAuction1 Exceptions : 4

  p2 : WeathervaneElastic1 Exceptions : 2

  rampdown : WeathervaneAuction1 Exceptions : 3

  rampdown : WeathervaneElastic1 Exceptions : 1

  p0 : WeathervaneAuction2 Exceptions : 5

  p0 : WeathervaneElastic2 Exceptions : 149

  p1 : WeathervaneAuction2 Exceptions : 6

  p1 : WeathervaneElastic2 Exceptions : 373

  p2 : WeathervaneAuction2 Exceptions : 5

  p2 : WeathervaneElastic2 Exceptions : 348

  rampdown : WeathervaneAuction2 Exceptions : 1

  rampdown : WeathervaneElastic2 Exceptions : 228

  p0 : WeathervaneAuction3 Exceptions : 4

  p0 : WeathervaneElastic3 Exceptions : 108

  p1 : WeathervaneAuction3 Exceptions : 5

  p1 : WeathervaneElastic3 Exceptions : 184

  p2 : WeathervaneAuction3 Exceptions : 5

  p2 : WeathervaneElastic3 Exceptions : 312

  rampdown : WeathervaneAuction3 Exceptions : 2

  rampdown : WeathervaneElastic3 Exceptions : 218

Summary ::

Run_Is_NOT_Compliant

Turbo_Setting : 0

Number_of_Workloads_Missing : 0

Number_of_Compliance_Issues (identified by '*' or '+') : 6

Issues Found :

    Tile0-weathervaneelastic-p0

    Tile2-weathervaneelastic-p0

    Tile2-weathervaneauction-p1

    Tile2-weathervaneauction-p2

    Tile3-weathervaneauction-p2

    Tile3-weathervaneelastic-p2

Median_Phase : p2

I looked in the Weathervane wrf files, and I see the following sample of exceptions:

19:26:28.029 [pool-3-thread-91] WARN  c.v.w.w.common.core.Operation - Operation:run Execution Failed for GetNextBid for behavior UUID 41ad0ea8-7390-44d3-99f7-96b5ee58113d Failure Reason = com.vmware.weathervane.workloadDriver.common.exceptions.OperationFailedException: Incomplete response received when retrieving current bid for auction 30294

19:26:28.124 [pool-3-thread-91] WARN  c.v.w.w.common.core.Operation - Operation:run restarting userId = 5033, operation = GetNextBid, behavior UUID 41ad0ea8-7390-44d3-99f7-96b5ee58113d Failure Reason = com.vmware.weathervane.workloadDriver.common.exceptions.OperationFailedException: Incomplete response received when retrieving current bid for auction 30294

..

19:28:07.232 [epollEventLoopGroup-3-42] WARN  i.n.channel.DefaultChannelPipeline - An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception.

io.netty.handler.timeout.ReadTimeoutException: null

|        Time|      TP|  Avg RT|     Ops|     Ops|     Ops|Per Operation: Operation:Total/FailedRT(RT-Limit/AvgRT/AvgFailingRT)| Timestamp
|       (sec)| (ops/s)|   (sec)|   Total|  Failed| Fail RT|
|         200| 3733.09|   0.021|   37484|       0|       0|GetNextBid:15313/0(60/0.000/0.000), GetUserProfile:354/0(2/0.014/0.000), AddImageForItem:69/0(5/0.037/0.000), AddItem:523/0(2/0.015/0.000), Login:359/0(3/0.015/0.000), GetPurchaseHistory:339/0(2/0.039/0.000), GetImageForItem:120/0(5/0.023/0.000), GetActiveAuctions:5448/0(2/0.011/0.000), UpdateUserProfile:185/0(2/0.017/0.000), PlaceBid:1153/0(2/0.010/0.000), GetBidHistory:164/0(2/0.014/0.000), JoinAuction:2656/0(3/0.042/0.000), HomePage:344/0(2/0.021/0.000), GetItemDetail:2800/0(2/0.025/0.000), Register:0/0(2/0.000/0.000), NoOperation:0/0(9999999/0.000/0.000), Logout:345/0(3/0.014/0.000), GetCurrentItem:2520/0(3/0.015/0.000), GetAuctionDetail:2786/0(2/0.032/0.000), GetAttendanceHistory:164/0(2/0.012/0.000), LeaveAuction:1842/0(2/0.011/0.000), | Sep 28,2019 19:28:23 EDT

jpschnee
VMware Employee

Historically, these types of exceptions are a result of storage bottlenecks and/or under-provisioned clients. My suggestion would be to do a baseline run with just 1 tile while enabling esxtop collection. See the section "How to Enable and Analyze esxtop Performance Data" in the VMmark User's Guide for details. Afterwards, you can review the details and continue to add tiles until you start seeing non-compliant results.

Beyond this, you can also review the SAR data for Weathervane. This is found within the "workloadfiles" directory of a VMmark results folder (ex: /root/VMmark3/results/Results_20190924155520-1tile-run1/workloadfiles). Unzip the corresponding tile's weathervane-outputN.zip, where N is the tile number, and you'll see a large number of additional Weathervane-specific log files that you can review. Execute "man sar" to see the options for sar and for viewing its output.
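As a starting point for reviewing those logs, here is a minimal Python sketch that tallies failed operations by operation name and exception class. It assumes the driver log lines look like the "Execution Failed for ..." samples quoted earlier in this thread. The function name tally_failures and the embedded sample lines are illustrative only and are not part of Weathervane or VMmark.

```python
import re
from collections import Counter

# Matches the driver's "Execution Failed" warnings and captures the
# operation name and the fully qualified exception class.
FAILED = re.compile(
    r"Execution Failed for (?P<op>\w+) .*?"
    r"Failure Reason = (?P<exc>[\w.$]+)"
)

def tally_failures(lines):
    """Count failures per (operation, short exception class name)."""
    counts = Counter()
    for line in lines:
        match = FAILED.search(line)
        if match:
            short_name = match.group("exc").rsplit(".", 1)[-1]
            counts[(match.group("op"), short_name)] += 1
    return counts

# Two lines copied from the thread; only the first is an initial failure,
# the second is the restart notice for the same operation.
SAMPLE_LINES = [
    "19:26:28.029 [pool-3-thread-91] WARN  c.v.w.w.common.core.Operation - "
    "Operation:run Execution Failed for GetNextBid for behavior UUID "
    "41ad0ea8-7390-44d3-99f7-96b5ee58113d Failure Reason = "
    "com.vmware.weathervane.workloadDriver.common.exceptions."
    "OperationFailedException: Incomplete response received when "
    "retrieving current bid for auction 30294",
    "19:26:28.124 [pool-3-thread-91] WARN  c.v.w.w.common.core.Operation - "
    "Operation:run restarting userId = 5033, operation = GetNextBid, "
    "behavior UUID 41ad0ea8-7390-44d3-99f7-96b5ee58113d Failure Reason = "
    "com.vmware.weathervane.workloadDriver.common.exceptions."
    "OperationFailedException: Incomplete response received when "
    "retrieving current bid for auction 30294",
]

if __name__ == "__main__":
    for (op, exc), n in tally_failures(SAMPLE_LINES).most_common():
        print(f"{op:20s} {exc:30s} {n}")
```

To use it on a real run, pass an open log file from the unzipped weathervane-outputN.zip in place of SAMPLE_LINES. Comparing the per-operation counts with the timestamps of any storage or client-side pressure seen in esxtop or sar may help narrow down the cause.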

-Joshua
vs_ang
Contributor

Hi,

Thank you for the response. I enabled esxtop collection and ran with 2 tiles to reproduce the exceptions. Looking at the logs through the visualizer tool, I didn't see any obvious CPU, memory, or disk issues. I also looked at the Weathervane log files. There are indeed quite a few Java exceptions in the logs; however, it's not clear to me that they are caused by any infrastructure-related issues.

Also, in the score file the issues are marked with either * or +, but how do I use those markers to link back to the exceptions? The compliance issues seem to be some deviation from the expected numbers. How do I go about finding what those deviations are and what causes them?
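One way to connect the markers to the "Issues Found" list is to scan the TILE_n_QoS tables for cells that carry a * or + and rebuild the issue names from the tile, column, and phase. The sketch below is a rough illustration in Python, assuming the score file layout shown in this post. The function name flagged_qos_cells is hypothetical, and the script only locates the marked cells; it does not know the thresholds VMmark applies.

```python
import re

HEADER = re.compile(r"^TILE_(\d+)_QoS:\s*(.*)$")
ROW = re.compile(r"^(p\d+)\s+(.*)$")
# A cell is either "value" or "value | value", each optionally marked.
CELL = re.compile(r"(\d+\.\d+)([*+]?)(?:\s*\|\s*(\d+\.\d+)([*+]?))?")

def flagged_qos_cells(score_text):
    """Return issue names such as 'Tile0-weathervaneelastic-p0'."""
    issues = []
    tile, columns = None, []
    for line in score_text.splitlines():
        line = line.strip()
        if not line:
            continue
        header = HEADER.match(line)
        if header:
            tile = header.group(1)
            columns = [c.rstrip("%").lower() for c in header.group(2).split()]
            continue
        row = ROW.match(line)
        if tile is None or not row:
            tile = None  # left the QoS table
            continue
        phase = row.group(1)
        for column, cell in zip(columns, CELL.finditer(row.group(2))):
            if cell.group(2) or cell.group(4):
                issues.append(f"Tile{tile}-{column}-{phase}")
    return issues

# QoS rows copied from the score file in this post.
SAMPLE = """
TILE_0_QoS:    WeathervaneAuction% WeathervaneElastic%    DVDStoreA    DVDStoreB    DVDStoreC
p0                     0.69 | 0.01        1.25 | 1.20*       685.47       803.00       895.39
p1                     0.61 | 0.01         1.07 | 0.64       603.70       702.33       824.17
p2                     0.51 | 0.00         0.92 | 0.50       583.07       714.58       805.06
TILE_1_QoS:    WeathervaneAuction% WeathervaneElastic%    DVDStoreA    DVDStoreB    DVDStoreC
p0                     0.94 | 0.37        1.77 | 1.76*       757.03       917.33      1047.67
p1                     0.46 | 0.00         0.54 | 0.26       743.37       857.50      1021.33
p2                     0.57 | 0.00        1.01 | 1.01*       755.17       887.25      1006.39
p0_score =   2.49
"""

if __name__ == "__main__":
    for name in flagged_qos_cells(SAMPLE):
        print(name)
```

For this run the marked cells are all in the WeathervaneElastic QoS column, which matches the three entries under "Issues Found". That suggests starting with the WeathervaneElastic driver logs for the flagged tile and phase time window.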

VMmark 3.1 TileScore : v1.2 12202018

Computing_results_for_test_with_tile_count: 0 ...

Turbo Mode Enabled

Tiles = 2 : Enabled Workloads : WV   DS3Web WV_Elastic Standby  (6)

Esxtop Power Mode Enabled

First Sample: 1570204440 Fri Oct  4 11:54:00 2019

Info: 2 : 1570204560 : 2 : 6

Calculating Turbo Timing & Scoring : Run_Is_NOT_Compliant

Run_start 1570204440  : Fri Oct  4 11:54:00 2019

Start_time 1570204560 : Fri Oct  4 11:56:00 2019

End_time 1570206180 : Fri Oct  4 12:23:00 2019

Run_end 1570206300 : Fri Oct  4 12:25:00 2019

Duration_in_minutes : 27.00

Steady_state_start 1570204860 : Fri Oct  4 12:01:00 2019

Steady_state_end 1570205760 : Fri Oct  4 12:16:00 2019

Phase_0_begin 1570204860 : Fri Oct  4 12:01:00 2019

Phase_1_begin 1570205160 : Fri Oct  4 12:06:00 2019

Phase_2_begin 1570205460 : Fri Oct  4 12:11:00 2019

TILE_0_Scores:  WeathervaneAuction  WeathervaneElastic    DVDStoreA    DVDStoreB    DVDStoreC      Standby

p0                         3594.93              570.61      1000.60       748.80       534.00         1.00

p1                         3583.22              575.16      1038.60       599.40       365.40         1.00

p2                         3589.78              574.47      1043.60       771.40       564.80         1.00

TILE_0_Ratios:  WeathervaneAuction  WeathervaneElastic    DVDStoreA    DVDStoreB    DVDStoreC      Standby     Geo.Mean

p0                            1.00                1.00         1.36         1.50         1.54         1.00         1.26

p1                            1.00                1.01         1.41         1.20         1.05         1.00         1.12

p2                            1.00                1.00         1.42         1.54         1.63         1.00         1.29

TILE_0_QoS:    WeathervaneAuction% WeathervaneElastic%    DVDStoreA    DVDStoreB    DVDStoreC

p0                     0.69 | 0.01        1.25 | 1.20*       685.47       803.00       895.39

p1                     0.61 | 0.01         1.07 | 0.64       603.70       702.33       824.17

p2                     0.51 | 0.00         0.92 | 0.50       583.07       714.58       805.06

TILE_1_Scores:  WeathervaneAuction  WeathervaneElastic    DVDStoreA    DVDStoreB    DVDStoreC      Standby

p0                         3590.95              581.78       968.60       721.80       514.60         1.00

p1                         3601.37              571.88       985.20       560.00       345.80         1.00

p2                         3591.02              581.17       974.60       733.40       521.60         1.00

TILE_1_Ratios:  WeathervaneAuction  WeathervaneElastic    DVDStoreA    DVDStoreB    DVDStoreC      Standby     Geo.Mean

p0                         3594.93              570.61      1000.60       748.80       534.00         1.00

p1                         3583.22              575.16      1038.60       599.40       365.40         1.00

p2                         3589.78              574.47      1043.60       771.40       564.80         1.00

TILE_1_QoS:    WeathervaneAuction% WeathervaneElastic%    DVDStoreA    DVDStoreB    DVDStoreC

p0                     0.94 | 0.37        1.77 | 1.76*       757.03       917.33      1047.67

p1                     0.46 | 0.00         0.54 | 0.26       743.37       857.50      1021.33

p2                     0.57 | 0.00        1.01 | 1.01*       755.17       887.25      1006.39

p0_score =   2.49

p1_score =   2.21

p2_score =   2.53

Infrastructure_Operations_Scores:      vMotion     SVMotion     XVMotion       Deploy

Completed_Ops_PerHour                     3.50         2.00         2.00         1.00

Avg_Seconds_To_Complete                   5.11        78.22       111.85       288.65

Failures                                  0.00         0.00         0.00         0.00

Ratio                                     0.13         0.11         0.11         0.12

Number_Of_Threads                            1            1            1            1

EsxtopPower_Results:

p0     Avg_Watts       Target

          284.77           10.0.0.35

          352.80           10.0.0.36

          286.00           10.0.0.37

          278.00           10.0.0.38

p1     Avg_Watts       Target

          283.40           10.0.0.35

          356.33           10.0.0.36

          286.00           10.0.0.37

          278.00           10.0.0.38

p2     Avg_Watts       Target

          287.10           10.0.0.35

          350.37           10.0.0.36

          286.00           10.0.0.37

          278.00           10.0.0.38

Warnings Messages::

  p1 : WeathervaneAuction0 Exceptions : 1

  p2 : WeathervaneAuction0 Exceptions : 1

  rampdown : WeathervaneAuction0 Exceptions : 1

Summary ::

Run_Is_NOT_Compliant

Turbo_Setting : 1

Number_of_Workloads_Missing : 0

Number_of_Compliance_Issues (identified by '*' or '+') : 3

Issues Found :

    Tile0-weathervaneelastic-p0

    Tile1-weathervaneelastic-p0

    Tile1-weathervaneelastic-p2

Median_Phase : p0

Unreviewed_VMmark3_EsxtopPower_Avg_Watts :  1201.57

Unreviewed_VMmark3_Applications_Score    :     2.49

Unreviewed_VMmark3_Infrastructure_Score  :     0.12

Unreviewed_VMmark3_Score                 :     2.02

Unreviewed_VMmark3_Power_Efficiency*     :   1.6781
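For readers trying to follow how the summary numbers relate to the tables above, the short Python check below reproduces several of them from the posted values. The relationships are inferred from this output rather than taken from the scoring tool's source: the Geo.Mean column appears to be a geometric mean over the five application workloads (Standby excluded), the infrastructure score appears to be the geometric mean of the four Ratio values, and the overall score matches an assumed weighting of 80% applications and 20% infrastructure.

```python
import math

def geomean(values):
    """Geometric mean of a non-empty list of positive numbers."""
    return math.prod(values) ** (1.0 / len(values))

# TILE_0_Ratios, Standby column left out (assumption: it is not averaged).
tile0_ratios = {
    "p0": [1.00, 1.00, 1.36, 1.50, 1.54],  # reported Geo.Mean 1.26
    "p1": [1.00, 1.01, 1.41, 1.20, 1.05],  # reported Geo.Mean 1.12
    "p2": [1.00, 1.00, 1.42, 1.54, 1.63],  # reported Geo.Mean 1.29
}
tile0_geomeans = {p: round(geomean(v), 2) for p, v in tile0_ratios.items()}

# Infrastructure Ratio row: vMotion, SVMotion, XVMotion, Deploy.
infra_score = round(geomean([0.13, 0.11, 0.11, 0.12]), 2)  # reported 0.12

# Assumed 80/20 weighting of the applications and infrastructure scores.
overall = round(0.8 * 2.49 + 0.2 * 0.12, 2)  # reported 2.02

# Median phase is p0, so sum the four p0 Avg_Watts entries.
total_watts = round(sum([284.77, 352.80, 286.00, 278.00]), 2)  # reported 1201.57

if __name__ == "__main__":
    print(tile0_geomeans, infra_score, overall, total_watts)
```

Because the application ratios here are at or above 1.00 in every phase, the non-compliance in this run comes from the marked QoS cells, not from throughput.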

thanks,
