VMware Performance Community
ChristianFredri
Contributor

STAF Trust Level incorrect for the following machines: 1 machines [VM reboot may be required] : Client0

After a failed run, exiting and starting a new run results in:
STAF Trust Level incorrect for the following machines: 1 machines [VM reboot may be required] : Client0

No other detail is given. The STAF file looks fine, and I have rebooted the VM twice.

fredab2
VMware Employee

Please tar up the Results directory and attach it to this thread.

Thanks,

Fred

fredab2
VMware Employee

You can also try the following in the meantime and let me know the results.

From the PrimeClient Linux terminal:

1. ping client0

2. ping IP_ADDR_OF_CLIENT0

3. STAF client0 ping ping

4. STAF IP_ADDR_OF_CLIENT0 ping ping

From Client0 Linux terminal:

5. ping PrimeClient

6. ping IP_ADDR_OF_PRIMECLIENT

7. STAF PrimeClient ping ping

8. STAF IP_ADDR_OF_PRIMECLIENT ping ping
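If it helps, the four checks per machine can be scripted. This is only a rough sketch: the function names are made up for illustration, the `-c 1` ping flag is my addition (Linux ping), and it assumes both `ping` and the `STAF` executable return a non-zero exit code on failure.

```python
import subprocess

def build_checks(host, ip):
    """Return the four commands from the list above for one peer."""
    return [
        ["ping", "-c", "1", host],       # -c 1: send a single packet (assumption)
        ["ping", "-c", "1", ip],
        ["STAF", host, "ping", "ping"],  # STAF PING service, expected reply is PONG
        ["STAF", ip, "ping", "ping"],
    ]

def run_checks(host, ip):
    """Run each command and return (command string, succeeded) pairs."""
    results = []
    for cmd in build_checks(host, ip):
        try:
            proc = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
            ok = proc.returncode == 0    # assumes non-zero exit means failure
        except (OSError, subprocess.TimeoutExpired):
            ok = False                   # command missing or no answer in time
        results.append((" ".join(cmd), ok))
    return results
```

Call `run_checks("client0", IP_ADDR_OF_CLIENT0)` on the PrimeClient and `run_checks("PrimeClient", IP_ADDR_OF_PRIMECLIENT)` on Client0. A name that fails while its IP succeeds points at the hosts file.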

Fred

fredab2
VMware Employee

Also, did you check the following from page 83 of the VMmark3 User Manual?

STAF Complains About Trust Level

If STAF complains that systems are at trust level 3, this suggests that the trust level settings in the STAF.cfg file are not being correctly processed. To correct this, make sure each STAF.cfg file contains the following:

       trust machine 192.168.*.* level 5 

(substituting the first three octets of the network to which the machines are connected).
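A quick way to sanity-check the pattern against your machine addresses is sketched below. It uses plain shell-style wildcard matching as a stand-in, which is an assumption on my part; STAF's own matching rules may differ, and the function name is hypothetical.

```python
from fnmatch import fnmatchcase

def uncovered(pattern, ips):
    """Return the addresses that the wildcard pattern does not match."""
    return [ip for ip in ips if not fnmatchcase(ip, pattern)]

# The manual's example pattern does not cover a 10.16.102.x machine.
print(uncovered("192.168.*.*", ["192.168.1.5", "10.16.102.200"]))
```

Any address returned here would not be covered by that trust line, so the pattern would need to be adjusted to the network actually in use.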

STAF is Unable to Connect to a Particular Server

If STAF is unable to connect to a particular server, this might be due to one of the following causes:

  • A communication or configuration problem. From a terminal window on the client, type: staf alias ping ping (where alias is the alias in the client’s hosts file, for example DS3WebA0). The proper response is PONG.
  • Dissimilar versions of STAF are installed. Make sure all clients and all workload virtual machines are running the same version of STAF.
  • Review the /etc/hosts files on both systems for accuracy.
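The hosts file review in the last bullet can be partly automated. Below is a small sketch that looks up an alias in hosts-file text; the function name and the sample addresses are invented for illustration, and it assumes the usual "address name [name ...]" layout with # comments.

```python
def lookup_alias(hosts_text, alias):
    """Return the address mapped to alias in hosts-file text, or None."""
    wanted = alias.lower()                      # host names are case-insensitive
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split on whitespace
        if len(fields) >= 2 and wanted in (name.lower() for name in fields[1:]):
            return fields[0]
    return None

# Sample content only; the addresses are placeholders.
sample = "127.0.0.1 localhost\n10.0.0.2 Client0  # load driver\n"
print(lookup_alias(sample, "Client0"))
print(lookup_alias(sample, "PrimeClient"))
```

Running it with the real /etc/hosts content on each machine, for every alias the other side uses, shows which entries are missing.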

ChristianFredri
Contributor

I noticed that the "PrimeClient" entry was not in the hosts file on the Client0 VM. It has been added.

I have run the tests, and they all worked without issue (previously I had only tested the staf command from the PrimeClient, which had worked).

ChristianFredri
Contributor

Attached.

fredab2
VMware Employee

I believe I found your issue from that zip file. In the VMmark3.properties file you have the following:

Deploy/DeployVMinfo = Client0:10.16.102.200,Standby0:10.16.102.219,DS3WebA0:10.16.102.201,DS3WebB0:10.16.102.202,DS3WebC0:10.16.102.203,DS3DB0:10.16.102.204,AuctionWebA0:10.16.102.213,AuctionWebB0:10.16.102.213,AuctionAppA0:10.16.102.215,AuctionAppB0:10.16.102.216,AuctionDB0:10.16.102.218,AuctionMSQ0:10.16.102.212,AuctionLB0:10.16.102.211,AuctionNoSQL0:10.16.102.217,ElasticWebA0:10.16.102.206,ElasticWebB0:10.16.102.207,ElasticAppA0:10.16.102.208,ElasticAppB0:10.16.102.209,ElasticDB0:10.16.102.210

Deploy/DeployVMinfo holds the names of the VMs that are created during the Deploy infrastructure workload. It is NOT meant for provisioning. I believe the benchmark is trying to recreate / overwrite Client0.

Change the line to this and retry.

Deploy/DeployVMinfo = DeployVM1:10.16.102.245, DeployVM1:10.16.102.246, DeployVM2:10.16.102.247,DeployVM2:10.16.102.248,DeployVM3:10.16.102.249
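To catch this kind of mix-up before a run, the property value can be checked against the names of the VMs you already provisioned. This is a rough sketch with hypothetical function names; the list of provisioned names is something you would supply yourself, and shared IP addresses are flagged only as something worth a second look, not as a known error.

```python
from collections import Counter

def parse_vm_info(value):
    """Split 'Name:IP,Name:IP,...' into a list of (name, ip) pairs."""
    pairs = []
    for item in value.split(","):
        item = item.strip()
        if item:
            name, ip = item.split(":", 1)
            pairs.append((name.strip(), ip.strip()))
    return pairs

def audit(value, provisioned_names):
    """Report names that clash with provisioned VMs and IPs listed twice."""
    pairs = parse_vm_info(value)
    ip_counts = Counter(ip for _, ip in pairs)
    return {
        "collides_with_provisioned": sorted({n for n, _ in pairs} & set(provisioned_names)),
        "repeated_ips": sorted(ip for ip, count in ip_counts.items() if count > 1),
    }
```

For example, passing the start of the original line, `"Client0:10.16.102.200,Standby0:10.16.102.219"`, with `["Client0", "Standby0"]` as the provisioned names reports both names as collisions.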

Fred

fredab2
VMware Employee

The DeployVMinfo corrections resolved the STAF issues described. The latest issue is a failing Weathervane QoS (see attached Score_1*.txt file).

The problem is with QoS for the Weathervane workload. This is usually due to storage response times. This benchmark is mostly run on all-flash storage, and I know you are using your production storage in its production configuration. You can try kicking off the benchmark during off hours, when there is less load on the storage environment, and it might pass. You can also try the following in the meantime.

In VMmark3.properties, make these changes:

TurboMode = true                     (To shorten the run from 3 hours)

InfrastructureList =                   (To run only the workloads and no deploy, vmotions, svmotions, or xvmotion)

Fred

fredab2
VMware Employee

Christian,

The QoS issues are due to latency between the SUTs and storage. Here is a single-tile run I made a little while back on a pair of Haswell (E5-2699) servers connected via HBA fiber to some EMC arrays. Notice how much lower the QoS numbers are than in your runs.

DVDStore QoS numbers are in ms, and your runs are about 400 ms slower than mine on average. Weathervane QoS numbers are a little more confusing.

Tile_-_Qos: Weathervane%

p0         9.84 | 6.38*

On the left is the normalized response time average. On the right is the maximum operation type response time percent failed. Both metrics are percentages (out of 100).

A value of 2 to 5% on the left usually passes compliance.
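For anyone post-processing the score file, a Weathervane QoS cell can be split into its two numbers as sketched below. The names are hypothetical, and treating the trailing asterisk as a "flagged" marker is my reading of the output, not something taken from the manual.

```python
from typing import NamedTuple

class WeathervaneQos(NamedTuple):
    avg_pct: float         # left value: normalized response time average (%)
    max_failed_pct: float  # right value: max operation type percent failed (%)
    flagged: bool          # True when the cell carries a trailing '*'

def parse_cell(cell):
    """Parse a cell such as '9.84 | 6.38*' into its two percentages."""
    left, right = cell.split("|")
    right = right.strip()
    return WeathervaneQos(float(left), float(right.rstrip("*")), right.endswith("*"))

print(parse_cell("9.84 | 6.38*"))
```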

My run:

TILE_0_Scores:  WeathervaneAuction  WeathervaneElastic    DVDStoreA    DVDStoreB    DVDStoreC      Standby
p0                         3595.42              573.70       869.55       618.05       428.82         1.00
p1                         3599.19              571.54       875.35       583.42       412.55         1.00
p2                         3597.09              563.91       866.08       622.42       450.10         1.00 
TILE_0_Ratios:  WeathervaneAuction  WeathervaneElastic    DVDStoreA    DVDStoreB    DVDStoreC      Standby     Geo.Mean
p0                            1.00                1.00         1.18         1.24         1.24         1.00         1.13
p1                            1.00                1.00         1.19         1.17         1.19         1.00         1.11
p2                            1.00                1.00         1.18         1.24         1.30         1.00         1.14 
TILE_0_QoS:    WeathervaneAuction% WeathervaneElastic%    DVDStoreA    DVDStoreB    DVDStoreC
p0                     0.86 | 0.05         0.48 | 0.32      1050.18      1242.44      1399.88
p1                     0.90 | 0.05         0.70 | 0.40      1024.63      1206.84      1365.72
p2                     0.89 | 0.03         0.69 | 0.39      1047.56      1213.83      1364.51

Your run below:

TILE_0_Scores:  WeathervaneAuction  WeathervaneElastic    DVDStoreA    DVDStoreB    DVDStoreC      Standby
p0                         3551.06              577.51       711.48       497.52       365.93         1.00
p1                         3524.22              573.59       690.40       490.95       344.38         1.00
p2                         3583.98              578.26       703.45       501.05       374.20         1.00
TILE_0_Ratios:  WeathervaneAuction  WeathervaneElastic    DVDStoreA    DVDStoreB    DVDStoreC      Standby     Geo.Mean
p0                            0.99                1.01         0.97         0.99         1.06         1.00         1.00
p1                            0.98                1.00         0.94         0.98         0.99         1.00         0.98
p2                            1.00                1.01         0.96         1.00         1.08         1.00         1.01
TILE_0_QoS:    WeathervaneAuction% WeathervaneElastic%    DVDStoreA    DVDStoreB    DVDStoreC
p0                    9.84 | 6.38*        3.62 | 3.33*      1473.89      1650.24      1771.15
p1                   15.67 | 16.10*        5.36 | 4.25*      1555.03      1720.73      1846.81
p2                    4.85 | 0.29        5.98 | 4.60*      1476.06      1615.59      1694.47
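The DVDStore gap between the two runs can be worked out directly from the QoS rows above. The sketch below just averages the per-cell differences; the variable and function names are mine, and the values are copied from the two tables (rows are p0 to p2, columns are DVDStoreA to DVDStoreC, in ms).

```python
from statistics import mean

REFERENCE_RUN = [  # "My run" DVDStore QoS
    [1050.18, 1242.44, 1399.88],
    [1024.63, 1206.84, 1365.72],
    [1047.56, 1213.83, 1364.51],
]
REPORTED_RUN = [   # "Your run" DVDStore QoS
    [1473.89, 1650.24, 1771.15],
    [1555.03, 1720.73, 1846.81],
    [1476.06, 1615.59, 1694.47],
]

def mean_gap(reference, reported):
    """Average of (reported - reference) over every phase and workload."""
    return mean(
        theirs - ours
        for ref_row, rep_row in zip(reference, reported)
        for ours, theirs in zip(ref_row, rep_row)
    )

print(round(mean_gap(REFERENCE_RUN, REPORTED_RUN), 2))
```

This comes out at roughly 432 ms, in line with the "about 400 ms slower" estimate.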
 
The compliance issues (QoS) in your run show that the benchmark ran all aspects (workloads and infrastructure), but the Weathervane response times were slower than the benchmark's minimum requirements for compliance.
So functionally, all aspects of the VMmark3 benchmark ran without failures, but the run was deemed non-compliant because the QoS response times were too high.

You can also lower the following parameters in VMmark3.properties if you wish to use the benchmark in your own environment for comparison purposes (such runs are non-compliant):

Weathervane/AuctionUsers =

Weathervane/ElasticUsers =

Fred