8 Replies Latest reply on Jun 18, 2018 3:23 PM by fredab2

    STAF Trust Level incorrect for the following machines: 1 machines [VM reboot may be required] : Client0

    ChristianFredrickson Lurker

      After failed run, exiting and starting a new run results in:
      STAF Trust Level incorrect for the following machines: 1 machines [VM reboot may be required] : Client0

      No other detail, the STAF file looks fine and I have rebooted the VM twice.

        • 1. Re: STAF Trust Level incorrect for the following machines: 1 machines [VM reboot may be required] : Client0
          fredab2 Novice
          VMware Employees

          Please tar up the Results directory and attach it to this thread.

           

          Thanks,

          Fred

          • 2. Re: STAF Trust Level incorrect for the following machines: 1 machines [VM reboot may be required] : Client0
            fredab2 Novice
            VMware Employees

            You can also try the following in the meantime and let me know results.

             

            From PrimeClient Linux terminal:

            1. ping client0

            2. ping IP_ADDR_OF_CLIENT0

            3. STAF client0 ping ping

            4. STAF IP_ADDR_OF_CLIENT0 ping ping

             

            From Client0 Linux terminal:

            5. ping PrimeClient

            6. ping IP_ADDR_OF_PRIMECLIENT

            7. STAF PrimeClient ping ping

            8. STAF IP_ADDR_OF_PRIMECLIENT ping ping
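The checks above could be scripted roughly as follows. This is only a sketch: the helper name `connectivity_checks` and the `-c 1` ping count are illustrative choices, not something from the VMmark documentation, and the helper only builds the command lines without running them.

```python
def connectivity_checks(peer_alias, peer_ip):
    """Return the four commands to run against one peer (alias first, then IP).

    Hypothetical helper: it only builds command lines. Run each one with
    subprocess.run(cmd) on the machine you are testing from.
    """
    targets = [peer_alias, peer_ip]
    # Plain ICMP ping checks name resolution and basic reachability.
    icmp = [["ping", "-c", "1", target] for target in targets]
    # "STAF <machine> ping ping" asks the remote STAF daemon to reply.
    staf = [["STAF", target, "ping", "ping"] for target in targets]
    return icmp + staf


# Example: the commands to run from the PrimeClient against Client0.
for cmd in connectivity_checks("client0", "IP_ADDR_OF_CLIENT0"):
    print(" ".join(cmd))
```

Swapping in `PrimeClient` and its address gives the matching set to run from Client0.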

             

            Fred

            • 3. Re: STAF Trust Level incorrect for the following machines: 1 machines [VM reboot may be required] : Client0
              fredab2 Novice
              VMware Employees

              Also, did you check the following from page 83 of the VMmark3 User Manual?

               

              STAF Complains About Trust Level

              If STAF complains that systems are at trust level 3, this suggests that the trust level settings in the STAF.cfg file are not being correctly processed. To correct this, make sure each STAF.cfg file contains the following:

                     trust machine 192.168.*.* level 5 

              (substituting the first three octets of the network to which the machines are connected).
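One way to sanity-check a STAF.cfg before a run is sketched below. It is not a STAF tool: `trust_level_for` is a hypothetical helper, it only understands the `trust machine <pattern> level <n>` form quoted above, it approximates STAF's wildcard handling with Python's `fnmatch`, and it assumes the first matching line wins, which may not match STAF's real precedence rules.

```python
import fnmatch
import re

# Matches the form quoted above: "trust machine <pattern> level <n>".
TRUST_LINE = re.compile(
    r"^\s*trust\s+machine\s+(\S+)\s+level\s+(\d+)\s*$", re.IGNORECASE
)


def trust_level_for(cfg_text, ip, default=3):
    """Return the level of the first trust line whose pattern matches ip.

    Assumption: falls back to `default` (3, the level mentioned above)
    when no trust line matches.
    """
    for line in cfg_text.splitlines():
        match = TRUST_LINE.match(line)
        if match and fnmatch.fnmatchcase(ip, match.group(1)):
            return int(match.group(2))
    return default


# Illustrative config text; substitute the contents of your own STAF.cfg.
cfg = "trust machine 10.16.*.* level 5\n"
print(trust_level_for(cfg, "10.16.102.200"))
print(trust_level_for(cfg, "192.168.1.1"))
```

An address that comes back at the default level is one the trust line does not cover.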

              STAF is Unable to Connect to a Particular Server

              If STAF is unable to connect to a particular server, this might be due to one of the following causes:

              • A communication or configuration problem. From a terminal window on the client, type: staf alias ping ping (where alias is the alias in the client’s hosts file, for example DS3WebA0). The proper response is PONG.
              • Dissimilar versions of STAF are installed. Make sure all clients and all workload virtual machines are running the same version of STAF.
              • Review the /etc/hosts files on both systems for accuracy.
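Reviewing the hosts files by eye is easy to get wrong, so a small check like the one below may help. It is a sketch only: `missing_host_aliases` is a hypothetical helper, the sample text stands in for the contents of /etc/hosts, and names are compared case-insensitively.

```python
def missing_host_aliases(hosts_text, required):
    """Return the names in `required` that no hosts-file line defines."""
    defined = set()
    for line in hosts_text.splitlines():
        # Drop trailing comments, then split the entry on whitespace.
        fields = line.split("#", 1)[0].split()
        # The first field is the address; the rest are hostnames and aliases.
        defined.update(name.lower() for name in fields[1:])
    return [name for name in required if name.lower() not in defined]


# Illustrative hosts content; read the real file with open("/etc/hosts").read().
sample = """127.0.0.1 localhost
10.16.102.200 Client0
"""
print(missing_host_aliases(sample, ["PrimeClient", "Client0"]))
```

Running it on both the PrimeClient and each client VM shows which side is missing an entry.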

               

              • 4. Re: STAF Trust Level incorrect for the following machines: 1 machines [VM reboot may be required] : Client0
                ChristianFredrickson Lurker

                I noticed that the "PrimeClient" entry was missing from the hosts file on the Client0 VM. It has been added.

                I have now run all of the tests and they worked without issue. (Previously I had only tested the staf command from the PrimeClient, which had worked.)

                • 6. Re: STAF Trust Level incorrect for the following machines: 1 machines [VM reboot may be required] : Client0
                  fredab2 Novice
                  VMware Employees

                   

                   

                  I believe I found your issue from that zip file. In the VMmark3.properties file you have the following:

                   

                  Deploy/DeployVMinfo = Client0:10.16.102.200,Standby0:10.16.102.219,DS3WebA0:10.16.102.201,DS3WebB0:10.16.102.202,DS3WebC0:10.16.102.203,DS3DB0:10.16.102.204,AuctionWebA0:10.16.102.213,AuctionWebB0:10.16.102.213,AuctionAppA0:10.16.102.215,AuctionAppB0:10.16.102.216,AuctionDB0:10.16.102.218,AuctionMSQ0:10.16.102.212,AuctionLB0:10.16.102.211,AuctionNoSQL0:10.16.102.217,ElasticWebA0:10.16.102.206,ElasticWebB0:10.16.102.207,ElasticAppA0:10.16.102.208,ElasticAppB0:10.16.102.209,ElasticDB0:10.16.102.210

                  Deploy/DeployVMinfo lists the names of the VMs that are created during the Deploy infrastructure workload; it is NOT meant for provisioning. I believe the benchmark is trying to recreate or overwrite Client0.

                   

                  Change the line to this and retry.

                   

                  Deploy/DeployVMinfo = DeployVM1:10.16.102.245, DeployVM1:10.16.102.246, DeployVM2:10.16.102.247,DeployVM2:10.16.102.248,DeployVM3:10.16.102.249
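To catch this kind of collision before a run, a check along these lines could be used. It is a sketch, not part of VMmark: `parse_vm_info` and `colliding_names` are hypothetical helpers, and the list of provisioned VM names has to be supplied by you.

```python
def parse_vm_info(value):
    """Split "Name:IP,Name:IP,..." into a list of (name, ip) pairs."""
    pairs = []
    for item in value.split(","):
        name, ip = item.strip().split(":", 1)
        pairs.append((name.strip(), ip.strip()))
    return pairs


def colliding_names(value, provisioned):
    """Return DeployVMinfo names that match an already provisioned VM."""
    taken = {name.lower() for name in provisioned}
    return [name for name, _ in parse_vm_info(value) if name.lower() in taken]


# Illustrative input: two entries from the original line plus a deploy-only name.
value = "Client0:10.16.102.200,Standby0:10.16.102.219,DeployVM1:10.16.102.245"
print(colliding_names(value, ["Client0", "Standby0", "DS3WebA0"]))
```

An empty result means none of the deploy names clash with the VMs you listed.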

                   

                  Fred

                  • 7. Re: STAF Trust Level incorrect for the following machines: 1 machines [VM reboot may be required] : Client0
                    fredab2 Novice
                    VMware Employees

                    The DeployVMinfo corrections resolved the STAF issues described above. The latest issue is a failing Weathervane QOS (see the attached Score_1*.txt file).

                     

                    The issue is with the QOS of the Weathervane workload. This is usually due to storage response times. This benchmark is mostly run on all-flash storage, and I know you are using your production storage in its production configuration. You can try kicking off the benchmark during off hours, when there is less load on the storage environment, and it might pass. You can also try the following in the meantime.

                     

                    In VMmark3.properties, make these changes:

                     

                    TurboMode = true                     (To shorten the run from 3 hours)

                    InfrastructureList =                   (To run only the workloads and no deploy, vmotions, svmotions, or xvmotion)
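If you would rather script the two edits than make them by hand, something like the sketch below would do it. `set_property` is a hypothetical helper, it assumes simple `key = value` lines, and the starting values in the example are placeholders rather than the real defaults.

```python
import re


def set_property(text, key, value):
    """Return text with the first `key = ...` line replaced, or appended if absent."""
    # [ \t]* instead of \s* so the match never spans more than one line.
    pattern = re.compile(
        r"^[ \t]*" + re.escape(key) + r"[ \t]*=.*$", re.MULTILINE
    )
    new_line = f"{key} = {value}".rstrip()  # an empty value leaves "key ="
    if pattern.search(text):
        return pattern.sub(lambda _match: new_line, text, count=1)
    separator = "" if text == "" or text.endswith("\n") else "\n"
    return text + separator + new_line + "\n"


# Placeholder starting values, for illustration only.
props = "TurboMode = false\nInfrastructureList = placeholder1,placeholder2\n"
props = set_property(props, "TurboMode", "true")
props = set_property(props, "InfrastructureList", "")
print(props)
```

Keep a copy of the original VMmark3.properties so the settings can be restored for a full run.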

                     

                    Fred

                    • 8. Re: STAF Trust Level incorrect for the following machines: 1 machines [VM reboot may be required] : Client0
                      fredab2 Novice
                      VMware Employees

                      Christian,

                       

                      The QOS issues are due to latency between the SUTs and storage. Here is a single-tile run I made a little while back on a pair of Haswell (E5-2699) servers connected via HBA fiber to some EMC arrays. Notice how much lower the QOS numbers are than in your runs.

                       

                      DVDStore QOS numbers are in ms, and your runs are about 400 ms slower than mine on average. The Weathervane QOS numbers are a little more confusing.

                       

                      Tile_-_Qos: Weathervane%

                      p0         9.84 | 6.38*

                      The value on the left is the average normalized response time. The value on the right is the maximum, across operation types, of the percentage of operations that failed on response time. Both metrics are percentages (out of 100).
                      A left-hand value of 2 to 5% usually passes compliance.
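A cell like the one above can be pulled apart with a few lines of Python. This is only a sketch: `parse_weathervane_qos` is a hypothetical helper, the trailing asterisk is simply reported as a flag without interpreting it, and treating 5% as the upper end of the rule of thumb is an assumption.

```python
def parse_weathervane_qos(cell):
    """Parse a cell such as "9.84 | 6.38*" into (avg_pct, failed_pct, starred)."""
    left, right = (part.strip() for part in cell.split("|"))
    starred = right.endswith("*")
    return float(left), float(right.rstrip("*")), starred


avg_pct, failed_pct, starred = parse_weathervane_qos("9.84 | 6.38*")
# Rule of thumb from the text: a left-hand value of 2 to 5% usually passes.
within_usual_range = avg_pct <= 5.0
print(avg_pct, failed_pct, starred, within_usual_range)
```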

                       

                      My run:

                      TILE_0_Scores:  WeathervaneAuction  WeathervaneElastic    DVDStoreA    DVDStoreB    DVDStoreC      Standby
                      p0                         3595.42              573.70       869.55       618.05       428.82         1.00
                      p1                         3599.19              571.54       875.35       583.42       412.55         1.00
                      p2                         3597.09              563.91       866.08       622.42       450.10         1.00 
                      TILE_0_Ratios:  WeathervaneAuction  WeathervaneElastic    DVDStoreA    DVDStoreB    DVDStoreC      Standby     Geo.Mean
                      p0                            1.00                1.00         1.18         1.24         1.24         1.00         1.13
                      p1                            1.00                1.00         1.19         1.17         1.19         1.00         1.11
                      p2                            1.00                1.00         1.18         1.24         1.30         1.00         1.14 
                      TILE_0_QoS:    WeathervaneAuction% WeathervaneElastic%    DVDStoreA    DVDStoreB    DVDStoreC
                      p0                     0.86 | 0.05         0.48 | 0.32      1050.18      1242.44      1399.88
                      p1                     0.90 | 0.05         0.70 | 0.40      1024.63      1206.84      1365.72
                      p2                     0.89 | 0.03         0.69 | 0.39      1047.56      1213.83      1364.51

                      Your run below:

                      TILE_0_Scores:  WeathervaneAuction  WeathervaneElastic    DVDStoreA    DVDStoreB    DVDStoreC      Standby
                      p0                         3551.06              577.51       711.48       497.52       365.93         1.00
                      p1                         3524.22              573.59       690.40       490.95       344.38         1.00
                      p2                         3583.98              578.26       703.45       501.05       374.20         1.00
                      TILE_0_Ratios:  WeathervaneAuction  WeathervaneElastic    DVDStoreA    DVDStoreB    DVDStoreC      Standby     Geo.Mean
                      p0                            0.99                1.01         0.97         0.99         1.06         1.00         1.00
                      p1                            0.98                1.00         0.94         0.98         0.99         1.00         0.98
                      p2                            1.00                1.01         0.96         1.00         1.08         1.00         1.01
                      TILE_0_QoS:    WeathervaneAuction% WeathervaneElastic%    DVDStoreA    DVDStoreB    DVDStoreC
                      p0                   9.84 |  6.38*        3.62 | 3.33*      1473.89      1650.24      1771.15
                      p1                  15.67 | 16.10*        5.36 | 4.25*      1555.03      1720.73      1846.81
                      p2                   4.85 |  0.29         5.98 | 4.60*      1476.06      1615.59      1694.47
                       
                      The compliance (QOS) issues in your run show that the benchmark ran all aspects (workloads and infrastructure), but the Weathervane response times were slower than the benchmark's minimum requirements for compliance.
                      So functionally, all aspects of the VMmark3 benchmark ran without failures, but the run was deemed too slow to be compliant because of QOS.
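For a quick comparison of the DVDStore QOS numbers in the two runs, the values can be averaged as sketched below. The numbers are copied from the TILE_0_QoS rows of both tables (DVDStoreA, B, and C for p0 to p2); the variable names are just for illustration.

```python
from statistics import mean

# DVDStoreA/B/C QoS values in ms for p0, p1, p2, copied from the tables above.
reference_run = [
    1050.18, 1242.44, 1399.88,
    1024.63, 1206.84, 1365.72,
    1047.56, 1213.83, 1364.51,
]
your_run = [
    1473.89, 1650.24, 1771.15,
    1555.03, 1720.73, 1846.81,
    1476.06, 1615.59, 1694.47,
]

# Positive gap means your run responded more slowly on average.
gap_ms = mean(your_run) - mean(reference_run)
print(f"Average DVDStore QoS gap: {gap_ms:.0f} ms")
```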

                      If you wish, you can also lower the following parameters in VMmark3.properties so that you can use the benchmark in your own environment for comparison purposes (such runs are non-compliant):

                      Weathervane/AuctionUsers =

                      Weathervane/ElasticUsers =

                       

                      Fred