5 Replies Latest reply on May 27, 2019 6:35 AM by mathieugont

    VMmark: staf returns RC = 17 (file-system problem)

    mathieugont Novice

      Dear community

       

      I am trying to run the VMmark Weathervane/Auction workload. Although my staf commands such as "staf $host MISC WHOAMI" succeed, the run consistently fails on FS operations with RC = 17.

      In other words, staf says there is an issue with the filesystem.

       

      Any idea?

        • 1. Re: VMmark: staf returns RC = 17 (file-system problem)
          dmorse Enthusiast
          VMware Employees

          Hi mathieugont

           

          In other words, staf says there is an issue with the filesystem.

          Right; this tells me there is most likely an issue with your underlying storage infrastructure.  Did you see my request on the other thread you created?

           

          Namely:

          Can you elaborate on:

          • Server make/model (ESXi hosts)
          • Storage make/model (including the vendor and capacity of your drives, whether they're HDDs [and if so, what speed] or SSDs, and what RAID/LUN configurations they are in). Are you using vSAN?
          • ESXi and vCenter version, including exact build numbers

          Thanks, David

          • 2. Re: VMmark: staf returns RC = 17 (file-system problem)
            mathieugont Novice

            Hi David

             

            Sorry, I should have given more details.

            To answer your questions:

            - Dual socket BDW E5-2699v4 servers

            - 780GB DRAM per server

            - 1x local ATA disk (1.8TB HDD) per server (Western Digital?), partition format GPT

            - ESXi-6.7.0.13006603

            - Each server's HDD is a datastore based on VMFS-6.82

            - No vSAN (I guess)

            - vSphere Client 6.7.0.30000

            - All my clients run on one physical server (ESXi001), except the tile and prime clients, which are hosted on another one (ESXi005).

            • 3. Re: VMmark: staf returns RC = 17 (file-system problem)
              mathieugont Novice

              Some additional info below...

               

              The HDD of my ESXi server was full because I had created additional disks. Could that be the reason for RC = 17?

               

              After deleting these disks, I freed up 600GB of space and reran the test. The stdout/stderr are attached.

              According to the first error (the others are the same), it is now RC = 16:

               

              The process failed to start, RC: 16, STAFResult: STAFConnectionProviderConnect: Client SSL handshake timed out: 22, Endpoint: ssl://AuctionLB0

               

              But I can run:

              >>> time staf AuctionNoSQL0 ping ping

              Response
              --------
              PONG

              real    0m20.076s
              user    0m0.003s
              sys     0m0.004s

              >>> ping -c 5 AuctionNoSQL0
              PING AuctionNoSQL0 (172.22.197.237) 56(84) bytes of data.
              64 bytes from AuctionNoSQL0 (172.22.***.***): icmp_seq=1 ttl=64 time=0.576 ms
              64 bytes from AuctionNoSQL0 (172.22.***.***): icmp_seq=2 ttl=64 time=0.635 ms
              64 bytes from AuctionNoSQL0 (172.22.***.***): icmp_seq=3 ttl=64 time=0.659 ms
              64 bytes from AuctionNoSQL0 (172.22.***.***): icmp_seq=4 ttl=64 time=0.664 ms
              64 bytes from AuctionNoSQL0 (172.22.***.***): icmp_seq=5 ttl=64 time=0.663 ms

              --- AuctionNoSQL0 ping statistics ---
              5 packets transmitted, 5 received, 0% packet loss, time 4001ms
              rtt min/avg/max/mdev = 0.576/0.639/0.664/0.040 ms
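
The staf ping above took about 20 seconds of wall-clock time, while the ICMP round trips took well under a millisecond. A minimal Python sketch for measuring that gap repeatedly is shown here. The helper names `time_command` and `is_slow` and the 1-second threshold are assumptions made for illustration; they are not part of STAF or VMmark.

```python
import subprocess
import time


def time_command(argv):
    """Run argv and return (exit_code, elapsed_seconds)."""
    start = time.perf_counter()
    completed = subprocess.run(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return completed.returncode, time.perf_counter() - start


def is_slow(elapsed_seconds, threshold_seconds=1.0):
    """Flag a call whose wall-clock time exceeds an arbitrary threshold."""
    return elapsed_seconds > threshold_seconds


# Example usage (host name taken from the output above):
#   rc, elapsed = time_command(["staf", "AuctionNoSQL0", "ping", "ping"])
#   print(rc, round(elapsed, 3), is_slow(elapsed))
```

Running the same helper against each workload VM would show whether the delay is specific to one host or common to all of them.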

               

              To be continued...
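
For reference, the return codes seen so far in this thread are general STAF API return codes. A minimal Python sketch of a lookup table follows, assuming the numbering in the STAF User's Guide. Only a subset is listed, and the helper name `describe_rc` is hypothetical. The same descriptions should be available from STAF's own HELP service, so check against your installed version.

```python
# Subset of STAF API return codes relevant to this thread.
# Assumption: names follow the STAF User's Guide; verify against your install.
STAF_RC = {
    0: "Ok",
    16: "No Path To Endpoint",
    17: "File Open Error",
    18: "File Read Error",
    19: "File Write Error",
    20: "File Delete Error",
    21: "STAF Not Running",
    22: "Communication Error",
}


def describe_rc(rc):
    """Return a short human-readable label for a STAF return code."""
    name = STAF_RC.get(rc, "unknown (not in this subset)")
    return f"RC {rc}: {name}"


for code in (17, 16, 22):
    print(describe_rc(code))
```

Note that the trailing 22 in the SSL handshake message above matches the communication error code, while RC 16 itself says the endpoint could not be reached.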

              • 4. Re: VMmark: staf returns RC = 17 (file-system problem)
                dmorse Enthusiast
                VMware Employees

                Hi mathieugont

                 

                Thank you for providing your hardware specs; just to quote them again:

                - 1x local ATA disk (1.8TB HDD) per server (Western Digital?), partition format GPT

                - ESXi-6.7.0.13006603

                - Each server's HDD is a datastore based on VMFS-6.82

                - No vSAN (I guess)

                I see two problems here:

                 

                1. You don't have vSAN, as you guessed, but that is not a requirement.  What is required is shared storage (e.g. some kind of SAN, iSCSI, or NFS).
                Per the VMmark User's Guide:

                System Under Test Storage Requirements

                Configure the system under test with enough shared datastore capacity to hold the disks and paging files for all the virtual machines required for the VMmark Benchmark runs. This is approximately 891GB per tile (not counting the storage space required by the prime and tile clients, addressed in “Storage Requirements for VMmark Clients” on page 35).

                NOTE In order to provide source and target datastores for the storage relocation operations that are part of the benchmark, VMmark requires a minimum of two datastore partitions.

                If using VMware vSAN™ as the primary storage solution, a secondary storage solution will be needed for infrastructure operations.

                Additionally, the benchmark requires that all ESXi hosts used in a test have access to the same shared storage.

                2. The fact that you only have one local (not shared) HDD per ESXi host / server is problematic for several reasons:

                • A single local ATA 1.8TB HDD is probably a 10K RPM drive (not an SSD), so its IOPS (throughput) will be very low (~500?). This is well below the requirements of VMmark.

                Per the User's Guide:

                The VMmark benchmark needs high-throughput, low-latency storage. While the exact bandwidth requirements will vary based on other aspects of the environment, a single VMmark tile can drive about 3500 IOPS (Input/Output Operations Per Second), with additional tiles typically each driving somewhat less. The latency requirements will also vary based on other aspects of the environment; a review of published VMmark results will provide a sense of the storage solutions that work with VMmark. Thus in addition to ensuring that you have enough storage capacity, you should also make sure your storage system will have adequate performance.

                 

                So, I think a lot of the problems you have with the VMmark3 benchmark are due to the lack of a high-performance, shared storage array.  I would recommend looking at the VMmark 3.x Results; those PDFs have a section on their storage hardware, most of which consists of all-flash storage arrays.
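
The sizing figures quoted from the User's Guide can be turned into a quick back-of-the-envelope check. A minimal Python sketch follows, assuming roughly 891GB and 3500 IOPS per tile as quoted above. The guide says additional tiles typically drive somewhat less, so the IOPS figure is an upper estimate for multi-tile runs. The 1800GB and ~500 IOPS values for the local HDD are the rough guesses from this thread, and the function names are hypothetical.

```python
GB_PER_TILE = 891      # approximate, from the User's Guide quote
IOPS_PER_TILE = 3500   # approximate, for the first tile


def required_capacity_gb(tiles):
    """Shared datastore capacity for the tiles (clients not included)."""
    return tiles * GB_PER_TILE


def required_iops(tiles):
    """Upper estimate of IOPS driven by the given number of tiles."""
    return tiles * IOPS_PER_TILE


def storage_check(tiles, capacity_gb, iops):
    """Compare available storage against the rough per-tile figures."""
    needed_iops = required_iops(tiles)
    return {
        "capacity_ok": capacity_gb >= required_capacity_gb(tiles),
        "iops_ok": iops >= needed_iops,
        "iops_shortfall_factor": needed_iops / iops,
    }


# One tile on a single 1.8TB HDD guessed at ~500 IOPS (assumed values).
print(storage_check(tiles=1, capacity_gb=1800, iops=500))
```

With these guesses the disk has enough raw capacity for one tile, but one tile would ask for about 7 times the IOPS the disk can deliver. The check does not cover the separate requirements that the storage be shared across hosts and provide at least two datastores.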

                • 5. Re: VMmark: staf returns RC = 17 (file-system problem)
                  mathieugont Novice

                  Hi David

                   

                  Thanks for your advice.

                  Please see my new post:

                  VMmark/Auction fails with RC=1

                  (I could not reply to the other one... weird).