Enthusiast

esxtop collection is causing `VMmark3-ParsePowerData failed to start/complete`

Hi,

If I enable esxtop collection in VMmark3.properties:

EsxtopCollection = true

EsxtopLUN = /vmfs/volumes/VMmark-iSCSI-03

the VMmark3 job fails at the end with the following error (STAX_Job_1_User.log):

20200630-10:50:41 Info  VMmark3-ParsePowerData                                

20200630-10:50:56 Info  Process: VMmark3-ParsePowerData failed to start/complete. Returned: RC = 3, STAFResult = None

20200630-10:50:56 Info  VMmark3-ParsePowerData Failed to Run Correctly    

and VMmark3-ParsePowerData.txt:

Sample 539: 128235 : LineLen : 128179

Sample 540: 128235 : LineLen : 128179

Sample 541: 128235 : LineLen : 128179

Error

Has anyone seen something like this?

Regards,

Vladimir

9 Replies
VMware Employee

Vladimir,

I haven't seen that issue. I just ran a test case (with Turbo = true) in my environment and had no problems with esxtop collection from either host in my cluster.

Please attach your VMmark3.properties and all the *.log files in your results directory and someone on the VMware VMmark3 team will look into it.

Fred

Enthusiast

Hi Fred,

I've attached all requested files.

Regards,

Vladimir

VMware Employee

It looks like the VMmark3 run completed its full duration and then had problems while collecting files.

Do you end up with *-Esxtop.csv.tgz files in the results directory?

If not, do you see *-Esxtop.csv.tgz files on the SUT hosts in /vmfs/volumes/VMmark-NVMeoF-03?

Can you do another quick one-tile run with:

Turbo = true

Reporter = False

Debug=0

EsxtopCollection = true

Fred

Enthusiast

The *-Esxtop.csv.tgz files are present in the results directory.

It's the VMmark3-ParsePowerData.pl script that fails:

[root@primeclient Results_20200615142015_NVMeoF_20VMs]# /root/VMmark3/tools/VMmark3-ParsePowerData.pl

/root/VMmark3/tools/VMmark3-ParsePowerData.pl:v1.3

ESXTOP Tgz Files:

  esx514.dvb400.service-now.com : Decompressing : Done

  esx515.dvb400.service-now.com : Decompressing : Done

ESXTOP csv files:

  esx514.dvb400.service-now.com : Validating : Sample 2: 110448 : LineLen : 110392

Sample 3: 110448 : LineLen : 110392

Sample 4: 110448 : LineLen : 110392

...

Sample 541: 110448 : LineLen : 110392

Error

From the perl script:

sub Validate {

    my $Head = `head -n 1 $_[0]`;

    @TempA = split',',$Head;

    my $Start = $#TempA+1;

    my $LineNum = 1;

    my $Ret = "good";

...

       @TempA = split',',$line;

       my $LineLen = $#TempA+1;

       if($Start != $LineLen){

            print "Sample $LineNum: $Start : LineLen : $LineLen\n";

            $Ret = "bad";

It looks like the number of fields in the CSV file's header line (110448) is not the same as the number of fields in the subsequent lines of esxtop values (110392).
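For anyone who wants to repeat this check outside the Perl script, here is a minimal Python sketch of the same idea: split every line on commas and compare its field count with the header's. The function name `field_count_mismatches` and the tiny sample data are made up for illustration, and this is not the VMmark3 implementation. One assumed difference: Perl's `split` drops trailing empty fields, while Python's `str.split` keeps them, so counts can differ on lines that end in empty values.

```python
import io

def field_count_mismatches(lines):
    """Return (line_number, header_fields, line_fields) for every data line
    whose comma-separated field count differs from the header line's."""
    it = iter(lines)
    header = next(it, None)
    if header is None:
        return []
    expected = len(header.rstrip("\n").split(","))
    mismatches = []
    # The header is line 1, so the first data line is line 2.
    for line_number, line in enumerate(it, start=2):
        found = len(line.rstrip("\n").split(","))
        if found != expected:
            mismatches.append((line_number, expected, found))
    return mismatches

# Made-up sample: 4 header columns, but the second data row has only 3 values.
sample = io.StringIO('"a","b","c","d"\n1,2,3,4\n1,2,3\n')
print(field_count_mismatches(sample))  # [(3, 4, 3)]
```

On a real file you would pass an open file handle instead of the `io.StringIO` sample, so the lines are read one at a time.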

Regards,

Vladimir

Enthusiast

A similar thing happened with the Turbo run:

...

Sample 3: 128458 : LineLen : 128402

Sample 4: 128458 : LineLen : 128402

Sample 5: 128458 : LineLen : 128402

Sample 6: 128458 : LineLen : 128402

...

The difference in the number of columns is the same as before: 56.

Regards,

Vladimir

Enthusiast

It looks like these 56 extra header columns (14 x 4) belong to vmrdma devices, for example:

"\\esx514.dvb400.service-now.com\RDMA Device(vmrdma0:nmlx5_rdma:Active)\Name"

"\\esx514.dvb400.service-now.com\RDMA Device(vmrdma0:nmlx5_rdma:Active)\Driver"

"\\esx514.dvb400.service-now.com\RDMA Device(vmrdma0:nmlx5_rdma:Active)\State"

"\\esx514.dvb400.service-now.com\RDMA Device(vmrdma0:nmlx5_rdma:Active)\Team Uplink"

"\\esx514.dvb400.service-now.com\RDMA Device(vmrdma0:nmlx5_rdma:Active)\Packets Transmitted/sec"

"\\esx514.dvb400.service-now.com\RDMA Device(vmrdma0:nmlx5_rdma:Active)\Mega Bits Transmitted/sec"

"\\esx514.dvb400.service-now.com\RDMA Device(vmrdma0:nmlx5_rdma:Active)\Packets Received/sec"

"\\esx514.dvb400.service-now.com\RDMA Device(vmrdma0:nmlx5_rdma:Active)\Mega Bits Received/sec"

"\\esx514.dvb400.service-now.com\RDMA Device(vmrdma0:nmlx5_rdma:Active)\% Outbound Packets Dropped"

"\\esx514.dvb400.service-now.com\RDMA Device(vmrdma0:nmlx5_rdma:Active)\% Received Packets Dropped"

"\\esx514.dvb400.service-now.com\RDMA Device(vmrdma0:nmlx5_rdma:Active)\Queue Pairs Allocated"

"\\esx514.dvb400.service-now.com\RDMA Device(vmrdma0:nmlx5_rdma:Active)\Completion Queues Allocated"

"\\esx514.dvb400.service-now.com\RDMA Device(vmrdma0:nmlx5_rdma:Active)\Shared Receive Queues"

"\\esx514.dvb400.service-now.com\RDMA Device(vmrdma0:nmlx5_rdma:Active)\Memory Regions Allocated"
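One way to narrow down which counters account for a gap like this is to group the header columns by object type and look for groups whose size matches the gap. The sketch below assumes header names follow the `\\host\Object(instance)\Counter` pattern seen above. The function name `columns_per_object_type` and the `"Timestamp"` placeholder column are hypothetical.

```python
from collections import Counter

def columns_per_object_type(header_fields):
    """Count header columns per object type, e.g. 'RDMA Device'.

    Assumes each field looks like "\\\\host\\Object(instance)\\Counter";
    fields that do not match (such as a timestamp column) are skipped.
    """
    counts = Counter()
    for field in header_fields:
        name = field.strip().strip('"')
        parts = name.lstrip("\\").split("\\")
        if len(parts) < 3:
            continue  # not in host\object\counter form
        # Drop the "(instance)" suffix to get the object type.
        object_type = parts[1].split("(", 1)[0]
        counts[object_type] += 1
    return counts

# Two header fields from this thread plus a placeholder first column.
fields = [
    '"Timestamp"',  # hypothetical placeholder, not the real esxtop column name
    r'"\\esx514.dvb400.service-now.com\RDMA Device(vmrdma0:nmlx5_rdma:Active)\Name"',
    r'"\\esx514.dvb400.service-now.com\RDMA Device(vmrdma0:nmlx5_rdma:Active)\Driver"',
]
print(dict(columns_per_object_type(fields)))  # {'RDMA Device': 2}
```

A group with exactly 56 columns would be a strong hint, though it would not prove that those are the columns missing from the data rows.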

Regards,

Vladimir

VMware Employee

Vladimir,

Please attach the  *-EsxtopPower.csv files that were collected.

Fred

Enthusiast

Hi Fred,

The *-Esxtop.csv.tgz files are collected.

The *-EsxtopPower.csv files are generated by VMmark3-ParsePowerData.pl (from the *-Esxtop.csv.tgz files).

But VMmark3-ParsePowerData.pl fails because the number of counters in the header differs from the number of values in each row of the *-Esxtop.csv files.

I'm attaching the first two lines from that file, so that you can see what I'm talking about.

[root@primeclient Results_20200615142015_NVMeoF_20VMs]# while read -r line; do echo $line | awk -F, '{print NF}'; done < esxtop-first-two-lines.csv

110448

110392
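Both the awk one-liner and the Perl script split on every comma, so a comma inside a double-quoted field would inflate the count. To rule that out, the same two lines can be counted with a quote-aware CSV parser. This is a minimal sketch using Python's standard `csv` module. The helper name `quoted_field_counts` and the sample text are made up for illustration.

```python
import csv
import io

def quoted_field_counts(text, max_rows=2):
    """Count fields per row with a CSV parser, so commas inside
    double-quoted fields are not treated as separators."""
    counts = []
    for row in csv.reader(io.StringIO(text)):
        counts.append(len(row))
        if len(counts) >= max_rows:
            break
    return counts

# Made-up sample: the first field of the first row contains a comma.
sample = '"a,1","b","c"\n"x","y"\n'
print(quoted_field_counts(sample))  # [3, 2]
```

If the quote-aware counts for the real file still differ by 56, embedded commas are not the cause.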

NMONVisualizer can't open the file, so it's quite useless.

Looks like an esxtop bug.

Regards,

Vladimir

VMware Employee

Vladimir,

Sorry, I meant for you to attach the *-Esxtop.csv.tgz files so we can look at running the parser on this end.

Fred
