As shown in the figure attached below:
Before this happened, I re-provisioned all the tiles (0 to 3),
and also tried rebooting the PrimeClient, the client VMs, and all the workload VMs before executing the VMmark run.
(Each VM's IP is mapped correctly in the PrimeClient's /etc/hosts file, and I could manually 'staf ping ping' the VMs from a terminal.)
But this strange issue still occurred, and it prevents the test from succeeding:
"staxprocessstarterror signal raised. continuing job.", "full status unknown state", "guest info failures for the following machines"
Basically, the logs only report the errors but not the reasons (neither stx_job nor guestinfofiles, etc.).
The test result folder is zipped and attached.
Can anyone help?
From the PrimeClient and Client0, can you ssh into the following 3 VMs: ElasticDB0, ElasticAppA0, and ElasticAppB0? If you can, can you then ssh back to Client0 and the PrimeClient from each of those 3 VMs?
If not, try rebooting those 3 VMs and see if ssh works, and if so, try the run again.
Fred
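The ssh reachability check described above could be scripted roughly as follows. This is only a sketch: it tests whether the ssh port accepts a TCP connection and does not log in. The port number 22 and the 3-second timeout are assumptions, and the hostnames are the ones mentioned in this thread, so adjust them to match your own hosts file. Run it from the PrimeClient and Client0, then from each workload VM with the client names in the list.

```python
import socket

# Hostnames taken from this thread; adjust to match your own /etc/hosts entries.
HOSTS = ["ElasticDB0", "ElasticAppA0", "ElasticAppB0"]


def port_open(host: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers name-resolution failures, refused connections, and timeouts.
        return False


if __name__ == "__main__":
    for host in HOSTS:
        state = "reachable" if port_open(host) else "UNREACHABLE"
        print(f"{host}: ssh port {state}")
```

A host that shows as unreachable here points at name resolution, the network path, or sshd on that VM, which narrows down where to look next.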
Also, can you zip up and attach the provisioning directory from where you said you just provisioned the VMs again?
Fred
fredab2 hi, thanks for the reply.
The related folders are attached.
ssh worked just fine on the PrimeClient, Client0, and the other VMs; they can all ssh to each other with no problems.
I did a 4-tile turbo run and got the same issue, and I couldn't figure out why (screenshot is below):
(ssh to the failed VMs works fine)
I've tried rebooting all the VMs and performing another test run, but it still failed, as shown in the figure below:
(these workload VMs can be staf pinged from the PrimeClient)
Hi there, when you did a 'staf ping' from the prime client to the ElasticAppA0, ElasticAppB0, and ElasticDB0, did you use the hostname that is present in the prime client's hosts file?
And when you ssh into ElasticAppA0, ElasticAppB0, and ElasticDB0, does 'staf primeclient ping ping' return 'pong'?
Have you altered any of the VMs' hosts files since provisioning?
Try looking in the troubleshooting section of the VMmark User's Guide, "STAF and STAX Issues". Since the staf ping is not working, the issue is either going to be network related, or that STAF is not running correctly on the Elastic VMs. I think you've already understood this by trying to reboot the VMs in question, but try taking a look at that section if you haven't already.
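The 'staf ping' check can be repeated across all three VMs with a small script like the one below. It is a rough sketch, not part of VMmark: it assumes the `staf` command is on the PATH of the machine you run it from, and the helper names `run_staf_ping` and `failed_hosts` are made up for illustration. It runs `staf <host> ping ping` for each hostname and lists the ones whose output does not contain PONG.

```python
import subprocess
from typing import Callable, List

# Hostnames taken from this thread; use the names from your own hosts file.
HOSTS = ["ElasticAppA0", "ElasticAppB0", "ElasticDB0"]


def run_staf_ping(host: str, timeout: float = 10.0) -> str:
    """Run `staf <host> ping ping` and return its output ('' if it cannot run)."""
    try:
        proc = subprocess.run(
            ["staf", host, "ping", "ping"],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        # staf is not installed / not on PATH, or the request hung.
        return ""
    return proc.stdout + proc.stderr


def failed_hosts(hosts: List[str],
                 runner: Callable[[str], str] = run_staf_ping) -> List[str]:
    """Return the hosts whose ping output does not contain PONG."""
    return [h for h in hosts if "PONG" not in runner(h).upper()]


if __name__ == "__main__":
    bad = failed_hosts(HOSTS)
    if bad:
        print("No PONG from:", ", ".join(bad))
    else:
        print("All hosts answered PONG")
```

Running it several times in a row may be worthwhile, since a single successful ping does not rule out an intermittent network problem.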
Hi RebeccaG,
thanks for the prompt reply!
After a deeper check of the physical network, I discovered that there were physical issues on my switch's ports,
which repeatedly caused the following errors: "get guest info failures...", "could not staf ping the following X machines..." and "staxprocessstarterror signal raised. continuing job.", etc.
The test ran just fine after I switched to another, working switch.
I'm glad to hear you were able to resolve the issue. Thanks for following up here about the solution.