VMware Cloud Community
HassanAlKak88
Expert

Failed to start HACore profile on node

Hello,

Deploying a vCenter HA cluster (vCenter 6.5.0, build 5318154) fails with the following error:

"A general system error occurred: Failed to start HACore profile on node 192.168.18.22" (192.168.18.22 is the witness node's IP address, entered in the configuration wizard)

pastedImage_1.png

We tried the following steps with no luck:

  1. Power off and delete the Passive and Witness nodes.
  2. Log in to the Active node via SSH or the Direct Console.
  3. Log in as the root user and enable the Bash shell: # shell
  4. Run the following command to remove the vCenter HA configuration: # destroy-vcha -f
  5. Reboot the Active node: # reboot
  6. Wait until the Active node is back online, then start the vCenter HA cluster configuration again.

The services on our vCenter show the following status:

pastedImage_2.png

The vCenter appliance can also ping the above IP address:

pastedImage_3.png

Please advise.


If my reply was helpful, I kindly ask you to like it and mark it as a solution

Regards,
Hassan Alkak

Accepted Solutions
SupreetK
Commander

I see the following errors in vpxd.log:

2018-09-14T17:32:22.276+03:00 error vpxd[7F73A580C700] [Originator@6876 sub=vpxUtil opID=FlowBasedWizard-apply-911-ngc-80] /usr/bin/python failed with error [1] and output [2018-09-14T14:32:22.095Z   Failed to start statsmonitor services. Error: Operation timed out

2018-09-14T17:32:22.276+03:00 info vpxd[7F73A580C700] [Originator@6876 sub=Default opID=FlowBasedWizard-apply-911-ngc-80] [VpxLRO] -- ERROR task-65223 -- FailoverClusterConfigurator -- vim.vcha.FailoverClusterConfigurator.deploy: vmodl.fault.SystemError:
--> Result:
--> (vmodl.fault.SystemError) {
-->    faultCause = (vmodl.MethodFault) null,
-->    faultMessage = <unset>,
-->    reason = "Failed to start HACore profile on node 192.168.18.22"
-->    msg = ""
--> }
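
If you want to check your own log for the same two failures, something like the following may help. It is only a sketch: the default path /var/log/vmware/vpxd/vpxd.log is assumed to be where vpxd.log lives on your appliance, and the function name find_vcha_errors is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print log lines that mention either of the two
# failures quoted in this thread.
# The default path is an assumption; pass another file as the first
# argument if your copy of vpxd.log is somewhere else.
find_vcha_errors() {
    local log_file="${1:-/var/log/vmware/vpxd/vpxd.log}"
    grep -E 'Failed to start (statsmonitor services|HACore profile)' "$log_file"
}
```

Call it as `find_vcha_errors` on the appliance, or as `find_vcha_errors ./vpxd.log` against a downloaded copy.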

Before enabling vCHA, perform the following steps on the active vCSA node:

  • Take a snapshot of the vCSA VM.
  • Log in to the vCSA with root credentials via SSH.
  • Modify the statsmonitor service configuration for vMon:

               sed -i '/StartTimeout/d' /etc/vmware/vmware-vmon/svcCfgfiles/statsmonitor.json

               sed -i '/ApiHealthFile/a "StartTimeout": 600,' /etc/vmware/vmware-vmon/svcCfgfiles/statsmonitor.json

  • Reload the vMon service configuration by sending SIGHUP:

               kill -HUP $(cat /var/run/vmon.pid)

  • Stop and start the statsmonitor service:

               /usr/lib/vmware-vmon/vmon-cli -k statsmonitor

               /usr/lib/vmware-vmon/vmon-cli -i statsmonitor

  • Enable vCHA.
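
To see what the two sed commands in the steps above do before touching the appliance, they can be tried on a scratch file first. The sketch below is illustrative only. The real contents of statsmonitor.json are not shown in this thread, so the sample file, its values, and the key ExampleKey are invented, and set_start_timeout is a hypothetical wrapper name. It assumes GNU sed (for -i and the one-line form of the a command).

```shell
#!/bin/bash
# Hypothetical wrapper around the two sed commands from the steps above.
# $1 = path to a service config file, $2 = timeout value (600 in the thread).
set_start_timeout() {
    local cfg="$1" timeout="$2"
    cp -p "$cfg" "$cfg.bak"            # keep a copy to roll back to
    sed -i '/StartTimeout/d' "$cfg"    # drop any existing StartTimeout line
    # append the new StartTimeout line right after the ApiHealthFile line
    sed -i "/ApiHealthFile/a \"StartTimeout\": ${timeout}," "$cfg"
}

# Scratch file with made-up content (NOT the real statsmonitor.json).
sample=$(mktemp)
cat > "$sample" <<'EOF'
{
    "ApiHealthFile": "/example/health.xml",
    "StartTimeout": 120,
    "ExampleKey": "example"
}
EOF

set_start_timeout "$sample" 600
cat "$sample"
```

The appended line ends with a comma, so the result is valid JSON only if another key follows the ApiHealthFile line. It is worth looking at the edited file before reloading vMon.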

Cheers,

Supreet


8 Replies
SupreetK
Commander

Did you check the comments on the thread "A general system error occurred: Failed to start HACore profile on node"? If that does not help, share the time stamp of the failure along with vpxd.log from the master node and vmon-syslog.log from the witness node.

Cheers,

Supreet

HassanAlKak88
Expert

The time stamp is shown in the following screenshot:

pastedImage_0.png

SupreetK, can you advise how I can share the log file, please?



Regards,
Hassan Alkak
SupreetK
Commander

You can attach the log files to your reply or share them via a Dropbox link. What is the time zone for the above time stamp? Also, did you check the suggested resolution from the thread "A general system error occurred: Failed to start HACore profile on node"?

Cheers,

Supreet

HassanAlKak88
Expert

Hello dear,

Our time zone is GMT+3. Please find vpxd.log at the following URL: https://iccleb-my.sharepoint.com/:u:/g/personal/hassan_alkak_cci-me_com_lb/EZL9vYoM5LpKjFVRMQ46IcQBP... 



Regards,
Hassan Alkak
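
A side note on the time zone: the vpxd.log entries quoted in this thread carry a local +03:00 offset, while the embedded script output is stamped in UTC (Z). When matching entries across log files, it can help to convert to UTC first. This is a minimal sketch assuming GNU date (the -d option is not available in every date implementation); to_utc is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: convert a time stamp with a UTC offset to UTC.
# Assumes GNU date, which accepts ISO 8601 input via -d.
to_utc() {
    date -u -d "$1" '+%Y-%m-%dT%H:%M:%SZ'
}

# Time stamp of the failure from vpxd.log, fractional seconds dropped.
to_utc '2018-09-14T17:32:22+03:00'
```

For the failure above, 17:32:22 at +03:00 corresponds to 14:32:22 UTC, which lines up with the 14:32:22.095Z stamp inside the same log line.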
HassanAlKak88
Expert

Thanks,

Solved after applying your steps. We had found them in the other thread, but we were waiting for approval from the person responsible.

Thanks again,



Regards,
Hassan Alkak
AlanaBarboza
Contributor

Hey guys!

I'm kind of a newbie in VMware administration and I got this error while enabling vCHA. My peer and witness nodes were created just fine, but at the end of the vCHA deployment it showed the error "Failed to start HACore profile on node". Also, in the vCenter HA section, the "Edit" and "Initiate Failover" buttons were visible but not enabled.

I was not feeling brave enough to make this kind of change, so I tried rebooting my active vCenter node (I had a backup in case it didn't come back), and it worked just fine!

I was able to activate vCHA with no errors.

Hope my experience helps someone.

ziansong
Contributor

Why should I follow your steps or report errors? My vCenter version is 6.7.0, build 13007421.
