VMware Cloud Community
Eldron
Enthusiast

HA Failure on one host "No Active Primaries Found"

Can anyone tell me what that means, or where I can look for an answer? I put this server into Maintenance Mode, and I get this error when I exit Maintenance Mode and try to reactivate HA on the host.

I have verified that all servers, and VC, can see each other by short name and FQDN.
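For anyone who wants to script that check rather than test each name by hand, here is a minimal Python sketch. It assumes plain forward lookups through the system resolver are what matter. The function names and the host names in the usage note are made up for illustration and do not come from this thread.

```python
import socket


def resolve_all(names, resolver=socket.gethostbyname):
    """Map each name to its resolved IPv4 address, or None if the lookup fails."""
    results = {}
    for name in names:
        try:
            results[name] = resolver(name)
        except OSError:  # socket.gaierror and socket.herror are OSError subclasses
            results[name] = None
    return results


def inconsistent_pairs(pairs, resolver=socket.gethostbyname):
    """Return the (short, fqdn) pairs that fail to resolve or resolve differently."""
    bad = []
    for short, fqdn in pairs:
        found = resolve_all([short, fqdn], resolver)
        if found[short] is None or found[short] != found[fqdn]:
            bad.append((short, fqdn))
    return bad
```

Run on each host and on the VC server, a call such as `inconsistent_pairs([("esx01", "esx01.example.local")])` (hypothetical names) returns an empty list when both forms resolve to the same address. The `resolver` parameter exists only so the logic can be exercised without a live DNS server.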

14 Replies
waynegrow
Expert

Have you tried to disable and re-enable HA for the Cluster?

Also, see this post; it might explain things better. Was it the primary you placed into Maintenance Mode?

http://www.vmware.com/community/thread.jspa?messageID=664404&#664404

Message was edited by: waynegrow

Eldron
Enthusiast

I did disable, and then re-enable, HA at the cluster level. It did not fix my problem, unfortunately.

The host was the most recent one added to the cluster, so I am fairly certain that it is not the primary.

Edit:

New message on failure: Error [10001]: Instance Already Exists

Message was edited by: Eldron

Eldron
Enthusiast

Ever have one of those moments where you just want to slap yourself?

I was playing with disk volumes and had an extra SAN LUN presented to my server. I think that was why it was failing: the other hosts could not see this LUN, so my storage was not visible to all hosts. As soon as I unpresented the LUN, HA started working again.

sbartle
Contributor

I got this message after a vswif problem (a subnet misconfiguration). I logged on to the console, dropped and recreated both vswifs (redundant vswif), ran Reconfigure for HA, and all came good.

syousef
Contributor

I'm running into the same issue, but it's not a vswif issue. I'm just gonna try to rebuild the server.

opbz
Hot Shot

When you create an HA cluster, the first server you insert is the primary. If anything happens to this server, HA gets confused.

The fix is a bit messy.

1: Remove all hosts from the HA cluster

2: Remove clustering (has to be done to clear out old entries in the VC database)

3: Delete the HA cluster (again, this has to be done to clear out older entries in the database)

4: Recreate the HA cluster and finally add the hosts

This should resolve the problem you have.

jjamieson
Enthusiast

When you say "Remove Clustering", do you mean disable HA or something? Or move the hosts out of the cluster?

I'm going to just try a few things, but what a PITA. The ESX server that was apparently "primary" (which doesn't make sense, because it was the second machine added) is offline and can't be taken online for now, and the 3rd (and new) box I've installed specifically for redundancy via HA won't enable the HA agent because of this issue.

It's 2008, let's try to build software that doesn't get "confused" eh?

opbz
Hot Shot

When you are setting up HA and DRS, the first server you add becomes the primary. I have no idea why the second server became primary. I do not mean remove the cluster; I mean remove the servers from it and re-add them.

You are basically trying to clear out stale entries from the database, hence the remove and re-add bit.

jqualls
Contributor

This fixed it for me. Kudos! For once, the forums came through! :)

NathalieMarshal
Contributor

I'm now getting this on one of the 2 ESX servers in our test cluster, ever since I ran Update Manager on it this morning to pick up the latest updates. I can't migrate anything onto the server either, so I think I'm going to have to rebuild it. Did anyone else have any problems with the latest set of ESX updates?

NorbK
Enthusiast

I had this exact same error and found that if I simply turned off HA on the cluster and then re-enabled it, all machines were able to reconfigure HA successfully. That gave me a much better feeling than disconnecting and reconnecting ESX hosts. If you think about it, it makes sense that this works: all machines are out of HA, then HA is enabled and it has to pick one as the primary to get HA started in the cluster.

fishbat
Contributor

This worked for me also - Thanks.

Ytsejamer1
Enthusiast

I had this problem as well. I checked and tried everything mentioned here, but found that the problem with my "rogue" node was actually the hosts file in /etc: the IP for one of the other nodes was incorrect. I made sure the entries were all correct and matched up on each node in the cluster. That seemed to take care of it for me.

Thanks for all your posts...all helped point me in the direction to narrow things down.
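If you have more than a couple of nodes, comparing the hosts files by eye gets tedious. Here is a rough Python sketch of the comparison, assuming you have already copied the text of each node's /etc/hosts somewhere you can read it. The function names are my own, and the sketch only flags names that map to different IPs on different nodes; it does not flag names that are missing from a node.

```python
def parse_hosts(text):
    """Parse /etc/hosts-style text into a {name: ip} dict (first entry wins)."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        for name in names:
            mapping.setdefault(name.lower(), ip)
    return mapping


def find_mismatches(hosts_by_node):
    """Given {node: hosts_text}, return {name: {node: ip}} where nodes disagree."""
    parsed = {node: parse_hosts(text) for node, text in hosts_by_node.items()}
    all_names = set()
    for mapping in parsed.values():
        all_names.update(mapping)
    mismatches = {}
    for name in sorted(all_names):
        seen = {node: m[name] for node, m in parsed.items() if name in m}
        if len(set(seen.values())) > 1:  # more than one distinct IP for this name
            mismatches[name] = seen
    return mismatches
```

An empty result from `find_mismatches` means every name that appears on more than one node points at the same address everywhere, which is the state that fixed HA in my case.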

MSpencerINX
Contributor

Well it looks like it's almost a year later.. However, I just ran into the same problem and found this post.

We happen to be talking about this now.. Here's what I have found for this error.

From reading above..

With respect to primaries: the first 5 hosts are all primaries when configured in a cluster (ESX 3.5, VCenter 2.5).

These start with the first host added to the cluster.

If you have 6 hosts and one fails, one of the remaining hosts will promote the remaining (6th) host to a primary member.

To view the current primaries for your systems:

cat /var/log/vmware/aam/aam_config_util_listprimaries.log

What resolved the "No Active Primaries Found" problem for me was the following.

Open an SSH session to the host having the problem.

Type "hostname" and press Return to view the current host name. (Mine was incorrect: it did not match the DNS record.)

If the host name needs to change, you can change it in the VI Client under host Configuration, DNS Settings, Host Name.

The other way is to open SSH and run:

nano /etc/hosts (edit/verify the hostname FQDN and short name)
nano /etc/sysconfig/network (edit/verify the name)
hostname newhostname

Again I had to reboot to fully resolve this so HA would enable.
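To double-check the files before rebooting, something like the following Python sketch may help. It only compares the running host name against the two local files named above. It does not query DNS, and it assumes the Red Hat style HOSTNAME= line in /etc/sysconfig/network. The function names are invented for this example.

```python
def sysconfig_hostname(network_text):
    """Return the HOSTNAME= value from /etc/sysconfig/network-style text, or None."""
    for line in network_text.splitlines():
        line = line.strip()
        if line.startswith("HOSTNAME="):
            return line.split("=", 1)[1].strip().strip('"')
    return None


def names_in_hosts(hosts_text):
    """Return the set of all names listed in /etc/hosts-style text, lowercased."""
    names = set()
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # ignore comments
        names.update(name.lower() for name in fields[1:])  # skip the IP column
    return names


def hostname_problems(current, network_text, hosts_text):
    """List readable mismatches between the running name and the two files."""
    problems = []
    configured = sysconfig_hostname(network_text)
    if configured is None:
        problems.append("no HOSTNAME= line in /etc/sysconfig/network")
    elif configured.lower() != current.lower():
        problems.append(
            "hostname is %r but /etc/sysconfig/network says %r" % (current, configured)
        )
    if current.lower() not in names_in_hosts(hosts_text):
        problems.append("%r is not listed in /etc/hosts" % current)
    return problems
```

On the host itself you would pass in the output of the hostname command plus the contents of the two files. An empty list means the three agree, though the name can still be wrong in DNS, which was my original problem.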

Hopefully this helps others..

Regards,

Mark
