VMware Cloud Community
chayolle
Contributor

Going crazy with HA! Please help...

Hi,

First of all, I'm quite new to the VM world, and I'm pretty sure I did something stupid that you guys will easily spot (hopefully)!

Here is my setup:

2 Dell PE R610 Servers both running ESXi 4 with exactly the same hardware configuration.

1 Dell PE R610, with less memory and storage, running ESXi 4, which I will use later for vCenter (vCenter is currently on my laptop).

1 Dell MD3000i SAN configured correctly, and actually working.

I created a cluster and added the 2 hosts, namely srv1 and srv2 (with IP addresses 192.168.126.202 and 192.168.126.203 respectively). Up to here everything went fine and was quite easy to do!

I would now like to enable HA on this cluster. I enabled it in the cluster settings and the following error occurred: "cannot complete the configuration of the ha agent on the host"

Here are screenshots of the errors and also my network and other configs...

[Attached screenshot: 1.JPG]

I really hope you can help me, as I have a deadline to get everything working by this Friday :(

Thanks in advance,

LF

VSP | VTSP

Accepted Solutions
pramodupadhyay5
Enthusiast

Add the FQDN of each server to /etc/sysconfig/network:

HOSTNAME=<FQDN of the server>

Make this entry on both servers, and make sure the FQDN is the same in /etc/hosts and /etc/sysconfig/network, otherwise you will not get any display.
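The consistency check described above can be sketched in Python. This is only an illustration: the function names and the sample file contents are hypothetical, and it assumes the HOSTNAME value has to appear literally on some line of the hosts file.

```python
def configured_hostname(network_text):
    """Return the value of the HOSTNAME= line, or None if there is none."""
    for line in network_text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key.strip() == "HOSTNAME":
            return value.strip().strip('"')
    return None


def fqdn_matches(network_text, hosts_text):
    """True if the configured HOSTNAME is listed somewhere in the hosts text."""
    hostname = configured_hostname(network_text)
    if hostname is None:
        return False
    for raw in hosts_text.splitlines():
        fields = raw.split("#", 1)[0].split()  # drop comments, split on whitespace
        # fields[0] is the IP; everything after it is a name for that IP.
        if hostname in fields[1:]:
            return True
    return False


# Hypothetical file contents, for illustration only.
network_file = "NETWORKING=yes\nHOSTNAME=esx1.example.com\n"
hosts_file = "192.0.2.11 esx1.example.com esx1\n"

print(fqdn_matches(network_file, hosts_file))
```

In practice the two strings would be read from /etc/sysconfig/network and /etc/hosts on each server.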

If you found this or any other answer useful please consider the use of the Helpful or Correct buttons to award points

37 Replies
krowczynski
Virtuoso

Have you tried right-clicking your host and selecting "Reconfigure for VMware HA"?

Or try this one!

Hi,

We can remove the vpx agent manually and reinstall it. I guess this would also reinstall the aam (HA) agent.

  • rpm -qa | grep vpx (lists the installed version of the vpxa agent)

  • rpm -qa | grep aam (lists the installed version of the aam agent)

  • rpm -e <vpxa agent version name> (removes the vpxa agent)

Copy the latest vpx agent from the VC install location (VMware\Infrastructure\VirtualCenter Server\upgrade) to the ESX server, then run it on the ESX server:

  • sh <copied file name>

Hope this helps!

MCP, VCP3 , VCP4
krowczynski
Virtuoso

Also, for HA to be configured properly, your DNS must be set up correctly.

I only see IP addresses in your screenshot, and that is not good!

MCP, VCP3 , VCP4
chayolle
Contributor

Hi,

Thanks for your quick answer. I have already tried "Reconfigure for HA" what feels like a hundred times! Still no luck...

Can you explain to me how to configure DNS?

VSP | VTSP
krowczynski
Virtuoso

You must have clean DNS resolution.

Example:

Name: Server01.localdomain.com

IP: 10.10.10.100

So if you run nslookup on the IP and on the name, the results must agree: the name must resolve to the IP, and the IP must resolve back to the name.

And when you add a host to vCenter, always use the name, not the IP. It works better!
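The forward/reverse check described above could be sketched in Python as follows. This is a rough illustration rather than a VMware tool: `check_round_trip` is a hypothetical helper, and its `forward` and `reverse` parameters are only there so the lookups can be replaced with stand-ins.

```python
import socket


def check_round_trip(name, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Resolve name -> IP, then IP -> name, and report whether they agree."""
    ip = forward(name)             # forward lookup: name to IPv4 address
    reverse_name = reverse(ip)[0]  # reverse lookup: first element is the primary name
    # Compare case-insensitively and ignore a trailing dot on either side.
    matches = reverse_name.lower().rstrip(".") == name.lower().rstrip(".")
    return ip, reverse_name, matches
```

Calling `check_round_trip("Server01.localdomain.com")` uses the system resolver and raises an `OSError` subclass if either lookup fails, which is itself a sign that name resolution is not set up.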

MCP, VCP3 , VCP4
chayolle
Contributor

Hi,

I tried to remove the host and add it again with the host name instead of the IP address, and it gives me an error: "cannot contact the specified host.................... etc". I tried srv2, srv2.localdomain, srv2.localdomain.com... Does this mean that my DNS is not configured correctly?

Should I configure it directly on the host (where you can set the password, configure the IP address, etc.)?

Thx

VSP | VTSP
dnetz
Hot Shot

I'd second DNS/name resolution as the cause. ESX and vCenter absolutely need good name resolution, or almost all operations will slow down or fail. Even with partially working name resolution you can run into a lot of odd behavior. So set up a proper DNS server (with a slave), point your ESX and vCenter machines at it, and give all machines proper hostnames. At the very least, the ESX hosts must be able to resolve each other's short names for HA to work.

See this knowledge base article for troubleshooting name resolution: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=100373...

Hope it helps!

krowczynski
Virtuoso

It seems so. What result do you get from nslookup?

MCP, VCP3 , VCP4
chayolle
Contributor

DNS Request timeout... :S

In my host configuration I go into "Configure Management Network", then "DNS Configuration", and I have the following:

For srv2:

Primary DNS Server :

Alternate DNS Server :

Hostname: srv2.localdomain

For srv3:

Primary DNS Server :

Alternate DNS Server :

Hostname: srv3.localdomain

Obviously I need to fill in those DNS server entries, but I don't have any DNS servers available! How do I create a DNS server with this configuration? Really lost there :S

Just to clarify things, I am on ESXi 4 and not ESX. Could this explain my problems?

VSP | VTSP
krowczynski
Virtuoso

Check this out!

http://communities.vmware.com/thread/202379

MCP, VCP3 , VCP4
prakashraj
Expert

Hi,

Check out the link already given by "dnetz" and work through those instructions.

If you found this information useful, please consider awarding points for "Correct" or "Helpful". Thanks!!!

Prakash

chayolle
Contributor

I'll try to modify the hosts file, but I don't know any Linux commands :(

Will keep you updated... many thanks for your help, guys.

VSP | VTSP
pramodupadhyay5
Enthusiast

To resolve this issue, follow these steps:

1. You must enter the FQDN of every ESXi server in the hosts file (etc/hosts) on the vCenter server.

2. You must enter the FQDN of every ESXi server in /etc/hosts on every ESXi server, and you must add the DNS entry to /etc/resolv.conf.

This might help you.

chayolle
Contributor

I ran vi /etc/hosts on srv2 and it shows the following:

192.168.128.202 srv2.localdomain localhost

I added the following:

192.168.128.203 srv3.localdomain

Now I can ping srv3.localdomain, but not srv3 :(
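A likely reason, sketched in Python: a hosts file is normally matched on the names literally written on each line, so a line that lists only the FQDN does not cover the short name. The helper names below are hypothetical, and the "fixed" text is an assumed correction, not something confirmed in this thread.

```python
def names_by_ip(hosts_text):
    """Map each IP in hosts-file text to the list of names on its line."""
    table = {}
    for raw in hosts_text.splitlines():
        fields = raw.split("#", 1)[0].split()  # drop comments, split on whitespace
        if len(fields) >= 2:
            table.setdefault(fields[0], []).extend(fields[1:])
    return table


def can_resolve(hosts_text, name):
    """True if `name` appears literally as a canonical name or alias."""
    return any(name in names for names in names_by_ip(hosts_text).values())


# The two lines quoted in the post above.
current = """192.168.128.202 srv2.localdomain localhost
192.168.128.203 srv3.localdomain
"""
# Same text with the short name added as an alias (the assumed fix).
fixed = current.replace("srv3.localdomain", "srv3.localdomain srv3")

print(can_resolve(current, "srv3.localdomain"))  # True
print(can_resolve(current, "srv3"))              # False
print(can_resolve(fixed, "srv3"))                # True
```

Under that assumption, adding the short name after the FQDN on each line should make both forms resolvable.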

VSP | VTSP
admin
Immortal

This is probably the correct KB to fix this problem. It is part of a larger troubleshooting KB for HA, 1003691 "Diagnosing a VMware High Availability cluster configuration failure", which covers just about all possible causes.

Rick Blythe

Social Media Specialist

VMware Inc.

pramodupadhyay5
Enthusiast

1. You have to add FQDN entries for all the servers to /etc/hosts on every ESX server and to the hosts file (etc/hosts) on the vCenter server.

2. Add the DNS entry on every ESX server (/etc/resolv.conf).

You can ping each server by its FQDN but not by its hostname.

After completing the above steps, try to reconfigure HA.
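The hosts entries from step 1 could be generated with a small Python sketch like the one below, so that every machine gets identical lines. The `hosts_line` helper is hypothetical, and the addresses and names are placeholders rather than values from this thread. Listing the short name as an alias after the FQDN is an assumption based on common hosts-file practice.

```python
def hosts_line(ip, fqdn):
    """Format one hosts-file line: IP, FQDN, then the short name as an alias."""
    short_name = fqdn.split(".", 1)[0]
    return f"{ip} {fqdn} {short_name}"


# Hypothetical inventory; replace with the real addresses and FQDNs.
servers = [
    ("192.0.2.11", "esx1.example.com"),
    ("192.0.2.12", "esx2.example.com"),
]

for ip, fqdn in servers:
    print(hosts_line(ip, fqdn))
```

The printed lines can then be pasted into the hosts file on each ESX server and on the vCenter server.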

chayolle
Contributor

Hi,

I finally modified the etc/hosts file on both servers and vCenter:

IP address srv2.localdomain.com srv2

IP address srv3.localdomain.com srv3

etc...

I am now able to ping and nslookup each and every server by its IP, short name (srv2) and FQDN (srv2.localdomain)...

I removed all the hosts from the cluster and added them back with their short name and it was successful.

Now when I try to enable HA on the cluster I get the following error (see picture):

VSP | VTSP
chayolle
Contributor

Hi,

Thanks for the link to this KB. I've followed each and every step and tried everything, and I still can't get it to work. It's really frustrating, as in all the demos everything looks so easy with VM :(. I'm now stuck on this issue and I have a tight deadline to meet: everything must be working for a client by this Friday. By everything I mean DRS (which is OK), HA and FT, and Consolidated Backup, which I have not even had the time to look at :(

VSP | VTSP
pramodupadhyay5
Enthusiast

Try restarting the VMware management service on the servers and then reconfigure for HA.

chayolle
Contributor

Just restarted and reconfigured for HA, still no luck... Where am I going wrong?!

VSP | VTSP