VMware Cloud Community
x007alfa
Enthusiast

vSphere HA Cluster - Something is missing and I don't know what...

Hi all,

I have a newly formed cluster in my lab.

My cluster is composed as follows:

  1. 3x HPE DL380 Gen8. specs:
    1. 2x Intel Xeon E5-2670v2 10c/20t;
    2. 16*8GB DDR3 ECC (tot 128GB);
    3. 2x SAS 15k rpm 146GB disks as local swap location;
    4. ESXi is installed on a 32GB SD card;
    5. 530FLR-SFP+ (Dual port 10gbps SFP+ card flexibleLom);
    6. X520-DA1 (Single port 10gbps SFP+ card pcie);
    7. I350-T4 (Quad Port 1gbps RJ45 card pcie);
  2. 1x Dell PowerConnect 8024F 24 port SFP+ 10gbps switch;
  3. 1x HPE StoreVirtual 4530. specs:
    1. 12x HPE SAS 15k rpm 600GB SAS drives in RAID6;
    2. X520-DA2 (Dual port 10gbps card);

I used the 530FLR ports to connect to the SAN via the switch.

I used the X520-DA1 and one of the 1 Gbps NICs for my VM network, with the 1 Gbps port configured as standby for failover.

I used two more ports on the I350 for a gateway network, as I have a pfSense instance running on the cluster.

The cluster is up as of right now.

I installed the ESXi 7.0.0 HPE Custom Image and deployed the corresponding VCSA on host1 to manage the cluster.

All the guides I followed get me to the point where HA should be enabled, but something is missing...

HA is enabled now, but the cluster has a yellow triangle on it and says that a few VMs are waiting for an HA failover retry... whatever that means...

If I then go to HA monitoring, I see that Host2 is the master and that there are no hosts connected to it... O.O

What should I do? I'm stuck.....

Thanks for any help...

Fabio

29 Replies
x007alfa
Enthusiast

Up... please... someone... :'(

scott28tt
VMware Employee

Screenshots of errors/messages usually help.


-------------------------------------------------------------------------------------------------------------------------------------------------------------

Although I am a VMware employee I contribute to VMware Communities voluntarily (ie. not in any official capacity)
VMware Training & Certification blog
depping
Leadership

I assume:

- Each host has a management interface

- Each host can ping the other host in the cluster

- You tried clicking "reconfigure for HA" on each host in the vSphere client?

depping
Leadership

Please provide the exact error message, preferably a screenshot.

x007alfa
Enthusiast

[Attachments: problem1.PNG, problem2.PNG]

I think this is the only info I can provide... do you need anything else?

On a couple of hosts I cyclically see HA Agent unreachable and that the HA Agent cannot be configured...

It's really worrying me... I have a lot of uplinks, and the cable runs are short, so RTT is almost nonexistent...

I have a set of uplinks set up for what I think is called failover monitoring...

I don't know what to think...

x007alfa
Enthusiast

I checked everything and yes, all points are OK... each host has a redundant uplink for management (every dswitch has redundant uplinks), and the hosts can all ping each other with RTT <1 ms.

I tried the reconfigure thing a few times... :(

x007alfa
Enthusiast

I don't know if it helps, but I have drawn a quick overview of how I laid out the network... I know a single 10G switch is a bit sketchy but... I'm poor... :(

x007alfa
Enthusiast

I basically followed these steps to configure the cluster...

1. Install ESXi on each host and assign an address, DNS name and domain extension.

2. Set up a vmk for software iSCSI to connect each host to the shared storage.

3. Install the vCenter appliance on h1.

4. Bring all hosts into the inventory in the VCSA.

5. Create a datacenter and a cluster.

6. The cluster is created with no special options at first.

7. Add the hosts to the cluster.

8. Set up the networks.

9. Activate HA and reconfigure each host for HA.

10. Bring the VMs on the SAN into the inventory...

That is about it... is there something more that should be done? Is there a full step-by-step guide?

I literally cobbled these steps together from 7 guides, so it could be that I missed something...

depping
Leadership

Sorry for the slow response. The screenshot indicates that there's an issue with the network, as 2 out of 3 hosts are somehow not reachable. Is there any type of firewall on the network and/or are any ports blocked? Normally the "management vmkernel interface" is used for communication between the hosts. Did you do a ping (vmkping) test using that interface?
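
As a rough sketch of that vmkping test, the Python snippet below only prints the commands to run in each host's ESXi shell; it does not execute anything. The host names and IP addresses are hypothetical placeholders, and `vmk0` is assumed to be the management VMkernel interface (`esxcli network ip interface list` shows the real name on each host).

```python
from itertools import permutations

# Hypothetical management addresses; replace with your own.
HOSTS = {"h1": "192.168.1.11", "h2": "192.168.1.12", "h3": "192.168.1.13"}


def vmkping_plan(hosts, vmk="vmk0", count=3):
    """Return (source_host, command) tuples covering every ordered host pair."""
    plan = []
    for src, dst in permutations(hosts, 2):
        # -I selects the outgoing VMkernel interface, -c the number of pings.
        plan.append((src, f"vmkping -I {vmk} -c {count} {hosts[dst]}"))
    return plan


if __name__ == "__main__":
    for src, cmd in vmkping_plan(HOSTS):
        print(f"[run on {src}] {cmd}")
```

With three hosts this gives six commands, so each direction of every pair gets tested through the management interface rather than whichever interface the host would pick on its own.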

x007alfa
Enthusiast

Hi, sorry on my side now for the really long delay, but I'm away for work and have very limited time online...

I honestly didn't understand... do I need to deploy 3 instances of the VCSA?

Should my setup be 1 VCSA instance per host on the local hard drives? Each host has a RAID 1 of 146 GB disks... should I deploy them there?

How does it really work? Every guide I look at says something different... :(

Thanks for the help, and yes, they can see each other...

Also, will I be able to load balance the VMs across the hosts with DRS?

depping
Leadership

vCenter Server is the management solution; you deploy 1 instance of it for your environment. Within vCenter Server you then create clusters, to which you add the hosts.

As mentioned, the screenshots indicate there's an issue from a networking standpoint. However, it is not easy to troubleshoot this without having access to all the various layers. Maybe it is best to either call support or ask a local VMware expert to look at the environment. Considering your questions, it may also be wise to take a VMware install-and-configure training course for vSphere.

x007alfa
Enthusiast

I did all that :) I have a cluster built with my 3 servers and 1 instance of VCSA hosted in the cluster.

The network is set up as described in various guides, and it all seems to be fine...

I'm away from the office for work right now, so I want to be sure I did it right when I go back...

Thanks for your help, and I'll get back to you as soon as I'm back...

x007alfa
Enthusiast

I've decided to format everything and start from scratch this weekend so I don't impede anybody's work...

Is it correct that I can just pile all services and functions onto a single redundant port? Like management, redundancy, iSCSI, all that jazz, maybe on my dual-port 10 Gbps card?

Please don't tell me to go look for a tutorial, because I already did and cannot find a single document with a step-by-step how-to... which honestly is kind of frustrating, because the assumption seems to be that I have to call an expert to fix my problems if I'm not an expert myself... which makes absolutely no sense to me...

So my procedure this weekend will be:

  1. Install a blank ESXi 7 image from HPE's custom images repo on all 3 of my hosts.
  2. Assign hostnames and set up the Management network from the console, selecting the two 10 Gbps ports on the 530FLR card on each host.
  3. Add the Fault Tolerance logging service to the management VMkernel NIC on each host.
  4. Add a new VMkernel NIC on the vMotion stack to allow vMotion to run in HA.
  5. Try to vmkping all hosts in pairs and check that they can all see each other on both VMkernel NICs.

At this point I would skip adding my SAN to the whole deal and just deploy the VCSA instance on the local disks of one of the hosts.

Configure it with an FQDN and IP as per my DNS configuration.

Configure a Datacenter and a Cluster, and add my hosts to the cluster.

The only guide I found told me to skip configuring HA when creating the cluster and to activate it after adding the hosts to the cluster...

Is it all correct up to this point?

If you feel I should change something, please tell me; I'm open to suggestions...

Thanks again for the support, and yes, I know I'm annoying, but I would like to avoid calling in expert support, which will undoubtedly cost me a fortune... and this is good experience for me, since we are planning to provide virtualization solutions to our customers in the future...

Configuring a single host is a piece of cake, but HA is a whole new level of complication... and I really, really think something could be done to simplify the whole experience...

depping
Leadership

1) Yes, you can enable all services on a single VMkernel interface. We usually don't recommend it, but it is possible and supported.

2) Yes, you can install a VCSA locally; after that you can use "Storage vMotion" to migrate it to the SAN once you have that working.

3) Make sure DNS is working for vCenter and the hosts, as that prevents a lot of issues, and preferably NTP as well!

4) If you are not using vSAN, you can enable HA directly and add the hosts. If you are using vSAN, you need to enable vSAN first and then enable HA.
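
For point 3, one way to sanity-check DNS is to confirm that each name resolves to an address and that the address resolves back to the same name. The Python sketch below does only that, using the standard `socket` module. The `check_dns` helper and any host names you pass to it are illustrative assumptions, not part of any VMware tool, and it says nothing about NTP.

```python
import socket


def check_dns(fqdn, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Forward-resolve fqdn, reverse-resolve the result, and compare names.

    Returns (ok, detail). The resolver functions are parameters so they can
    be swapped out, for example in a test without network access.
    """
    try:
        ip = forward(fqdn)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return False, f"forward lookup failed: {exc}"
    try:
        name = reverse(ip)[0]  # gethostbyaddr returns (name, aliases, addresses)
    except OSError as exc:  # socket.herror is a subclass of OSError
        return False, f"{ip}: reverse lookup failed: {exc}"
    if name.rstrip(".").lower() != fqdn.rstrip(".").lower():
        return False, f"{ip} maps back to {name}, not {fqdn}"
    return True, ip
```

Calling `check_dns("vcsa.lab.example")` (a made-up name) on a machine that uses the same DNS server as the hosts returns `(True, <ip>)` when forward and reverse records agree, and `(False, <reason>)` otherwise. Running it for the vCenter name and each host name covers the DNS half of the advice above.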

Hope that helps.

depping
Leadership

Just read the initial post. Just create a Datacenter, followed by a cluster with HA enabled, and then add your hosts; that is a supported workflow.

We have a huge amount of documentation by the way, most of what you ask above is documented here: https://docs.vmware.com/

There are also a massive number of blogs with tutorials out there, which you can find through Google. And no worries, everyone is free to ask questions on this community forum.

x007alfa
Enthusiast

Hi I'm finally done rebuilding the entire circus...

It's been.......... a trip... XD

So I have the hosts up and running, all licensed.

I have a VCSA deployed on the SAN, with the passive and witness nodes deployed so that vCenter HA is running.

vSphere HA still doesn't want to cooperate... When I activate it following a guide from the VMware docs, the election completes for only 1 host: one of the hosts, seemingly at random, becomes master and the others time out of the election.

I read in another post that SSL verification needs to be done, so I went into the vCenter SSL configuration tab, but no host is listed there... what does that mean? :(

Thanks!

Fabio

depping
Leadership

You don't need SSL certs configured to use HA. I've configured it thousands of times without SSL certs. What you need is a management network, the ability to ping from one host to another, and a shared datastore; that is it.

If it keeps failing, contact support. I suspect there's a configuration issue somewhere, and I would think they can fix this for you in no time. I have configured this thousands of times and never had any challenges configuring it.

depping
Leadership

Only thing I can imagine is that there's a firewall on the network which is blocking some ports?
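
A quick way to test for a blocked port is a plain TCP connect from a machine on the management network. The Python sketch below assumes the vSphere HA agent listens on port 8182 (worth verifying against VMware's port list for your version) and covers TCP only, although the agent also uses UDP. The `tcp_port_open` helper is illustrative and not part of any VMware tool.

```python
import socket

# Assumption: port used by the vSphere HA (FDM) agent; verify for your version.
HA_AGENT_PORT = 8182


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False
```

For example, `tcp_port_open("192.168.1.12", HA_AGENT_PORT)` with a host's management IP (the address here is made up) returns `False` if something in the path drops or rejects the connection. A `True` from a workstation does not prove that the hosts can reach each other on that port, so treat it as a hint, not a verdict.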

x007alfa
Enthusiast

The first picture is the status of HA after enabling it... Configuration issues: 0... h1 is master, and h2 and h3 (slaves) are apparently initializing...

[Attachment: ha status.PNG]

The second picture was taken while HA was being enabled... as you can see, h1 completed the election and the other 2 are waiting for something to happen, and I have no clue what...

The only firewall I have is a pfSense VM with a rule that lets everything through inside my network... it only blocks access from outside...

[Attachment: election phase.PNG]
