VMware Performance Community
Nodnarb
Enthusiast

VMmark 3 causing vCenter HA to fail over?

Hi all,

I've encountered what seems to be a strange coincidence between running VMmark v3 and my vCenter High Availability cluster gracefully failing over hours later.

My setup: I have several clusters under my vCenter (6.5U1e) environment. The cluster I'm testing with VMmark is a 3-node vSAN evaluation cluster. Each ESXi host in that cluster is updated to the latest ESXi patches for 6.5 U1. My vCenter is configured for vCenter HA (Basic option). The primary, secondary and witness vCHA nodes all run on the other clusters, not on the vSAN evaluation cluster. The prime client also runs on one of those other clusters. The prime client communicates with the worker VMs via an isolated VLAN over a vDS shared by the other clusters and the vSAN test cluster.

When I run VMmark, the test runs for a few hours (as expected) and I get a result. Hours later, though, my running vCenter gets gracefully shut down via vCenter HA and the secondary HA node becomes the primary. This only happens overnight after I have run VMmark earlier that day. Once the node has failed over, it stays up and stable with no further failovers. I have to manually power on the shut-down vCenter node the following day. Once it's up, it stays running in standby just fine. I can even fail back to the original vCenter node without any issues...until I run VMmark again!

My latest experiment: I ran VMmark over a week ago and vCenter HA failed over to my secondary vCenter HA node. Every previous time, I had failed back to the primary node before running VMmark again. This time I left vCenter running on the secondary node and ran VMmark while the secondary node was active. The test was run yesterday afternoon and finished after business hours. When I came into work this morning, the secondary vCenter HA node had been shut down and the primary node had been made active. Since the failover has now happened both when the primary node was active during the VMmark run and when the secondary node was active, I'm tempted to rule out a problem with the vCenter nodes themselves.

My question: has anyone else run VMmark with vCenter HA configured? If not, can anyone reproduce this? I have a VMware support ticket open (18707875202) and I've uploaded vCenter support bundles, in case anyone on the VMmark team can get access to those and wants to have a look.

Thanks for your help and let me know if any other information or logs would be useful.

3 Replies
vmEck
Hot Shot

What account is shutting down the vCenter Server VM? Could there be a configuration within VMmark that is set to automatically shut it down once testing is complete? If you destroy the VCHA cluster, does it behave the same way (VC gets

dmorse
VMware Employee

My question: has anyone else run VMmark with vCenter HA configured? If not can anyone reproduce this?

VMmark has no HA component, so I'm not sure why this is happening. Our internal development team has not run VMmark with vCenter HA configured.

It has to be some other external factor instead of VMmark (especially since it occurs hours after the benchmark has finished).

Nodnarb
Enthusiast

I'm not sure what account is used to shut it down. There's nothing in Tasks and Events on the vCenter VM itself. As far as I can tell from vcha.log (/var/log/vmware/vcha), something is timing out between the active and standby vCenter VMs, and VCHA then appears to start up the passive vCenter VM. Is there a log in the VCSA itself that would show an account issuing a shutdown command? Since nothing is logged in Tasks/Events, I suspect the shutdown comes from a command run inside the guest OS rather than from a command sent through VMware Tools, which would get logged in Tasks/Events.
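
In case it helps anyone looking at the same log, here is a rough Python sketch for pulling candidate lines out of vcha.log. The keyword list is purely my guess and not taken from real vcha.log messages, the function names are made up for illustration, and it only assumes the log is plain text with one entry per line.

```python
from pathlib import Path

# Hypothetical keywords: guesses only, the real vcha.log wording may differ.
KEYWORDS = ("timeout", "timed out", "failover", "shutdown")


def find_candidate_lines(lines, keywords=KEYWORDS):
    """Return (line_number, text) pairs for lines containing any keyword.

    Matching is case-insensitive; line numbers start at 1.
    """
    hits = []
    for number, text in enumerate(lines, start=1):
        lowered = text.lower()
        if any(keyword in lowered for keyword in keywords):
            hits.append((number, text.rstrip("\n")))
    return hits


def scan_log(path="/var/log/vmware/vcha/vcha.log"):
    """Scan a log file on disk; undecodable bytes are replaced, not fatal."""
    with Path(path).open(errors="replace") as handle:
        return find_candidate_lines(handle)
```

Calling `scan_log()` on the appliance (or on a copy of the file from a support bundle, passing its path) returns the matching lines with their line numbers, which makes it easier to line up the timeout entries against the time the VMmark run finished.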

I haven't tried destroying VCHA yet. I was hoping someone here could tell me whether they have run VMmark with vCenter HA configured without any problems. I can say that no further failover has happened since my last one on 2/22 (after running VMmark).
