vRealize Automation 8.1 Cluster Failure - Reason: Unauthorized

future2000 · ‎08-23-2021

Our 3-node vRealize Automation 8.1 cluster fell over completely while we were away during the COVID lockdowns ;o(

On coming back in, I was strangely asked on all three nodes to reset the root password, which it said had expired. I don't think that should have been possible, as I had disabled root password expiration. Nevertheless, I changed the passwords and logged in as root. The NSX-V Load Balancer showed all nodes as down and I was unable to reach the login page.

Once logged in, I was unable to perform any tasks. For example:
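For anyone wanting to confirm the expiry setting on their own nodes, here is a minimal sketch. It assumes the appliance has the standard `chage` tool from shadow-utils and that `chage -l` prints its usual "Password expires : ..." line. The function name `password_expiry` is purely illustrative.

```shell
#!/bin/sh
# Print the value of the "Password expires" field from a `chage -l` listing.
# Reads the listing on stdin so it can also be run against saved output.
# Assumption: the listing uses the usual "Label<tabs>: value" layout.
password_expiry() {
    awk -F': *' '/^Password expires/ { print $2 }'
}
```

Run as root on each node with `chage -l root | password_expiry`. A result of `never` would mean expiry is disabled for that account, while a date would mean aging is still in effect.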

vracli status deploy

[ERROR] (401)

vracli status

[ERROR] (401)

/opt/scripts/deploy.sh

Running check eth0-ip

Running check non-default-hostname

Running check node-name

Running check single-aptr

Running check nodes-ready

Running check nodes-count

make: *** [/opt/health/Makefile:48: nodes-count] Error 1

make: Target 'deploy' not remade because of errors.

The above loops forever.

I restarted all the nodes and vIDM multiple times, and verified that the network and systems seem OK.

systemctl shows:

systemd-networkd-wait-online.service loaded failed failed

systemd-tmpfiles-clean.service loaded failed failed

systemd-tmpfiles-setup.service loaded failed failed
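To pull out just the failed units on each node for comparison, something like the following could be used. This is a rough sketch, assuming the usual `systemctl list-units` column order (UNIT, LOAD, ACTIVE, SUB, DESCRIPTION). The helper name `failed_units` is made up for illustration.

```shell
#!/bin/sh
# Print the names of units whose LOAD column is "loaded" and whose
# ACTIVE column is "failed". Reads a unit listing on stdin.
# Assumption: input has no header/legend and no leading bullet marker.
failed_units() {
    awk '$2 == "loaded" && $3 == "failed" { print $1 }'
}
```

On a live node the input would come from `systemctl list-units --type=service --no-legend --plain | failed_units`. Running the same thing on all three nodes makes it easier to see whether the failures are identical everywhere.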

The network on the system appears entirely functional however.

It appears the cluster has fallen over completely. Any ideas out there?
