Hello All,
I've recently redeployed my lab environment with VCF 4.1, and everything was working fine. Unfortunately, last week one of the lads made a network change that caused a full partition of the vSAN cluster. It was resolved quickly, and all VMs restarted via HA.
However, after this vRA8 would not start; it kept failing while deploying the "client security" application. Digging into it over the past couple of days, I eventually determined that the problem was between vRA and vIDM. I attempted to re-register with vIDM via vRSLCM, with no success. Last night I decided to scrap the deployment, since it was barely configured, and redeploy via vRSLCM. I am a bit stuck at this point, as this is an NFR deployment and I do not have an active support contract for vRA8 (despite having an entitlement). Any advice or direction would be very much appreciated.
That redeployment is also failing, and the symptom on the vRA8 side is identical: an exception on the invocation of "vracli reset vidm":
=========================
[2020-07-14 08:49:15.356+0000] Removing existing auth clients
=========================
+ PRELUDE_CLIENTS=prelude-N8vJtz2zwB,prelude-user-4jPN36K8Rj
+ vracli reset vidm --remove-clients-only --confirm --exclude-clients prelude-N8vJtz2zwB,prelude-user-4jPN36K8Rj
2020-07-14 08:49:21,207 [ERROR] Exception while deleting vidm configuration
Traceback (most recent call last):
File "/opt/python-modules/urllib3/connectionpool.py", line 384, in _make_request
six.raise_from(e, None)
File "<string>", line 2, in raise_from
File "/opt/python-modules/urllib3/connectionpool.py", line 380, in _make_request
httplib_response = conn.getresponse()
File "/usr/lib/python3.6/http/client.py", line 1346, in getresponse
response.begin()
File "/usr/lib/python3.6/http/client.py", line 307, in begin
version, status, reason = self._read_status()
File "/usr/lib/python3.6/http/client.py", line 268, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
File "/usr/lib/python3.6/socket.py", line 586, in readinto
return self._sock.recv_into(b)
socket.timeout: timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/python-modules/requests/adapters.py", line 449, in send
timeout=timeout
File "/opt/python-modules/urllib3/connectionpool.py", line 638, in urlopen
_stacktrace=sys.exc_info()[2])
File "/opt/python-modules/urllib3/util/retry.py", line 367, in increment
raise six.reraise(type(error), error, _stacktrace)
File "/opt/python-modules/urllib3/packages/six.py", line 686, in reraise
raise value
File "/opt/python-modules/urllib3/connectionpool.py", line 600, in urlopen
chunked=chunked)
File "/opt/python-modules/urllib3/connectionpool.py", line 386, in _make_request
self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
File "/opt/python-modules/urllib3/connectionpool.py", line 306, in _raise_timeout
raise ReadTimeoutError(self, url, "Read timed out. (read timeout=%s)" % timeout_value)
urllib3.exceptions.ReadTimeoutError: HTTPConnectionPool(host='127.0.0.1', port=40063): Read timed out. (read timeout=5)
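One thing worth noting from the traceback: this is a *read* timeout against 127.0.0.1:40063, meaning the TCP connection opened but the local service behind that port never answered within 5 seconds. Before retrying the reset, it may be worth confirming from the appliance whether that listener is even up. A minimal probe sketch (the host and port are taken from the traceback above; the helper itself is generic and not part of vracli):

```python
import socket

def port_answers(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, unreachable, or connect timed out.
        return False

# The vracli traceback shows a request to a local service on port 40063.
# If this prints False, nothing is listening there at all; if True, the
# service is up but hanging on requests (matching the read timeout seen).
print(port_answers("127.0.0.1", 40063))
```

If the port does answer, the hang is inside the service itself, which would point back at the vIDM registration state rather than a networking problem on the appliance.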
If your vIDM setup is a cluster of 3 nodes, was it affected by the HA event? If so, vIDM may not be in a fully usable state, and you may need to use vRSLCM to properly shut down and/or power on the nodes again. I learned the hard way that a vIDM cluster can't simply be powered on/off in vCenter; you need to use the options in vRSLCM to do it.