VMware Cloud Community
alainrussell
Enthusiast

VSAN Health - 503 Error

I'm seeing an error when trying to load the VSAN health tab: no buttons show and I get a 503 error.

I've checked the permissions on the cert files and they look OK. The vsan-health log file is showing the error below. This is a clean vCenter install (6.0, upgraded to U1) and a new VSAN install, and I can't find anything online about these errors.

2015-10-09T05:09:43.880Z CRITICAL vsan-health[MainThread] [VsanHealthServer::UncaughtExcpetionHandler] Traceback (most recent call last):
  File "/usr/lib/vmware-vpx/vsan-health/VsanHealthServer.py", line 354, in <module>
    Initialize(options=gCmdOptions, remainingOptions=gCmdRemainingOptions)
  File "/usr/lib/vmware-vpx/vsan-health/VsanHealthServer.py", line 305, in Initialize
    ImportTypesAndManagedObjects()
  File "/usr/lib/vmware-vpx/vsan-health/VsanHealthServer.py", line 316, in ImportTypesAndManagedObjects
    import pyMoVsan
  File "/usr/lib/vmware-vpx/vsan-health/pyMoVsan/__init__.py", line 38, in <module>
    __import__(name, globals(), locals(), [])
  File "/usr/lib/vmware-vpx/vsan-health/pyMoVsan/VsanVcClusterHealthSystemImpl.py", line 4134, in <module>
    VsanVcClusterHealthSystemImpl("vsan-cluster-health-system")
  File "/usr/lib/vmware-vpx/vsan-health/pyMoVsan/VsanVcClusterHealthSystemImpl.py", line 847, in __init__
    VsanEventUtil.registerHealthAlarms(self.conn.si.content)
  File "/usr/lib/vmware-vpx/vsan-health/pyMoVsan/VsanEventUtil.py", line 621, in registerHealthAlarms
    cls.registerAlarm(content, content.rootFolder, cls.eventIds, enable=True)
  File "/usr/lib/vmware-vpx/vsan-health/pyMoVsan/VsanEventUtil.py", line 569, in registerAlarm
    unregisterEventIds = cls._getUnRegisterTestId(content, mo, eventIds)
  File "/usr/lib/vmware-vpx/vsan-health/pyMoVsan/VsanEventUtil.py", line 542, in _getUnRegisterTestId
    if alarm.info.expression is None or \
  File "/usr/lib/vmware-vpx/pyJack/pyVmomi/VmomiSupport.py", line 537, in __call__
    return self.f(*args, **kwargs)
  File "/usr/lib/vmware-vpx/pyJack/pyVmomi/VmomiSupport.py", line 360, in _InvokeAccessor
    return self._stub.InvokeAccessor(self, info)
  File "/usr/lib/vmware-vpx/vsan-health/pyMoVsan/VsanPyVmomiProfiler.py", line 95, in InvokeAccessor
    out = self._InvokeAccessor(mo, info)
  File "/usr/lib/vmware-vpx/vsan-health/pyMoVsan/VsanPyVmomiProfiler.py", line 103, in _InvokeAccessor
    return self._InvokeMethod(mo, info, (prop,))
  File "/usr/lib/vmware-vpx/vsan-health/pyMoVsan/VsanPyVmomiProfiler.py", line 73, in _InvokeMethod
    obj = ds.Deserialize(resp, info.result)
  File "/usr/lib/vmware-vpx/pyJack/pyVmomi/SoapAdapter.py", line 749, in Deserialize
    self.parser.ParseFile(response)
  File "/usr/lib/vmware-vpx/pyJack/pyVmomi/SoapAdapter.py", line 648, in EndElementHandler
    raise TypeError(data)
TypeError: vCenter

8 Replies
niljersweden
Contributor

This is the solution for this issue.

Can someone guide me through this for a vCenter Server on Windows rather than an appliance?

Thx,

Resolution

This is a known issue affecting VMware Virtual SAN and VMware vCenter Server Appliance.

Currently, there is no resolution.
To work around this issue, correct the permissions for the certificate file:

  1. Log in to the vSphere vCenter Server Appliance using SSH.
  2. Run this command to enable access to the Bash shell:

    shell.set --enabled true

  3. Type shell and press Enter.
  4. Correct the group permissions, by running these commands:

    cd /etc/vmware-vpx/ssl
    chgrp cis rui.* vcsoluser.*
    chmod g+r rui.* vcsoluser.*

  5. After performing the preceding steps, the /var/log/vmware/eam/eam.log file on the vCenter Server Appliance may report an error similar to:

    [YYYY-MM-DD]T[HH:MM:SS] | ERROR | eam-0 | VcConnection.java | 179 | Failed to login to vCenter as extension. vCenter has probably not loaded the EAM extension.xml yet.: Cannot complete login due to an incorrect user name or password.

    To resolve this issue, see After replacing the vCenter Server certificates in VMware vSphere 6.0, the ESX Agent Manager solutio....

Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
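
Before changing anything, the group ownership and group-read bit that steps 4's chgrp/chmod commands set can be inspected with a short script. This is only a minimal sketch: the function name check_group_read is hypothetical, and the directory, file patterns, and the "cis" group are taken from the commands above, not from any VMware tool.

```python
import glob
import grp
import os
import stat


def check_group_read(directory, patterns=("rui.*", "vcsoluser.*"), group="cis"):
    """Return (path, problem) tuples for files whose group or group-read bit is off."""
    problems = []
    for pattern in patterns:
        for path in sorted(glob.glob(os.path.join(directory, pattern))):
            st = os.stat(path)
            try:
                owner_group = grp.getgrgid(st.st_gid).gr_name
            except KeyError:
                # The gid has no name on this system; fall back to the number.
                owner_group = str(st.st_gid)
            if owner_group != group:
                problems.append((path, "group is %s, expected %s" % (owner_group, group)))
            if not st.st_mode & stat.S_IRGRP:
                problems.append((path, "group read bit not set"))
    return problems
```

Calling check_group_read("/etc/vmware-vpx/ssl") on the appliance and getting an empty list would suggest the permissions already match what the workaround sets, in which case the cause is likely elsewhere.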


alainrussell
Enthusiast

This didn't fix the issue for me; my permissions already matched what was required. As it was a new environment, I ended up setting up a brand new vCenter, which wasn't too difficult, and this worked straight away. Interestingly, I've just hit the same error again after restarting the vCenter appliance to fix an error with a VMware replication installation. On the health check I'm seeing the same 503 errors:

The vCenter server is not responding, please check your setup (statusCode 503)

Unexpected status code: 503

Unexpected status code: 503

Cannot get the health service instance.

I also noticed on the vCenter appliance that the health check is looking at /vsanHealth/ - which in my case shows:

503 Service Unavailable (Failed to connect to endpoint: [N7Vmacore4Http16LocalServiceSpecE:0x7f5cfd51ebf0] _serverNamespace = /vsanHealth _isRedirect = false _port = 8006)
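
The message above says the proxy could not connect to the endpoint on port 8006, so one quick thing to rule out is whether anything is listening on that port at all. A minimal sketch, assuming the health service is expected on localhost:8006 (the port comes from the error text, the host is my assumption) and using a hypothetical helper name port_is_open:

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

Run on the appliance itself, port_is_open("localhost", 8006) returning False would be consistent with the health service not having started, which matches the uncaught exception in the log from the first post.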

alainrussell
Enthusiast

I'm thinking this is some sort of SSL issue. I thought I'd get the health info from RVC instead, but I'm getting an error trying to run that as well (see below). I've opened a support case, so will see how that goes.

/localhost> ls
Errno::EPIPE: Broken pipe
/usr/lib64/ruby/1.9.1/openssl/buffering.rb:235:in `syswrite'
/usr/lib64/ruby/1.9.1/openssl/buffering.rb:235:in `do_write'
/usr/lib64/ruby/1.9.1/openssl/buffering.rb:249:in `write'
/usr/lib64/ruby/1.9.1/net/protocol.rb:191:in `write0'
/usr/lib64/ruby/1.9.1/net/protocol.rb:167:in `block in write'
/usr/lib64/ruby/1.9.1/net/protocol.rb:182:in `writing'
/usr/lib64/ruby/1.9.1/net/protocol.rb:166:in `write'
/usr/lib64/ruby/1.9.1/net/http.rb:1739:in `send_request_with_body'
/usr/lib64/ruby/1.9.1/net/http.rb:1724:in `exec'
/usr/lib64/ruby/1.9.1/net/http.rb:1189:in `transport_request'
/usr/lib64/ruby/1.9.1/net/http.rb:1177:in `request'
/usr/lib64/ruby/1.9.1/net/http.rb:1125:in `request_post'
/opt/vmware/rvc/gems/rbvmomi-1.7.0/lib/rbvmomi/trivial_soap.rb:90:in `block in request'
<internal:prelude>:10:in `synchronize'
/opt/vmware/rvc/gems/rbvmomi-1.7.0/lib/rbvmomi/trivial_soap.rb:88:in `request'
/opt/vmware/rvc/gems/rbvmomi-1.7.0/lib/rbvmomi/connection.rb:87:in `call'
/opt/vmware/rvc/gems/rbvmomi-1.7.0/lib/rbvmomi/basic_types.rb:205:in `_call'
/opt/vmware/rvc/gems/rbvmomi-1.7.0/lib/rbvmomi/basic_types.rb:74:in `block (2 levels) in init'
/opt/vmware/rvc/lib/rvc/util.rb:213:in `collect_children'
/opt/vmware/rvc/lib/rvc/extensions/Folder.rb:27:in `children'
/opt/vmware/rvc/lib/rvc/vim.rb:31:in `children'
/opt/vmware/rvc/lib/rvc/inventory.rb:76:in `rvc_children'
/opt/vmware/rvc/lib/rvc/modules/basic.rb:144:in `ls'
/opt/vmware/rvc/lib/rvc/command.rb:42:in `invoke'
/opt/vmware/rvc/lib/rvc/shell.rb:129:in `eval_command'
/opt/vmware/rvc/lib/rvc/shell.rb:73:in `eval_input'
/opt/vmware/rvc/bin/rvc:178:in `<main>'

niljersweden
Contributor

Hi,

Any feedback from VMware regarding the support case?

Br,

//Jerry

alainrussell
Enthusiast

I closed the case in the end. When running through checks I noticed we had a typo in a reverse DNS entry for the VCSA appliance. Fixing this and rebuilding everything from scratch with the correct DNS solved the problem - is not stable and working as expected. Thanks.
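
For anyone wanting to run the same kind of check, a forward/reverse DNS comparison can be sketched in a few lines. This is only an illustration, not a VMware tool: check_forward_reverse is a hypothetical name, and the resolver arguments exist so the lookup functions can be swapped out.

```python
import socket


def check_forward_reverse(hostname,
                          resolve=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Resolve hostname to an IP, look up the PTR name, and compare the two.

    Returns (ip, ptr_name, matches). Lookup errors from the resolvers
    (for example socket.gaierror or socket.herror) are not caught here.
    """
    ip = resolve(hostname)
    ptr_name, aliases, _addresses = reverse(ip)
    # Compare case-insensitively and ignore a trailing dot on FQDNs.
    known = {name.lower().rstrip(".") for name in [ptr_name] + list(aliases)}
    return ip, ptr_name, hostname.lower().rstrip(".") in known
```

Calling it with the appliance's FQDN and getting matches == False, or a PTR name that differs by a character or two, is the sort of mismatch a reverse DNS typo would produce.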

zdickinson
Expert

To clarify, "is now stable" and not "is not stable".  Correct?  Thank you, Zach.

alainrussell
Enthusiast

Sorry - "now" stable :)

aaronwsmith
Enthusiast

For anyone searching the web for the elusive VSAN Health Check Plugin 503 error, note another possible cause discussed in this thread:

Re: Problem with VSAN Health Check windows server (Unexpected status code: 503)

Dell OpenManage integration seems to break VSAN Health Check; in my experience this was true for VSAN 6.1 atop both vCenter on Windows and the vCenter Appliance. Unregistering OpenManage from the affected vCenters, in addition to verifying the CA certificates via the 3 KBs I noted in my reply to the above discussion, resolved the issue for us.
