VMware Cloud Community
CyberTron123
Enthusiast

Trouble getting VSAN 6 Storage Providers up and running

Hi!

I have set up a test environment with three hosts, SSDs, disks and so on. Everything went perfectly: vCenter is installed and VSAN is up and running, but it fails to register the storage providers.

Error messages:

Registration/unregistration of a VASA vendor provider on a Virtual SAN host fails

Storage provider resync failed

I can't open the VSAN Default Storage Profile because no providers are found. I never had this trouble when it was in beta.

Any ideas what to look for?

17 Replies
zdickinson
Expert

What's the networking setup?  Perhaps IGMP related?  Thank you, Zach.

ramakrishnak
VMware Employee

A few things to look for:

a. Go to Hosts and Clusters -> vCenter -> Manage -> Storage Providers tab.

You should see the VSAN providers registered and **online** for all the hosts in the VSAN VC cluster.

If not, try the toolbar button after the "+" sign, which says "Synchronizes all Virtual SAN storage providers with the current state of the environment".

This should register all the VSAN providers.

b. If they still don't register, go to each ESXi host, check whether the vsanvpd service is running, and look at the vsanvpd logs.

They will provide info on why the registration is failing on that host.

Logs are at /var/log/vsanvpd.log

Service: /etc/init.d/vsanvpd status

c. As a last resort, put the hosts in maintenance mode, move them out of the VSAN VC cluster, and bring them back. This will trigger auto-registration of the VSAN providers.

d. Check whether the Storage Management Service (SMS) certificate has expired (KB article 2078070).

e. If none of these resolve it, please file an SR with the vc-support bundles, which will help us root-cause the issue.

Thanks,

CyberTron123
Enthusiast

Hi

I've done all those things, but I think I have found the problem. I get a lot of errors like:

IO Timeout. Using timeout value for probe interval.

nmp_ResetDeviceLogThrottling:3345: last error status from device mpx.vmhba32:c0:t0:l0

could not open device "naa.50014ee65a9299:1" for volume open: no underlying device for major,minor

No FS driver claimed device "NAA......" : no filesystem on the device

Disk handle open failure for device "nAaa...." status: busy

and so on and so on.

I have a supported SAS card (LSI 9207-8i). It worked perfectly with the 6.0 beta 2 and RC1. I am using firmware 20.00.0.0 and driver 20.0.0.0, as suggested here:

But I am guessing that VSAN does not communicate correctly with the disks, and that is why the providers don't work.

What can I do?

ramakrishnak
VMware Employee

These messages are harmless.

AFAICT they have nothing to do with your main issue. I would suggest filing an SR so we can root-cause the issue.

We will also probably publish a KB on these nmp reset messages so we can eliminate any false alarms.

Thanks,

CyberTron123
Enthusiast

Hi

I have filed an SR: 15647395804

I will update this post when I have an answer.

/Michael

elerium
Hot Shot


I'm also seeing a similar issue, although not exactly the same, also VSAN 6, new setup.


In my case, the VSAN storage providers show up but all show as disconnected, and clicking "resync all" doesn't change anything. I already have an SR open: #15654667704


It may have to do with a VSAN SSL problem for VSANVP service on hosts or vcenter. Not sure how it ended up like this since the hosts are all new installs of VSAN 6. I did upgrade vCenter from 5.5 to 6.0, and tried KB 2078070 already. Hopefully support can figure it out soon.


The host logs in /var/log/vsanvpd.log show:


2015-04-28T22:17:46Z vsanSoapServer: verify_cert_with_store:813:Cannot verify cert with CA store /etc/vmware/ssl/castore.pem: certificate has expired (10)
2015-04-28T22:17:46Z vsanSoapServer: verify_cert_with_store:813:Cannot verify cert with CA store /etc/vmware/ssl/vsanvp_castore.pem: self signed certificate (18)
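
When vsanvpd.log is long, these lines can be pulled out with a short script. The following is only a minimal Python sketch, assuming every failure line has exactly the layout of the two lines quoted above; the function name `parse_verify_failures` is made up for illustration. The trailing numbers look like OpenSSL verify error codes, where 10 means "certificate has expired" and 18 means "self signed certificate".

```python
import re

# Pattern modelled on the two vsanvpd.log lines quoted in this post.
# Assumption: other builds may format these lines differently.
VERIFY_RE = re.compile(
    r"^(?P<ts>\S+)\s+vsanSoapServer:\s+verify_cert_with_store:\d+:"
    r"Cannot verify cert with CA store (?P<store>\S+): "
    r"(?P<reason>.+?) \((?P<code>\d+)\)\s*$"
)


def parse_verify_failures(lines):
    """Return one dict per certificate verification failure line."""
    failures = []
    for line in lines:
        match = VERIFY_RE.match(line.strip())
        if match:
            failures.append({
                "timestamp": match.group("ts"),
                "store": match.group("store"),
                "reason": match.group("reason"),
                "code": int(match.group("code")),
            })
    return failures


# The two lines from this post, used as sample input.
sample_vsanvpd = [
    "2015-04-28T22:17:46Z vsanSoapServer: verify_cert_with_store:813:"
    "Cannot verify cert with CA store /etc/vmware/ssl/castore.pem: "
    "certificate has expired (10)",
    "2015-04-28T22:17:46Z vsanSoapServer: verify_cert_with_store:813:"
    "Cannot verify cert with CA store /etc/vmware/ssl/vsanvp_castore.pem: "
    "self signed certificate (18)",
]

for failure in parse_verify_failures(sample_vsanvpd):
    print(failure["store"], failure["code"], failure["reason"])
```

On a real host the input would come from reading /var/log/vsanvpd.log line by line instead of the sample list.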


On the vCenter server, C:\ProgramData\VMware\vCenterServer\logs\vmware-sps\sps.log shows:

2015-04-28T12:05:09.918-07:00 ERROR opId= com.vmware.vim.sms.provider.vasa.cert.CertificateAuthority - Failed to propogate root certificate and CRL to VPs


During registration:

2015-04-28T12:05:39.465-07:00 INFO  opId= org.apache.axis2.transport.http.HTTPSender - Unable to sendViaPost to url[https://10.0.2.84:8080/vasa/services/vasaService]

org.apache.axis2.AxisFault: Transport error: 405 Error: Method Not Allowed


During unregistration:

2015-04-28T12:06:00.283-07:00 ERROR opId= com.vmware.vim.sms.provider.vasa.VasaProviderImpl - Exception during unregisterVasaCertificate()

com.vmware.vim.sms.fault.VasaServiceException: org.apache.axis2.AxisFault: SSL error
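
To see which SMS components are complaining without reading sps.log by hand, something like the following could help. It is a rough Python sketch that assumes the line layout seen in the sps.log excerpts in this thread (timestamp, optional `[thread]`, level, `opId=`, logger class, ` - `, message); the `summarize` helper is hypothetical, and continuation lines such as stack traces are simply skipped.

```python
import re
from collections import Counter

# Layout assumed from the sps.log lines quoted in this thread.
SPS_RE = re.compile(
    r"^(?P<ts>\S+)\s+(?:\[(?P<thread>[^\]]+)\]\s+)?"
    r"(?P<level>[A-Z]+)\s+opId=(?P<opid>\S*)\s+"
    r"(?P<logger>[\w.$]+)\s+-\s+(?P<msg>.*)$"
)


def summarize(lines, levels=("ERROR",)):
    """Count (short logger name, message) pairs for the given log levels."""
    counts = Counter()
    for line in lines:
        match = SPS_RE.match(line.strip())
        if match and match.group("level") in levels:
            short_name = match.group("logger").rsplit(".", 1)[-1]
            counts[(short_name, match.group("msg"))] += 1
    return counts


# Lines copied from this post, used as sample input.
sample_sps = [
    "2015-04-28T12:05:09.918-07:00 ERROR opId= "
    "com.vmware.vim.sms.provider.vasa.cert.CertificateAuthority - "
    "Failed to propogate root certificate and CRL to VPs",
    "2015-04-28T12:05:39.465-07:00 INFO  opId= "
    "org.apache.axis2.transport.http.HTTPSender - Unable to sendViaPost "
    "to url[https://10.0.2.84:8080/vasa/services/vasaService]",
    "2015-04-28T12:06:00.283-07:00 ERROR opId= "
    "com.vmware.vim.sms.provider.vasa.VasaProviderImpl - "
    "Exception during unregisterVasaCertificate()",
]

for (logger, message), count in summarize(sample_sps).items():
    print(count, logger, message)
```

Passing `levels=("ERROR", "INFO")` would also include the transport failure logged at INFO level during registration.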


 

ramakrishnak
VMware Employee

> In my case, the VSAN storage providers show up but all show disconnected, clicking the resync all doesn't change anything.

Can you quickly try this: from the UI, unregister the providers that are offline, then click the "Synchronizes all Virtual SAN Storage Providers" button.

Thanks

elerium
Hot Shot

I gave that a try (unregistering all providers and resync all), it results in the same disconnected state. The support engineer already tried removing/re-adding the hosts and rebuilding the cluster, still the same result.

Bleeder
Hot Shot

Any luck?

I wonder if you might have your certificate mode set to custom.  I do, and this seems to be problematic when looking at my sps.log.

I see this over and over:

2015-06-08T22:38:13.734-05:00 [Timer-1] DEBUG opId= com.vmware.vim.sms.provider.vasa.cert.CertificateAuthority - [getRootCert] Querying VECS for root certificates...

2015-06-08T22:38:13.813-05:00 [Timer-1] DEBUG opId= com.vmware.vim.sms.provider.vasa.cert.RootCertAndCrlPropagator - No provider found to propogate root certificates and crls

elerium
Hot Shot

No luck for me. I still have an SR open from 5 weeks ago, and it is now escalated. I've rebuilt the VSAN, reinstalled the hosts from scratch, and rebuilt the cluster. The only thing I haven't done is rebuild vCenter, since no good documentation for that has been released yet.

I have this issue with both default VMCA and custom CA. VMware support had me revert to default VMCA for the time being since it was making troubleshooting for them more difficult.

jonretting
Enthusiast

Hmm... In the past I have solved storage provider issues by using a Windows domain certificate authority to issue certs to the ESXi hosts. This was specific to VCS on Windows, as I have yet to encounter storage provider registration problems on the appliance version. Long story short, this guy made a great PowerShell script to streamline the process. I would suggest reading his entire article before jumping to the part on generating certs for your ESXi hosts. There are a couple of gotchas in the script, and if memory serves they are documented in the script. http://www.derekseaman.com/2013/10/vsphere-5-5-install-pt-1-introduction.html I have not used the script in vSphere 6, but since it's PowerCLI-based it should work fine.

Once again, issuing the certs manually or by way of Derek's script can fix most registration problems. It is pretty important in 5.5 to have proper certificates and authorities set up. In vSphere 6 the appliance uses a new internal CA to do this, but in 5.5 the ESXi host generates a self-signed certificate, if I recall.

Hope this helps in some way.

elerium
Hot Shot

I actually implemented Derek's vSphere 6.0 VMCA subordinate solution, which uses a Windows domain certificate authority, a month ago, shortly after I opened my SR with VMware. I was able to successfully deploy VMCA as subordinate with my Windows CA as root, but I still couldn't get the storage providers working (same errors). Ultimately VMware support had me revert to the default VMware CA, since the custom setup added a layer of complexity to their troubleshooting.

vSphere 6.0 has different deployment options for CAs than 5.5 (you can read about it here: http://www.derekseaman.com/2015/02/vsphere-6-0-install-pt-3-certificate-management.html). The script I tried for VMCA as subordinate is an updated 6.0 PowerShell script, linked here:

http://vexpert.me/toolkit60

When VMware has KB instructions on how to rebuild vSphere 6.0 for Windows, or good instructions for conversion to VCSA, I might give that a try. Rebuilding vCenter is the only thing I haven't done.

jonretting
Enthusiast

There is a great "Fling" for transitioning to the appliance: VCS to VCVA Converter – VMware Labs, and the blog post here: http://www.virtuallyghetto.com/2015/03/long-awaited-fling-windows-vcenter-server-to-vcsa-converter-a...

I have used it and it works brilliantly. TBH I really think Windows vCenter has been made obsolete by 6.0 with all its new capabilities.

Cheers

Bleeder
Hot Shot

A quick warning.. That fling is great.. when it works.  There is still a bug where it doesn't work if your vCenter is on Windows Server 2012 R2, so don't waste your time if you're in that situation.  I'm not sure what's taking so long to fix that issue.  This is probably why the fling isn't officially supported yet.

elerium
Hot Shot

I finally fixed the storage provider issue! After reading a new KB article on how to review certificates inside VECS (VMware KB: Manually reviewing certificates in VMware Endpoint Certificate Store for vSphere 6.0) and reading more about VECS in the vSphere 6.0 Documentation Center, I found that my SMS certificate had expired. I tried C:\Program Files\VMware\vCenter Server\vmcad\certificate-manager.bat, option 6 "Replace Solution User Certificates with VMCA Certificates", but it did not correct the problem. Deleting the SMS store from VECS and restarting all the services fixed it for me (take a backup before you do this!). For anyone with the same issue, the command was: C:\Program Files\VMware\vCenter Server\vmafdd>vecs-cli store delete --name SMS

You can check if your SMS certificate expired before deleting it by running C:\Program Files\VMware\vCenter Server\vmafdd>vecs-cli entry list --store SMS --text
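
If there are several entries to go through, the expiry check can be scripted. This is a minimal Python sketch under one big assumption: that the `--text` output prints each certificate's validity in the usual OpenSSL text layout (`Not After : Apr 28 12:00:00 2015 GMT`). The sample text and the `expiry_status` helper are hypothetical, not captured from a real vCenter, so compare against your own output before relying on it.

```python
import re
from datetime import datetime, timezone

# Assumed OpenSSL-style validity line, e.g. "Not After : Apr 28 12:00:00 2015 GMT".
NOT_AFTER_RE = re.compile(
    r"Not After\s*:\s*(?P<date>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2} \d{4}) GMT"
)


def expiry_status(text, now=None):
    """Return a list of (not_after, is_expired) for each certificate found."""
    now = now or datetime.now(timezone.utc)
    results = []
    for match in NOT_AFTER_RE.finditer(text):
        # Collapse the double space OpenSSL uses for single-digit days.
        cleaned = " ".join(match.group("date").split())
        # %b parses English month names in the default C locale.
        not_after = datetime.strptime(cleaned, "%b %d %H:%M:%S %Y")
        not_after = not_after.replace(tzinfo=timezone.utc)
        results.append((not_after, not_after < now))
    return results


# Hypothetical sample; real output has many more fields.
sample_vecs = """
        Validity
            Not Before: Apr 28 12:00:00 2014 GMT
            Not After : Apr 28 12:00:00 2015 GMT
"""

for not_after, expired in expiry_status(sample_vecs):
    print(not_after.isoformat(), "EXPIRED" if expired else "ok")
```

The text would come from saving the output of the `vecs-cli entry list --store SMS --text` command above to a file and reading it in.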

In regards to converting from VCS to VCSA, the VCSA doesn't support VUM yet, so I would need another Windows machine to run VUM. Until VUM support comes to the VCSA, I probably will not switch platforms.

Bleeder
Hot Shot

Sounds like KB 2079087 and 2078070 need an update for vSphere 6.0.  Anyway, I wish that was my problem, but I checked and my SMS certificate is good for quite a while yet.

jonretting
Enthusiast

"A quick warning.. That fling is great.. when it works.  There is still a bug where it doesn't work if your vCenter is on Windows Server 2012 R2, so don't waste your time if you're in that situation.  I'm not sure what's taking so long to fix that issue.  This is probably why the fling isn't officially supported yet."

My jaw dropped when I read that. What a shame. Guess I should be thankful the VCS was on 2008 R2...

Great warning
