VMware Horizon Community
michalhoppe
Contributor

VDI inoperable - unable to connect to vCenter in View Administrator

Greetings,

I have a puzzling, intermittent issue where the View Administrator loses connectivity to vCenter.  The problem manifests itself in the View Administrator with the following errors:

  • The service is not working properly.  Certificate is untrusted but the thumbprint for the certificate is accepted.
  • Cannot connect to the vCenter Server vcenter.mycompany.tld because the user name or password is not valid.

Looking at the View Connection Server and the vCenter logs, it looks like there are 2 possibilities:

I checked the notrealviewusername AD group memberships: it belongs to 9 groups.  This is well within the acceptable limit.

I also added our AD domain to the SSO default domain list.

However, the issue persists intermittently.

Digging into this further, I focused on the SSO server and found the following from imsSystem.log:

2013-08-22 10:38:44,815, ,<key>,,<SSO server IP>,CONN_POOL_GET_CONNECTION,16158,FAIL,LDAP_CONNECTION_FAILED,SYSTEM,SYSTEM,SYSTEM,SYSTEM,SYSTEM,SYSTEM,SYSTEM,slot-0-bind,,,,,,

So it looks like SSO can't bind to our AD through LDAP.  However, connectivity between the SSO server and the AD servers configured in SSO was confirmed while the problem was occurring.  The SSO server's Windows event log is also clear of any AD/LDAP errors.
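To see how often and when the bind failures occur, the timestamps can be pulled out of imsSystem.log. This is a minimal sketch that assumes every entry starts with a timestamp in the same format as the excerpt above; the function name `ldap_failure_times` is only an illustration, and the meaning of the other fields is not assumed.

```python
import re
from datetime import datetime

# Leading timestamp as it appears in the excerpt: "2013-08-22 10:38:44,815"
TS = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})")


def ldap_failure_times(lines):
    """Return a datetime for every line that reports LDAP_CONNECTION_FAILED."""
    times = []
    for line in lines:
        if "LDAP_CONNECTION_FAILED" not in line:
            continue
        match = TS.match(line)
        if match:
            # %f accepts the three-digit millisecond part after the comma
            times.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f"))
    return times
```

Feeding it the open log file (for example `ldap_failure_times(open(path))`, where `path` is wherever imsSystem.log lives on the SSO server) gives a list of times that can be compared against the Windows event log.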

I found that restarting the SSO service and letting it settle down for 5 minutes clears the issue.

This problem renders our View infrastructure inoperative from time to time, as both management and connections to new or existing desktops are affected.

Has anyone else seen this?

- M

7 Replies
Linjo
Leadership

What versions of View, vCenter and vSphere are you using?

How many connection-brokers?

How many users and pools do you have?

How often does this happen?

// Linjo

michalhoppe
Contributor

Linjo,

What versions of View, vCenter and vSphere are you using?

  • View 5.2
  • vCenter 5.1
  • vSphere 5.1

How many connection-brokers?

  • 2 external (with RSA 2 factor auth)
  • 2 internal (AD auth only)

How many users and pools do you have?

  • Testing only at the moment: 50 users, 4 pools

How often does this happen?

  • intermittent - once every 1-3 weeks

- M

Vdiallstar
Contributor

Is time configured correctly between AD and the connection server?

Ensure connection server, vCenter & host are pulling from the same time source

michalhoppe
Contributor

Vdiallstar,

Vdiallstar wrote:

Is time configured correctly between AD and the connection server?

Ensure connection server, vCenter & host are pulling from the same time source

All of the servers are part of the same domain.  They all use the same time source.

Thanks,

Michal

michalhoppe
Contributor

The issue happened again just a few minutes ago.  To add another bit of information, I noticed this event log warning that roughly correlates with the time of the SSO error:

Event ID 4227 Warning TCPIP

TCP/IP failed to establish an outgoing connection because the selected local endpoint was recently used to connect to the same remote endpoint. This error typically occurs when outgoing connections are opened and closed at a high rate, causing all available local ports to be used and forcing TCP/IP to reuse a local port for an outgoing connection. To minimize the risk of data corruption, the TCP/IP standard requires a minimum time period to elapse between successive connections from a given local endpoint to a given remote endpoint.

We have seen this once before.

I have an hourly scheduled task running that counts the number of TCP and UDP connections.  My logs indicate an average of 85 TCP and 30 UDP connections at any given time.  I have not seen a spike yet.  To get a finer-grained picture of what's happening with the connections on the system, I changed the logging interval from every hour to every 5 minutes.  Hopefully if there's a spike, we will catch it.  The script will dump all netstat process information whenever there are more than 100 TCP or UDP connections, so we should know which process is holding the most connections.
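For anyone who wants to set up the same kind of monitoring, here is a rough sketch of such a counter. It is not the script from this post; it assumes the usual column layout of Windows `netstat -ano` output (protocol first, PID last), and the names `summarize_netstat`, `over_threshold` and `collect` are made up for the example.

```python
import subprocess
from collections import Counter

THRESHOLD = 100  # the post dumps details above 100 TCP or UDP connections


def summarize_netstat(text):
    """Return (connections per protocol, connections per PID)."""
    protos, pids = Counter(), Counter()
    for line in text.splitlines():
        fields = line.split()
        # Data rows start with TCP or UDP; headers and blank lines are skipped
        if len(fields) < 4 or fields[0] not in ("TCP", "UDP"):
            continue
        protos[fields[0]] += 1
        pids[fields[-1]] += 1  # PID is the last column with -o
    return protos, pids


def over_threshold(protos, limit=THRESHOLD):
    """True when either protocol has more than `limit` connections."""
    return protos["TCP"] > limit or protos["UDP"] > limit


def collect():
    """Run netstat on the local machine and return its text output."""
    return subprocess.run(
        ["netstat", "-ano"], capture_output=True, text=True, check=True
    ).stdout
```

A scheduled task could call `summarize_netstat(collect())` every 5 minutes, log the two protocol counts, and write out `pids.most_common()` only when `over_threshold` returns true.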

If anyone has any ideas, please let me know!

- Michal

epa80
Hot Shot

Pulling this out of nowhere. Did you ever find a resolution to this? We're experiencing pretty much the same exact behavior. Both our Virtual Centers will show as disconnected in our View Console, with a message of "The service is not working properly. Certificate is untrusted but the thumbprint for the certificate is accepted", however, after maybe a minute or 2 of absolutely NO interaction by myself or my team, they go back to green and working just fine. At no point do we see an interruption of services. Does that mean it's not a problem? I guess, still annoying.

michalhoppe
Contributor

epa80,

If memory serves, it was a known issue with the vSphere 5.1 SSO service.  It does not close its TCP/IP connections properly, so they pile up until the server runs out of ports.  Restarting the service clears the error, and I believe there was an update that fixed it.
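If you want to check whether a process is eating into the outgoing port range before it gets that far, something along these lines could help. It is only a sketch: it assumes `netstat -ano` style output and the default Windows dynamic port range of 49152-65535 (a server may have been configured with a different range), and `ephemeral_usage` is a name made up for the example.

```python
from collections import Counter

# Default Windows dynamic (ephemeral) port range; an assumption, since the
# range can be changed per server.
DYNAMIC_PORTS = range(49152, 65536)


def ephemeral_usage(netstat_text):
    """Count non-listening TCP rows per PID whose local port is dynamic."""
    per_pid = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # TCP rows have five columns: proto, local, foreign, state, PID
        if len(fields) != 5 or fields[0] != "TCP":
            continue
        if fields[3] == "LISTENING":
            continue
        # rsplit also copes with IPv6 local addresses such as [::1]:49300
        local_port = int(fields[1].rsplit(":", 1)[1])
        if local_port in DYNAMIC_PORTS:
            per_pid[fields[4]] += 1
    return per_pid
```

Dividing `sum(per_pid.values())` by `len(DYNAMIC_PORTS)` gives a rough fraction of the range in use, and `per_pid.most_common(1)` points at the process holding the most ports.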

- Michal
