We have been having intermittent issues connecting to our vCenter server (4.0u1) with the vSphere Client. We typically get an error that data could not be read from the connection, or a 503: Service Unavailable. Restarting services didn't make a difference, but rebooting would bring it back for a few days. Then the cycle would begin again.
Using TCPView from Sysinternals on the server itself, I saw thousands of localhost (127.0.0.1) connections in the TCP state LAST_ACK. All of them originated from, and connected to, the VPXD process. If I close the connections using TCPView (since they're technically closed anyway), normal service is restored. I started watching the number of connections in this state and, as I suspected, the number grows exponentially over time.
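For anyone who wants to watch the same thing without TCPView, here is a minimal Python sketch that tallies loopback connections stuck in LAST_ACK by owning PID. It assumes text in the five-column layout that `netstat -ano` prints on Windows (Proto, Local Address, Foreign Address, State, PID). The function name `count_loopback_last_ack` and the sample rows (ports and PIDs) are made up for illustration and are not taken from the affected server.

```python
from collections import Counter


def count_loopback_last_ack(netstat_output: str) -> Counter:
    """Tally loopback TCP connections in LAST_ACK, keyed by owning PID."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: Proto, Local Address, Foreign Address, State, PID.
        # Header lines, UDP rows, and anything else are skipped.
        if len(fields) != 5 or fields[0] != "TCP":
            continue
        _, local, remote, state, pid = fields
        if state != "LAST_ACK":
            continue
        # Only count connections where both ends are on 127.0.0.1.
        if local.startswith("127.0.0.1:") and remote.startswith("127.0.0.1:"):
            counts[pid] += 1
    return counts


# Invented sample rows, for illustration only.
sample = """\
  TCP    127.0.0.1:49201    127.0.0.1:5000    LAST_ACK       1432
  TCP    127.0.0.1:49202    127.0.0.1:5000    LAST_ACK       1432
  TCP    127.0.0.1:49203    127.0.0.1:5000    ESTABLISHED    1432
  TCP    10.0.0.5:3389      10.0.0.9:51000    ESTABLISHED    980
"""

result = count_loopback_last_ack(sample)
print(result)
```

Saving the `netstat -ano` output to a file every few minutes and feeding each file to this function would give a rough picture of how quickly the count grows.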
Has anyone else encountered this issue? If so, is there a fix for it?
I have been tracking issues similar to this internally here at VMware. Do you have an SR open?
Cheers,
/Jonathan
I have not opened a request yet.
If you could, please open one and let me know the number. I have some steps that we are using to track down the problem. The more reported cases we see, the quicker we can get it resolved.
Cheers,
/Jonathan
SR# 1462009621 has been opened.
Perfect. I have taken ownership of it and will be in touch.
Just noticed this post and wanted to pass on some information. Your observations match ours to the letter, except that we had this situation with VI 3.5. The problem in our case was caused by BMC Performance Manager for Virtual Servers.
Using its API, this product (running from a separate server) attaches to vCenter, collects performance data, and then detaches without properly closing its session port (always 8085 on the server). We would observe hundreds of netstat entries like the one below:
TCP server:4677 server.domain.com:8085 LAST_ACK
Every few minutes BMC was reconnecting, starting a new session, reading more data, and abandoning yet another session in LAST_ACK. This would continue until no further connections were possible, including connections from Virtual Center clients (3.5 or 4.x).
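To see which server port the abandoned sessions pile up on, a rough Python sketch like the following can group LAST_ACK entries by remote port. It assumes lines in the four-column form shown above (protocol, local endpoint, remote endpoint, state). The helper name `last_ack_by_remote_port` is hypothetical, and every sample line other than the first is invented for illustration.

```python
from collections import Counter


def last_ack_by_remote_port(lines):
    """Count LAST_ACK entries per remote port."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Need at least: protocol, local endpoint, remote endpoint, state.
        if len(fields) < 4 or fields[0] != "TCP" or fields[3] != "LAST_ACK":
            continue
        # Split on the last colon so dotted host names stay intact.
        _, _, port = fields[2].rpartition(":")
        counts[port] += 1
    return counts


entries = [
    "TCP server:4677 server.domain.com:8085 LAST_ACK",     # form shown in the post
    "TCP server:4678 server.domain.com:8085 LAST_ACK",     # invented
    "TCP server:4679 server.domain.com:443 ESTABLISHED",   # invented
]

by_port = last_ack_by_remote_port(entries)
print(by_port.most_common(1))
```

If one port dominates the tally, as 8085 did in our case, that points at whichever client keeps opening sessions against it.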
BMC kept coming back to us with parameters to change, but to no avail. To us it looks like their interpretation of the API (or protocol) is flawed. We no longer run the BMC product and have not had the problem since. Perhaps you have BMC or something similar? If, however, any fix has been developed on the VMware end to snipe these sessions, it would be interesting to hear about it. In summary, no BMC = no observed problem.