VMware Cloud Community
NuggetGTR
VMware Employee

vCenter 5.1 backtrace host disconnection

Hi all,

Wow, I can say I have had the worst experience with the 5.1 upgrade.

I have never had an issue upgrading vSphere since v3.0.2, but I guess I was bound to hit a snag eventually.

I had planned to upgrade my environment of 200 hosts and 4000 virtuals from 5 to 5.1. I made sure all prerequisites were met and then proceeded with the first vCenter, which has about 60 hosts and 2000 virtuals.

SSO installed fine,

vCenter inventory service upgraded fine,

vCenter Server upgrade bombed out with the 26002 error, not being able to register with the inventory service, then rolled back... they should have called it a complete uninstall, not a rollback.

Proceeded to install again from scratch... same error.

Removed the SSL folder and the install worked fine, but the profile storage service bombed out,

Finally got the install working, and the vCenter database was screwed.

So I thought I would quit while I was down, rolled back to the snapshot of the vCenter, and asked the DBAs to restore the database... DBAs: oops, we overwrote the backup and we only keep one day... WTF!!!!

So I bit the bullet, spent 2 days without sleep, and rebuilt everything from scratch: new vCenter, new database, new inventory service install, new SSO install. Then I attached the 60 ESXi 5 hosts.

Now everything looked to be running OK, but I have an issue: the ESXi hosts all disconnect and then reconnect seconds later. Every time this happens, the vCenter hits 100% CPU and becomes unresponsive until all the ESXi hosts have reconnected. This is annoying because things like vMotions fail due to the disconnects.

I am getting some SSL EOF / backtrace errors in the vpxd logs. I currently have my VMware TAM taking it back to the engineers, but thought I would post to see if anyone has seen this. I don't use CA certs, just the standard self-signed certs from the default install.

My plan is to upgrade the hosts to 5.1, but I just want to make sure the vCenter is fine first.

I can't upload logs or anything because this is a secure site, but I will attach below a snippet of the errors seen when the disconnects happen. I have seen similar errors reported, but they don't fit with what I'm seeing.

2012-09-29T00:04:54.345+10:00 [04556 error 'Default'] SSLStreamImpl::DoClientHandshake for SSL(TCPClientSocket(this=00000000ae5ef0b0, state=CONNECTED, _connectSocket=TCP(fd=-1), error=(null)) TCPStreamWin32(socket=TCP(fd=66872) local=vcenterIP:3758,  peer=hostIP:443)): SSL_connect failed with Unexpected EOF
2012-09-29T00:04:54.361+10:00 [06624 error 'Default'] SSLStreamImpl::DoClientHandshake for SSL(TCPClientSocket(this=00000000ae5fa0f0, state=CONNECTED, _connectSocket=TCP(fd=-1), error=(null)) TCPStreamWin32(socket=TCP(fd=67984) local=0.0.0.0:3757,  peer=hostip:443)): SSL_connect failed with Unexpected EOF
2012-09-29T00:04:54.361+10:00 [04556 error 'HttpConnectionPool-000000'] [ConnectComplete] Connect failed to <cs p:0000000081188820, TCP:host4.local:443>; cnx: (null), error: class Vmacore::Ssl::SSLException(SSL Exception: Unexpected EOF)
2012-09-29T00:04:54.361+10:00 [06624 error 'HttpConnectionPool-000000'] [ConnectComplete] Connect failed to <cs p:000000008ac4bab0, TCP:host13.local:443>; cnx: (null), error: class Vmacore::Ssl::SSLException(SSL Exception: Unexpected EOF)
2012-09-29T00:04:54.361+10:00 [11868 warning 'VpxProfiler' opID=HB-host-310@1498093-59042227] [VpxdHostSync] GetChanges host:host1.local (IP Address) [GetChangesTime] took 42281 ms
2012-09-29T00:04:54.361+10:00 [11868 warning 'VpxProfiler' opID=HB-host-310@1498093-59042227] [VpxdHostSync] DoHostSync:000000000A1D5660 [DoHostSyncTime] took 42281 ms
2012-09-29T00:04:54.361+10:00 [11868 warning 'vpxdvpxdInvtHostCnx' opID=HB-host-310@1498093-59042227] [VpxdInvtHostSyncHostLRO] DoHostSync failed for host host-310
2012-09-29T00:04:54.361+10:00 [11868 warning 'vpxdvpxdInvtHostCnx' opID=HB-host-310@1498093-59042227] [VpxdInvtHostSyncHostLRO] Host sync failed to host-310
2012-09-29T00:04:54.361+10:00 [11868 error 'vpxdvpxdInvtHostCnx' opID=HB-host-310@1498093-59042227] [VpxdInvtHostSyncHostLRO] FixNotRespondingHost failed for host host-310, marking host as notResponding
2012-09-29T00:04:54.361+10:00 [07776 error 'Default'] SSLStreamImpl::DoClientHandshake for SSL(TCPClientSocket(this=000000009f5e2a20, state=CONNECTED, _connectSocket=TCP(fd=-1), error=(null)) TCPStreamWin32(socket=TCP(fd=63436) local=vcenterIP:3763,  peer=HostIP:443)): SSL_connect failed with Unexpected EOF
2012-09-29T00:04:54.361+10:00 [10592 info 'vpxdvpxdVmomi' opID=HB-host-1772@1212909-1f45a976] [ClientAdapterBase::InvokeOnSoap] Invoke done (host14.local, vpxapi.VpxaService.retrieveChanges)
2012-09-29T00:04:54.361+10:00 [07776 error 'HttpConnectionPool-000000'] [ConnectComplete] Connect failed to <cs p:0000000089de4520, TCP:host10.local:443>; cnx: (null), error: class Vmacore::Ssl::SSLException(SSL Exception: Unexpected EOF)
2012-09-29T00:04:54.361+10:00 [02892 error 'vpxdvpxdVmomi' opID=HB-host-1884@255600-2cfa0bd7] [VpxdClientAdapter] Got vmacore exception: SSL Exception: Unexpected EOF
2012-09-29T00:04:54.361+10:00 [02892 error 'vpxdvpxdVmomi' opID=HB-host-1884@255600-2cfa0bd7] [VpxdClientAdapter] Backtrace:
--> backtrace[00] rip 0000000180100c98
--> backtrace[01] rip 0000000180101fae
--> backtrace[02] rip 000000018008aeab
--> backtrace[03] rip 0000000180004eb4
--> backtrace[04] rip 000000018011fea2
--> backtrace[05] rip 000000018005d1fc
--> backtrace[06] rip 00000001800523dc
--> backtrace[07] rip 000000018011a0e5
--> backtrace[08] rip 00000001801a9668
--> backtrace[09] rip 00000001801a9aed
--> backtrace[10] rip 00000001801ab260
--> backtrace[11] rip 000000018019b62a
--> backtrace[12] rip 0000000078622fdf
--> backtrace[13] rip 0000000078623080
--> backtrace[14] rip 0000000077d6b71a
--> backtrace[15] rip 0000000000000000
-->
2012-09-29T00:04:54.361+10:00 [08720 error 'Default'] SSLStreamImpl::DoClientHandshake for SSL(TCPClientSocket(this=00000000a2a7fa70, state=CONNECTED, _connectSocket=TCP(fd=-1), error=(null)) TCPStreamWin32(socket=TCP(fd=62428) local=vcenterIP:3766,  peer=hostip:443)): SSL_connect failed with Unexpected EOF
2012-09-29T00:04:54.361+10:00 [08720 error 'HttpConnectionPool-000000'] [ConnectComplete] Connect failed to <cs p:000000000997f640, TCP:host16.local:443>; cnx: (null), error: class Vmacore::Ssl::SSLException(SSL Exception: Unexpected EOF)
2012-09-29T00:04:54.361+10:00 [07920 error 'vpxdvpxdVmomi' opID=HB-host-1740@75877-52bb6b88] [VpxdClientAdapter] Got vmacore exception: SSL Exception: Unexpected EOF
2012-09-29T00:04:54.361+10:00 [07920 error 'vpxdvpxdVmomi' opID=HB-host-1740@75877-52bb6b88] [VpxdClientAdapter] Backtrace:
--> backtrace[00] rip 0000000180100c98
--> backtrace[01] rip 0000000180101fae
--> backtrace[02] rip 000000018008aeab
--> backtrace[03] rip 0000000180004eb4
--> backtrace[04] rip 000000018011fea2
--> backtrace[05] rip 000000018005d1fc
--> backtrace[06] rip 00000001800523dc
--> backtrace[07] rip 000000018011a0e5
--> backtrace[08] rip 00000001801a9668
--> backtrace[09] rip 00000001801a9aed
--> backtrace[10] rip 00000001801ab260
--> backtrace[11] rip 000000018019b62a
--> backtrace[12] rip 0000000078622fdf
--> backtrace[13] rip 0000000078623080
--> backtrace[14] rip 0000000077d6b71a
--> backtrace[15] rip 0000000000000000
-->
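
For anyone wanting to see which hosts are affected in a log like the one above, here is a minimal sketch in Python that tallies the "Connect failed ... Unexpected EOF" lines per host and collects the GetChanges timings. The regular expressions are written only against the line shapes visible in this snippet, so treat them as an assumption; other vpxd versions may log differently. The function name `summarize` and the file name in the usage note are made up for illustration.

```python
import re
from collections import Counter

# Assumed line shape, taken from the HttpConnectionPool errors in the snippet:
# <timestamp> [<thread> error 'HttpConnectionPool-NNN'] [ConnectComplete]
#   Connect failed to <cs p:<ptr>, TCP:<host>:<port>>; ... Unexpected EOF)
CONNECT_FAILED = re.compile(
    r"^(?P<ts>\S+) \[(?P<thread>\d+) error 'HttpConnectionPool-\d+'\] "
    r"\[ConnectComplete\] Connect failed to <cs p:[0-9a-fA-F]+, "
    r"TCP:(?P<host>[^:>]+):(?P<port>\d+)>.*Unexpected EOF"
)

# Assumed shape of the VpxProfiler timing warnings in the snippet.
SLOW_SYNC = re.compile(r"\[GetChangesTime\] took (?P<ms>\d+) ms")


def summarize(lines):
    """Return (failures per host, list of GetChanges durations in ms)."""
    failures = Counter()
    sync_ms = []
    for line in lines:
        match = CONNECT_FAILED.search(line)
        if match:
            failures[match.group("host")] += 1
            continue
        match = SLOW_SYNC.search(line)
        if match:
            sync_ms.append(int(match.group("ms")))
    return failures, sync_ms
```

Usage would be something like `failures, sync_ms = summarize(open("vpxd.log", errors="replace"))` (the file name is hypothetical), then `failures.most_common()` to list the hosts with the most failed handshakes. If every host shows up at about the same rate, that points at the vCenter side rather than any single host.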

Any ideas?

________________________________________ Blog: http://virtualiseme.net.au VCDX #201 Author of Mastering vRealize Operations Manager
NuggetGTR
VMware Employee

Narrowed it down to an inventory service issue.

Today I came into work and everything was running great, until I tried to use the web client, which informed me that the inventory service had stopped. Started it again and bam!! Hosts started disconnecting again, with a heap of Java errors in the inventory service logs. Logged a case with VMware today.

NuggetGTR
VMware Employee

Well, sorted it out.

The EOF errors were because the inventory service was on the same disk as vCenter Server, and it looks like it was causing disk contention. Moved it off onto its own drive and that solved that issue.

The disconnections were an SSL issue, and upgrading ESXi from 5 to 5.1 fixed those.

All sweet :)

suhaakin
Contributor

Hello NuggetGTR,

Could you please check the KB below? It looks like the same issue.

http://kb.vmware.com/kb/2064246
