VMware Cloud Community
TheITGuySean
Contributor

Unable to add ESXi 5.1.0 Host to a vCenter 5.1.0 Server.

I have four ESXi blades in an enclosure. Because the previous admin left a messy, partially broken VDS configuration, I created a new, proper VDS and disconnected the four hosts from vCenter. On each host I configured a new simple virtual switch and blew away the broken VDS config, then reconnected the hosts to vCenter and added them to the new VDS.

At least, that was the plan. Two of the hosts are working perfectly after following those steps. The other two also seemed to work at first, but shortly after being added to the VDS, they disconnected from vCenter.

All the VMs on those hosts are still functioning, and I can connect to the hosts directly with the vSphere client and have complete control of them. However, they will NOT reconnect to the vCenter instance.

I have tried the Add Host wizard from both the vSphere Client and the web interface of the vCenter server. Both first show the certificate confirmation warning (we don't have signed certificates, so that's normal) and then just sit until they time out.

I have tried resetting the networking on the host back to a simple virtual switch, removing the stale vpxuser account, restarting the management services, restarting all services, and restarting the entire blade and the vCenter server, all to no avail.

Does anyone have any tips on what the issue might be, or which log files to check to see what is complaining?

6 Replies
msripada
Virtuoso

Hello Sean,

After adding the ESXi hosts, how long do they stay connected?

Can you check if there are any port 902 issues? Are hosts holding any old DVS traces?
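
A quick way to rule out basic reachability problems on port 902 is a small TCP connect test run from the vCenter server. The sketch below is only an illustration: the function name `tcp_port_open` is made up for this example, and it tests TCP only. The host-to-vCenter heartbeat uses UDP 902, which this check does not cover.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, calling `tcp_port_open("host1.domain.local", 902)` and `tcp_port_open("host1.domain.local", 443)` from the vCenter server, for both a working and a failing host, shows whether the two behave differently at the TCP level. A passing result does not prove the management agents are healthy.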

Check vpxd.log on the vCenter server, in c:\programdata\vmware\virtualcenter\logs

Check hostd.log on the ESXi host, in /var/log
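
Both logs are noisy, so it can help to filter them down to error and warning entries first. The following is a minimal sketch that assumes the line layout seen in hostd.log and vpxd.log (timestamp, then `[thread level 'component' ...]`, then the message, with continuation lines starting with `-->`). The names `LINE_RE` and `filter_entries` are invented for this example, and it is meant to run with Python 3 on a workstation against a copy of the log.

```python
import re
from typing import Iterable, Iterator

# Header line, e.g.:
# 2017-03-09T12:44:03.205Z [387E2B90 error 'SoapAdapter.HTTPService'] Failed to read request
LINE_RE = re.compile(
    r"^(?P<ts>\S+) \[(?P<thread>\S+) (?P<level>\w+) '(?P<component>[^']*)'[^\]]*\] (?P<msg>.*)$"
)


def filter_entries(lines: Iterable[str], levels=("error", "warning")) -> Iterator[str]:
    """Yield header lines whose level is in `levels`, plus their '-->' continuation lines."""
    keep = False
    for raw in lines:
        line = raw.rstrip("\n")
        match = LINE_RE.match(line)
        if match:
            # A new entry starts; decide whether to keep it and its continuations.
            keep = match.group("level") in levels
        elif not line.startswith("-->"):
            # Blank or unrecognized line: skip it without changing state.
            continue
        if keep:
            yield line
```

Typical use would be `for line in filter_entries(open("hostd.log")): print(line)`. If the real log format differs from the assumption above, the regular expression needs adjusting.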

Thanks,

Ms

fomer
Contributor

Hello,

What is the build number of your vCenter?

Thanks,

Fred

TheITGuySean
Contributor

vCenter build: 3814779

ESXi build: 799733

These hosts connected to this vCenter as recently as Tuesday. They simply stopped talking once I removed them and tried to re-add them.

TheITGuySean
Contributor

The ESXi hosts won't connect at all now. When I re-added them after rebuilding the networking on the host, they stayed connected for about 20 seconds and then showed as disconnected with a red X. I removed them, and now they won't re-add at all. I suspect they did not disconnect cleanly and some stale config is keeping them from connecting.

I do not have port 902 issues, as the other two hosts on the exact same VLAN connected without problems.

I do not know what DVS traces are, but that sounds like a good place to look. Can you provide more info? I will google it in the meantime.

There is nothing in hostd.log that gives me a clue to the issue. The output while attempting to connect is below:

2017-03-09T12:43:40.076Z [38ED0B90 verbose 'Cimsvc'] Ticket issued for CIMOM version 1.0, user root

2017-03-09T12:43:40.124Z [38C16B90 verbose 'vim.PerformanceManager'] HostCtl Exception in stats collection: Sysinfo error on operation returned status : Not found. Please see the VMkernel log for detailed error information

2017-03-09T12:43:40.124Z [38C16B90 verbose 'vim.PerformanceManager'] HostCtl Exception in stats collection.  Turn on 'trivia' log for details

2017-03-09T12:43:46.751Z [38C16B90 verbose 'SoapAdapter'] Responded to service state request

2017-03-09T12:44:00.124Z [39143B90 verbose 'vim.PerformanceManager'] HostCtl Exception in stats collection: Sysinfo error on operation returned status : Not found. Please see the VMkernel log for detailed error information

2017-03-09T12:44:00.125Z [39143B90 verbose 'vim.PerformanceManager'] HostCtl Exception in stats collection.  Turn on 'trivia' log for details

2017-03-09T12:44:03.205Z [FFFC3D20 verbose 'Default'] Timed out reading between HTTP requests. : Read timeout after approximately 50000ms. Closing stream TCP(local=127.0.0.1:8307, peer=127.0.0.1:60146)

2017-03-09T12:44:03.205Z [FFFC3D20 verbose 'Default'] Timed out reading between HTTP requests. : Read timeout after approximately 50000ms. Closing stream TCP(local=127.0.0.1:8307, peer=127.0.0.1:53840)

2017-03-09T12:44:03.205Z [387E2B90 error 'SoapAdapter.HTTPService'] Failed to read request; stream: TCP(<null>), error: N7Vmacore16TimeoutExceptionE(Operation timed out)

2017-03-09T12:44:03.206Z [38C16B90 error 'SoapAdapter.HTTPService'] Failed to read request; stream: TCP(<null>), error: N7Vmacore16TimeoutExceptionE(Operation timed out)

2017-03-09T12:44:46.780Z [387A1B90 verbose 'SoapAdapter'] Responded to service state request

2017-03-09T12:44:51.077Z [38C16B90 info 'Vmomi'] Activation [N5Vmomi10ActivationE:0x3856cd10] : Invoke done [waitForUpdates] on [vmodl.query.PropertyCollector:ha-property-collector]

2017-03-09T12:44:51.077Z [38C16B90 verbose 'Vmomi'] Arg version:

--> "1"

2017-03-09T12:44:51.078Z [38C16B90 info 'Vmomi'] Throw vmodl.fault.RequestCanceled

2017-03-09T12:44:51.078Z [38C16B90 info 'Vmomi'] Result:

--> (vmodl.fault.RequestCanceled) {

-->    dynamicType = <unset>,

-->    faultCause = (vmodl.MethodFault) null,

-->    msg = "",

--> }

In the vpxd.log file on the vCenter server, I see:

2017-03-09T08:43:08.485-04:00 [06780 info 'commonvpxLro' opID=f2a3fee8] [VpxLRO] -- BEGIN task-internal-3038 -- datacenter-21 -- vim.Datacenter.queryConnectionInfo -- 2d62eb7c-d5db-c635-85d6-bab0fb6e2535(52245080-aabf-81eb-e9f3-47c82b29b453)

2017-03-09T08:43:08.500-04:00 [07068 error 'Default'] SSLStreamImpl::DoClientHandshake (00000000128db2c0) SSL_connect failed. Dumping SSL error queue:

2017-03-09T08:43:08.500-04:00 [07068 error 'Default'] [0] error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed

2017-03-09T08:43:08.500-04:00 [07068 error 'HttpConnectionPool-000001'] [ConnectComplete] Connect failed to <cs p:000000000cf64bb0, TCP:host1.domain.local:443>; cnx: (null), error: class Vmacore::Ssl::SSLVerifyException(SSL Exception: Verification parameters:

--> PeerThumbprint: 1F:97:69:A6:21:E2:9D:BD:5D:3F:A0:AD:54:F6:FE:52:0D:AD:EF:44

--> ExpectedThumbprint:

--> ExpectedPeerName: host1.domain.local

--> The remote host certificate has these problems:

-->

--> * The host certificate chain is not complete.

-->

--> * Host name does not match the subject name(s) in certificate.

-->

--> * unable to get local issuer certificate)

2017-03-09T08:43:08.500-04:00 [06780 info 'commonvpxLro' opID=f2a3fee8] [VpxLRO] -- FINISH task-internal-3038 -- datacenter-21 -- vim.Datacenter.queryConnectionInfo --

2017-03-09T08:43:08.500-04:00 [06780 info 'Default' opID=f2a3fee8] [VpxLRO] -- ERROR task-internal-3038 -- datacenter-21 -- vim.Datacenter.queryConnectionInfo: vim.fault.SSLVerifyFault:

--> Result:

--> (vim.fault.SSLVerifyFault) {

-->    dynamicType = <unset>,

-->    faultCause = (vmodl.MethodFault) null,

-->    selfSigned = false,

-->    thumbprint = "1F:97:69:A6:21:E2:9D:BD:5D:3F:A0:AD:54:F6:FE:52:0D:AD:EF:44",

-->    msg = "",

--> }

--> Args:

-->

2017-03-09T08:43:11.230-04:00 [07072 info 'commonvpxLro' opID=54f30d1] [VpxLRO] -- BEGIN task-internal-3039 -- datacenter-21 -- vim.Datacenter.queryConnectionInfo -- 2d62eb7c-d5db-c635-85d6-bab0fb6e2535(52245080-aabf-81eb-e9f3-47c82b29b453)

2017-03-09T08:43:18.780-04:00 [07076 info 'commonvpxLro' opID=a0c2b16] [VpxLRO] -- BEGIN task-internal-3040 --  -- vmodl.query.PropertyCollector.retrievePropertiesEx -- 2d62eb7c-d5db-c635-85d6-bab0fb6e2535(52245080-aabf-81eb-e9f3-47c82b29b453)

2017-03-09T08:43:18.780-04:00 [07076 info 'commonvpxLro' opID=a0c2b16] [VpxLRO] -- FINISH task-internal-3040 --  -- vmodl.query.PropertyCollector.retrievePropertiesEx --

2017-03-09T08:44:21.288-04:00 [07052 info 'commonvpxLro' opID=D40802EE-0000031A-6b] [VpxLRO] -- BEGIN task-internal-3041 --  -- vmodl.query.PropertyCollector.cancelWaitForUpdates -- 52322d92-1f6b-c139-ad76-de650082a274(52f02d94-81fa-0443-86b2-34ae8c86ee4a)

2017-03-09T08:44:21.288-04:00 [07052 info 'commonvpxLro' opID=D40802EE-0000031A-6b] [VpxLRO] -- FINISH task-internal-3041 --  -- vmodl.query.PropertyCollector.cancelWaitForUpdates --

2017-03-09T08:44:21.288-04:00 [07076 error 'SoapAdapter.HTTPService.HttpConnection' opID=D40802EE-00000319-cf] Connection lost while waiting for the next request on stream TCPStreamWin32(socket=TCP(fd=3996) local=127.0.0.1:8085,  peer=127.0.0.1:62485): class Vmacore::SystemException(An established connection was aborted by the software in your host machine. )

There is not much else, apart from repeated notifications about data gathering from the two hosts that are currently connected. It looks to me like the expected SSL verification failure from the unsigned certificate, followed by a failure to connect.
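
The 20-byte PeerThumbprint in the vpxd.log output has the length of a SHA-1 digest, so one sanity check is to fetch the certificate the host actually presents on port 443 and compare its SHA-1 thumbprint with the logged value. The sketch below is a rough illustration under that assumption. The function names are invented for this example, and a recent Python/OpenSSL build may refuse to negotiate with the older TLS versions that an ESXi 5.1 host offers.

```python
import hashlib
import socket
import ssl


def format_thumbprint(der_cert: bytes) -> str:
    """Return the SHA-1 digest of a DER certificate as colon-separated uppercase hex."""
    digest = hashlib.sha1(der_cert).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def fetch_thumbprint(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Connect to host:port without verifying the certificate and return its thumbprint."""
    context = ssl.create_default_context()
    # Verification is disabled on purpose: the goal is only to read the
    # certificate, which is expected to fail normal validation here.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            der_cert = tls.getpeercert(binary_form=True)
    return format_thumbprint(der_cert)
```

If `fetch_thumbprint("host1.domain.local")` returns a value different from the PeerThumbprint in vpxd.log, vCenter may be reaching a different endpoint than expected. A matching value only confirms that the certificate is the same one, not that it is trusted.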

TheITGuySean
Contributor

/var/log # esxcli network vswitch dvs vmware list
/var/log # esxcli network vswitch standard list
vSwitch0
   Name: vSwitch0
   Class: etherswitch
   Num Ports: 128
   Used Ports: 9
   Configured Ports: 128
   MTU: 1500
   CDP Status: listen
   Beacon Enabled: false
   Beacon Interval: 1
   Beacon Threshold: 3
   Beacon Required By:
   Uplinks: vmnic1, vmnic0
   Portgroups: 200, VMkernel3, VMkernel2, 201

There are no VDS configurations left on this host, and the standard vSwitch in place lets me connect to the host via telnet or the vSphere Client. It simply refuses to connect to vCenter.
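
To compare the switch layout of a working host against a failing one, the text output of `esxcli network vswitch standard list` can be turned into a dictionary and diffed. This is only a sketch that assumes the layout shown above (an unindented switch name followed by indented `Key: value` lines). The helper names `parse_vswitch_list` and `split_list` are made up for illustration.

```python
def parse_vswitch_list(text: str) -> dict:
    """Parse 'esxcli network vswitch standard list' output into {switch: {key: value}}.

    Pass only the command output, not the shell prompt lines.
    """
    switches = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines between entries
        if not line.startswith(" "):
            # An unindented line starts a new switch section.
            current = switches.setdefault(line.strip(), {})
        elif current is not None and ":" in line:
            key, _, value = line.strip().partition(":")
            current[key.strip()] = value.strip()
    return switches


def split_list(value: str) -> list:
    """Split a comma-separated field such as Uplinks or Portgroups into a list."""
    return [item.strip() for item in value.split(",") if item.strip()]
```

With the output saved from two hosts, something like `split_list(parsed["vSwitch0"]["Uplinks"])` makes it easy to spot a missing uplink or port group on the host that refuses to connect.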

TheITGuySean
Contributor

Managed to resolve it. I had to build a new standard vSwitch and migrate the machines, uplinks, and vmks over to it. Once I deleted the old standard vSwitch, the host attached. One of the hosts hiccuped: it seemed to stall, then disconnected. I had to delete the switch a second time and rebuild it once again, and now both hosts are attached and happily migrated to the proper VDS configuration.
