VMware Cloud Community
aha_tom
Enthusiast

vCloud Director 5.1 prepare host failed

We have a vSphere 5.1 cluster with 4 hosts, running a mix of ESXi 5.1 and 4.1.

3 of the 4 hosts are prepared and working in a vCloud Director 5.1 Provider VDC.

Recently I upgraded the remaining host from ESXi 4.1 to 5.1. After the upgrade, I tried to prepare it for the vCloud Director Provider VDC, but ran into an error:

Failures occurred during prepare of host XXXXX

- java.net.SocketTimeoutException: Read timed out

- Read timed out

I have tried the following:

- Made sure there are no VMs running on the host. The host does actually go into Maintenance mode each time I run the preparation.

- Rebooted the ESXi host

- Enabled httpClient (ports 80, 443) on the ESXi firewall

- Ran esxcli software vib list | grep vcloud to check whether the agent is already installed. Nothing showed up.

None of this made any difference. The installation still times out.

Has anyone seen this before?

11 Replies
IamTHEvilONE
Immortal

Most of the time when I see something like this, it's either a local storage issue or a network issue. Granted, that's 'most times'.

The error message in your post is a Java-level network error. So vCloud is doing its thing and the problem is somewhere else.

I know that might not help much, but it should help explain the error state you are currently in.
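
To illustrate what a "Read timed out" error means in general, here is a minimal Python sketch (not vCloud Director code). It assumes a hypothetical local server that accepts a connection but takes too long to answer, which is roughly the situation when the remote side is busy with slow storage. The function names, delays and timeouts are made up for the demonstration.

```python
import socket
import threading
import time


def slow_server(listener: socket.socket, delay: float) -> None:
    """Accept one connection, then wait `delay` seconds before replying."""
    conn, _ = listener.accept()
    with conn:
        time.sleep(delay)
        try:
            conn.sendall(b"done")
        except OSError:
            pass  # the client already gave up and closed the connection


def read_with_timeout(port: int, timeout: float) -> str:
    """Connect, then wait at most `timeout` seconds for a reply."""
    with socket.create_connection(("127.0.0.1", port), timeout=timeout) as client:
        try:
            return client.recv(16).decode()
        except socket.timeout:
            # The connection itself is fine; the peer was just too slow.
            return "Read timed out"


def demo(delay: float, timeout: float) -> str:
    """Run one client/server exchange on an ephemeral loopback port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]
    worker = threading.Thread(target=slow_server, args=(listener, delay), daemon=True)
    worker.start()
    try:
        return read_with_timeout(port, timeout)
    finally:
        worker.join()
        listener.close()


if __name__ == "__main__":
    print(demo(delay=0.5, timeout=0.1))  # server slower than the client's patience
    print(demo(delay=0.0, timeout=1.0))  # server answers in time
```

The point of the sketch is that the network path can be perfectly healthy while the read still times out, because the side that is supposed to answer is stalled on something else.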

aha_tom
Enthusiast

I called VMware support and, through esxtop, we could see an abnormally high DAVG/cmd value on the USB driver. Possibly something is wrong with the local embedded USB disk. I am going to retry the ESXi installation and see how it goes.

IamTHEvilONE
Immortal

Sure thing. Keep us updated. Unsupported or untested USB keys were an issue in one environment, since writes to them were just too slow. As soon as they reinstalled with boot from SAN it was just fine.

aha_tom
Enthusiast

Tried re-installing ESXi a couple of times. Same result. Still cannot prepare the host for vCloud Director. Escalating the issue to senior VMware support.

aha_tom
Enthusiast

OK, it's been a week and the issue is still unresolved. VMware support tried a number of things but still could not work out why vCloud Director cannot prepare the host. From the log they can see the agent is actually installed, but when it tries to report back to vCloud Director, the HTTP session keeps dropping out. I asked whether it could be a network issue; the answer was no. I have also replaced the embedded USB stick, but it made no difference. The VMware tech told me he will discuss it with their senior engineers next week. I hope they can come up with something useful this time.

Lensar
VMware Employee

I have the same problem. Did you get any update?

aha_tom
Enthusiast

I ended up downgrading the host back to 4.1, and afterwards the prepare went smoothly. VMware support suggested the following steps if I want to use ESXi 5.1. I have not tested them:

  1. Took an ESXi 5.1 host and downloaded the vcloud director agent from VCD cell from /opt/vmware/vcloud-director/agent/vcloudagent-esx51-5.1.0-799577.vib
  2. Manually installed it using the following command on the esxi host: esxcli software vib install -v=/vmfs/volumes/10.131.0.200-Local/vcloudagent-esx51-5.1.0-799577.vib
  3. Post install, I stopped the VCD service and made the following changes in the db: Update managed_server set is_prepared=1 where display_name like '%<hostname>%';
  4. Restarted the service and the Host should show up as prepared.

VMware Support gave the following reason for why it failed on ESXi 5.1:

"The reason this issue is not seen in 4.1 and might be observed only in 5.1 with slower SD cards is that the newer bootbank takes a while to update the vibs and hence could possibly timeout with slower responding SD cards. Though you replaced the SD cards but the throughput/performance seems to be exactly as the old SD card."

The reason they gave does not make sense to me. Why would something that was working perfectly fine break after an upgrade? Isn't this some kind of bug? I have asked whether they have any patch or hotfix for this issue. Haven't got a response yet.

IamTHEvilONE
Immortal

As I mentioned in my first response, this is commonly due to local storage. I say that only because every time I resolved this problem, we did so by swapping out local storage.

In one case they were using a non-standard USB key (e.g. bought at Best Buy/Frys/etc.) that was not certified for the host by the hardware vendor. Replacing it with a fast, certified USB key resolved the problem.

In the other case it was an ESXi embedded host where the internal CF card was having issues. It was faster to re-install with boot from SAN to at least prove it would work, which it did. This technically excludes the ESXi embedded image as the problem, and generally pointed at the install/CF card as the issue.

aha_tom
Enthusiast

OK! The issue is now fully resolved. Guess what the problem was... The internal USB controller on the blade was set to USB 1.1 in the BIOS! I changed it to USB 2.0, everything now works twice as fast, and I was able to successfully prepare the host for vCloud.

Don't know why the BIOS setting had been changed just for this blade. Glad to have this sorted out though.
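
For a rough sense of why the controller mode matters, here is a back-of-the-envelope Python sketch. The 12 Mbit/s and 480 Mbit/s figures are the nominal USB 1.1 full-speed and USB 2.0 high-speed signalling rates. The payload size is purely hypothetical: the thread does not say how much data the agent install writes or what timeout vCloud Director applies. Real throughput is lower than these best-case numbers and is often limited by the stick or card itself, which would fit the roughly 2x improvement observed above rather than the theoretical 40x.

```python
# Nominal signalling rates in Mbit/s (best case; real throughput is lower).
USB_SIGNALLING_MBIT_S = {
    "USB 1.1 (full speed)": 12,
    "USB 2.0 (high speed)": 480,
}


def best_case_seconds(payload_mb: float, rate_mbit_s: float) -> float:
    """Seconds needed to move `payload_mb` megabytes at `rate_mbit_s`."""
    return payload_mb * 8 / rate_mbit_s


if __name__ == "__main__":
    payload_mb = 100  # hypothetical amount of data written to the boot device
    for mode, rate in USB_SIGNALLING_MBIT_S.items():
        print(f"{mode}: at least {best_case_seconds(payload_mb, rate):.1f} s")
```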

IamTHEvilONE
Immortal

Agreed on getting it sorted.

badgerx
Contributor

I'd just like to add to this post. I recently had a similar problem when deploying on vCD 5.5.5 and ESXi 5.5U3 hosts. The hosts were Dell R620s running on mirrored SD cards, which were oldish Kingston Class 10 cards. After some troubleshooting with VMware support we found the following in the vCD server logs (which we also saw in the web GUI):

java.net.SocketTimeoutException: Read timed out - Read timed out

And on the host itself, in /var/log/vmkernel.log, we could see the following repeating over and over while vcloud-agent was being installed:

2016-01-14T13:22:35.061Z cpu0:35224)DVFilter: 4310: Heartbeat from slow path #8 timed out after 600 ms
2016-01-14T13:22:40.059Z cpu0:32856)DVFilter: 4310: Heartbeat from slow path #8 timed out after 598 ms
2016-01-14T13:22:45.061Z cpu0:33481)DVFilter: 4310: Heartbeat from slow path #8 timed out after 600 ms
2016-01-14T13:22:50.061Z cpu0:32800)DVFilter: 4310: Heartbeat from slow path #8 timed out after 599 ms
2016-01-14T13:22:55.061Z cpu0:32800)DVFilter: 4310: Heartbeat from slow path #8 timed out after 599 ms
2016-01-14T13:23:00.062Z cpu0:32800)DVFilter: 4310: Heartbeat from slow path #8 timed out after 599 ms
2016-01-14T13:23:05.062Z cpu0:33481)DVFilter: 4310: Heartbeat from slow path #8 timed out after 600 ms
2016-01-14T13:23:10.062Z cpu0:33481)DVFilter: 4310: Heartbeat from slow path #8 timed out after 600 ms
2016-01-14T13:23:15.062Z cpu0:33563)DVFilter: 4310: Heartbeat from slow path #8 timed out after 601 ms
2016-01-14T13:23:20.060Z cpu0:123522)DVFilter: 4310: Heartbeat from slow path #8 timed out after 598 ms
2016-01-14T13:23:25.062Z cpu0:32800)DVFilter: 4310: Heartbeat from slow path #8 timed out after 599 ms
2016-01-14T13:23:30.060Z cpu0:32883)DVFilter: 4310: Heartbeat from slow path #8 timed out after 598 ms

The above error, according to the VMware tech, indicated a disk issue. We then replaced the SD cards with new (and slightly faster) Samsung MicroSDHC EVO, Class 10, UHS-I cards.

After a fresh load of ESXi we could deploy vcloud-agent successfully.
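
If you want to check quickly whether a host shows the same pattern, something like the following Python sketch can summarize the heartbeat timeouts from a copy of vmkernel.log. It is only an illustration: the regular expression is written against the exact line format quoted above and may need adjusting for other ESXi builds, and the function name and returned fields are my own.

```python
import re
from datetime import datetime

# Pattern modelled on the vmkernel.log lines quoted in this post.
HEARTBEAT_RE = re.compile(
    r"^(?P<ts>\S+Z) cpu\d+:\d+\)DVFilter: \d+: "
    r"Heartbeat from slow path #(?P<path>\d+) timed out after (?P<ms>\d+) ms"
)


def summarize_heartbeat_timeouts(lines):
    """Return count, time span and worst value of DVFilter heartbeat timeouts."""
    hits = []
    for line in lines:
        match = HEARTBEAT_RE.match(line.strip())
        if match:
            ts = datetime.strptime(match["ts"], "%Y-%m-%dT%H:%M:%S.%fZ")
            hits.append((ts, int(match["ms"])))
    if not hits:
        return None  # no matching lines found
    first, last = hits[0][0], hits[-1][0]
    return {
        "count": len(hits),
        "first": first,
        "last": last,
        "span_seconds": (last - first).total_seconds(),
        "max_ms": max(ms for _, ms in hits),
    }


if __name__ == "__main__":
    # Assumes the log was copied off the host to the current directory.
    with open("vmkernel.log", encoding="utf-8", errors="replace") as handle:
        print(summarize_heartbeat_timeouts(handle))
```

A steady stream of these messages every few seconds during the agent install, as in the excerpt above, is the pattern to look for. An occasional isolated entry is probably less telling.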
