VMware Cloud Community
dalo
Hot Shot

Provisioning error

Hello,

I get a provisioning error at the end of the OpenStack deployment. It seems that all nodes are deployed, but they all have the state "Bootstrap failed".

The error in the VC is:

"Task execution failed: Task failed on the following nodes: ['172.31.93.15', '172.31.93.14']. Refer logs for more details.."


I logged in to the management station and found a similar entry in the log, but nothing more detailed:


/var/log/oms/oms.log:

[2015-06-04T10:25:44.544+0000] ERROR tomcat-http--21| com.vmware.openstack.manager.JobManager: mark task as failed: Task execution failed: Task failed on the following nodes: ['172.31.93.15', '172.31.93.14']. Refer logs for more details.

Then I logged in to one of the failed nodes, but I didn't find anything useful.

Any hints? Where are the interesting logs, especially on the controller[01+02] nodes?

Thanks


Accepted Solutions
yjia
VMware Employee

The log contains the following entries:

2015-06-10 06:23:22.187 11093 WARNING glance.store.vmware_datastore [c525af00-4990-4d45-a58e-7ddbff8c546b 3720695fd95044ce807ea9c490159ae9 78b2bae6e6dd444db2c0b28995f01e46 - - -] Unable to connect to the ESX(i) Host esx-tst-4.ethz.ch.

Got IOError Error [Errno 110] ETIMEDOUT

2015-06-10 06:23:22.199 11093 ERROR glance.store.vmware_datastore [c525af00-4990-4d45-a58e-7ddbff8c546b 3720695fd95044ce807ea9c490159ae9 78b2bae6e6dd444db2c0b28995f01e46 - - -] Communication error sending http PUT request to the server 192.168.37.99.

Can you please check whether the ESXi host esx-tst-4.ethz.ch is reachable?
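
A ping alone does not show whether the upload path works, because the Glance log reports a timeout on an HTTP PUT request, which needs a TCP connection to the host. The following is a minimal sketch of a TCP-level check in Python. The function name `tcp_reachable` is hypothetical, and port 443 is an assumption (the usual HTTPS port), not something the log confirms.

```python
import socket


def tcp_reachable(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Port 443 is an assumed default; adjust it to the port your setup uses.
    """
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts (such as Errno 110), refused connections,
        # and name-resolution failures.
        return False
```

For example, running `tcp_reachable("esx-tst-4.ethz.ch")` from one of the failed nodes would return False if a firewall silently drops the connection, even when the host still answers ping.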

10 Replies
KarolSte
Enthusiast

Hi Dalo,

Can you please attach the /var/log/jarvis/ansible.log file from the management server? It should tell us more about what failed during the provisioning phase.

Best Regards,

Karol

yjia
VMware Employee

Please log in to the management server and run

$>viogetlogs

then please upload the resulting log file.

Regards,

Yixing

dalo
Hot Shot

Hi,

Thank you very much for your help. In /var/log/jarvis/ansible.log I found the following entries:

2015-06-08 06:21:16,985 p=303 u=jarvis |  TASK: [config-local | import image into glance] *******************************

2015-06-08 06:22:34,328 p=303 u=jarvis |  failed: [172.31.93.14] => {"failed": true}

2015-06-08 06:22:34,328 p=303 u=jarvis |  msg: Error in creating image: Error communicating with http://172.31.93.10:9292 [Errno 32] Broken pipe

2015-06-08 06:22:34,329 p=303 u=jarvis |  FATAL: all hosts have already failed -- aborting
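
When the log is long, failures like the one above can be pulled out with a short script. The following Python sketch assumes the line layout seen in this excerpt (`TASK: [...]`, `failed: [host]`, `msg: ...`). The function name `find_failures` and the regular expressions are illustrative, not part of any VMware or Ansible tool.

```python
import re

# Patterns derived from the excerpt above; other Ansible versions may differ.
TASK_RE = re.compile(r"TASK: \[(?P<task>[^\]]+)\]")
FAILED_RE = re.compile(r"failed: \[(?P<host>[^\]]+)\]")
MSG_RE = re.compile(r"\|\s+msg: (?P<msg>.*)$")


def find_failures(lines):
    """Return one dict per failed host with its task name and error message."""
    failures = []
    current_task = None
    for line in lines:
        line = line.rstrip("\n")
        match = TASK_RE.search(line)
        if match:
            current_task = match.group("task")
            continue
        match = FAILED_RE.search(line)
        if match:
            failures.append(
                {"task": current_task, "host": match.group("host"), "msg": None}
            )
            continue
        match = MSG_RE.search(line)
        # Attach the message to the most recent failure that has none yet.
        if match and failures and failures[-1]["msg"] is None:
            failures[-1]["msg"] = match.group("msg")
    return failures


# Sample input copied from the log excerpt in this thread.
SAMPLE = [
    "2015-06-08 06:21:16,985 p=303 u=jarvis |  TASK: [config-local | import image into glance] ****",
    '2015-06-08 06:22:34,328 p=303 u=jarvis |  failed: [172.31.93.14] => {"failed": true}',
    "2015-06-08 06:22:34,328 p=303 u=jarvis |  msg: Error in creating image: "
    "Error communicating with http://172.31.93.10:9292 [Errno 32] Broken pipe",
    "2015-06-08 06:22:34,329 p=303 u=jarvis |  FATAL: all hosts have already failed -- aborting",
]

for failure in find_failures(SAMPLE):
    print(failure["host"], "|", failure["task"], "|", failure["msg"])
```

To run it against the real file, pass an open file object instead of `SAMPLE`, for example `find_failures(open("/var/log/jarvis/ansible.log"))`.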

It seems that 172.31.93.10 is the load balancer0. I can ping all of its IPs from the management server and the external IP from outside.

Best Regards

Daniel

KarolSte
Enthusiast

Hi Daniel,

We've seen this issue before, but it can happen for many reasons, including intermittent network issues. Could you retry the deployment? Right-click the deployment in the UI, choose "Edit OpenStack Deployment", and go through the wizard again.

If it fails again with a broken pipe, please run the command Yixing provided above (viogetlogs) and attach the resulting files. They also include the Glance logs, so we can debug it further.

Best Regards,

Karol

dalo
Hot Shot

Hi Karol,

I tried it a few times, with the same result.

The logs are attached.

Thank you for your help,

Daniel

dalo
Hot Shot

Hi Karol and Yixing,

Could you find something helpful in the logs?

I'm still stuck on this.

Thanks, Daniel

admin
Immortal

Hello,

Disclaimer: I haven't looked at the log and I am taking an educated guess.

Check the datastore that you provisioned for Glance. Make sure that it has free space, that it is reachable by the hosts in the Mgmt Cluster, and that it is enabled for read/write.

It seems like the last step, which tries to upload the bundled Ubuntu image, is failing. If you don't find anything wrong with the datastore provisioned for Glance, then try changing it to another datastore available to the Mgmt Cluster.

Let me know if those steps help.

arvind

dalo
Hot Shot

Hello Arvind,

Thank you.

I have two datastores with 1 TB each. All three hosts in the Management Cluster have read/write access to these two datastores.

I have now tried separating the Nova and Glance datastores, but with the same result. The wizard deploys 15 VMs, does some other work, and then stops with the error.

My ESXi hosts are on 6.0, but that should be OK.

Is there a log file with more info about the deployment process?

Daniel

dalo
Hot Shot

That was helpful, thank you!

All the ESXi hosts are reachable, but they had local firewall rules in place.

So I disabled the local firewalls completely, and the deployment now works.

Thank you all.
