Hello,
I get a provisioning error at the end of the OpenStack deployment. It seems that all nodes are deployed, but they all have the state "Bootstrap failed".
The error in the VC is:
"Task execution failed: Task failed on the following nodes: ['172.31.93.15', '172.31.93.14']. Refer logs for more details.."
I logged in to the management station and found a similar entry in the log, but nothing with more detail:
/var/log/oms/oms.log:
[2015-06-04T10:25:44.544+0000] ERROR tomcat-http--21| com.vmware.openstack.manager.JobManager: mark task as failed: Task execution failed: Task failed on the following nodes: ['172.31.93.15', '172.31.93.14']. Refer logs for more details.
Then I logged in to one of the failed nodes, but I didn't find anything useful.
Any hints? Where are the interesting logs, especially on the controller[01+02] nodes?
Thanks
Hi Dalo,
Can you please attach the /var/log/jarvis/ansible.log file from the management server? It should tell us more about what failed during the provisioning phase.
Best Regards,
Karol
Please log in to the Management server and run
$>viogetlogs
then please upload the log file.
Regards,
Yixing
Hi,
Thank you very much for your help. In the /var/log/jarvis/ansible.log I could see the following entry:
2015-06-08 06:21:16,985 p=303 u=jarvis | TASK: [config-local | import image into glance] *******************************
2015-06-08 06:22:34,328 p=303 u=jarvis | failed: [172.31.93.14] => {"failed": true}
2015-06-08 06:22:34,328 p=303 u=jarvis | msg: Error in creating image: Error communicating with http://172.31.93.10:9292 [Errno 32] Broken pipe
2015-06-08 06:22:34,329 p=303 u=jarvis | FATAL: all hosts have already failed -- aborting
It seems that 172.31.93.10 is the load balancer0. I can ping all of its IPs from the management server, and the external IP from outside.
Best Regards
Daniel
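For anyone reading along: the failing task can be pulled out of /var/log/jarvis/ansible.log with a short script rather than by scrolling. The following is only a rough sketch. It assumes the log keeps the line format shown in the excerpt above ("TASK: [...]", "failed: [host]", "msg: ..."), and the function name find_failures is made up for illustration.

```python
import re

# Patterns modelled on the ansible.log excerpt quoted in this thread.
# They are assumptions about the line format, not an official log schema.
TASK_RE = re.compile(r"\|\s*TASK: \[(?P<task>.+?)\]")
FAILED_RE = re.compile(r"\|\s*failed: \[(?P<host>[^\]]+)\]")
MSG_RE = re.compile(r"\|\s*msg: (?P<msg>.*)")


def find_failures(lines):
    """Return a list of (task, host, msg) tuples, one per failed host.

    `lines` is any iterable of log lines (an open file works).
    `msg` is None when no "msg:" line follows the failure.
    """
    failures = []
    current_task = None
    for raw in lines:
        line = raw.rstrip("\n")
        match = TASK_RE.search(line)
        if match:
            # Remember the most recent task so failures can be attributed to it.
            current_task = match.group("task")
            continue
        match = FAILED_RE.search(line)
        if match:
            failures.append([current_task, match.group("host"), None])
            continue
        match = MSG_RE.search(line)
        if match and failures and failures[-1][2] is None:
            # Attach the first message that follows a failure line.
            failures[-1][2] = match.group("msg")
    return [tuple(entry) for entry in failures]
```

To use it, open the log file and pass the file object to find_failures, then print each tuple. For the excerpt above it should report the "import image into glance" task on 172.31.93.14 together with the broken pipe message.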
Hi Daniel,
We've seen this issue before, but it can happen for many reasons, including intermittent network issues. Could you retry the deployment? Just right-click the deployment in the UI, choose "Edit OpenStack Deployment", and go through the wizard again.
If it fails again with a broken pipe, please run the command Yixing provided above (viogetlogs) and attach the resulting files. They also include the glance logs, so we can debug it further.
Best Regards,
Karol
Hi Karol,
I tried it a few times, with the same result.
The logs are attached.
Thank you for your help,
Daniel
Hi Karol and Yixing,
Did you find anything helpful in the logs?
I'm still stuck on this.
Thanks, Daniel
Hello,
Disclaimer: I haven't looked at the log and I am taking an educated guess.
Check the datastore that you provisioned for Glance. Make sure that it has free space, that it is reachable by the hosts in the Mgmt Cluster, and that it is enabled for read/write.
It seems like the last step, which tries to upload the bundled Ubuntu image, is failing. If you don't find anything wrong with the datastore provisioned for Glance, then try changing it to another datastore available to the Mgmt Cluster.
Let me know if those steps help.
arvind
The glance log shows the following:
2015-06-10 06:23:22.187 11093 WARNING glance.store.vmware_datastore [c525af00-4990-4d45-a58e-7ddbff8c546b 3720695fd95044ce807ea9c490159ae9 78b2bae6e6dd444db2c0b28995f01e46 - - -] Unable to connect to the ESX(i) Host esx-tst-4.ethz.ch.
Got IOError Error [Errno 110] ETIMEDOUT
2015-06-10 06:23:22.199 11093 ERROR glance.store.vmware_datastore [c525af00-4990-4d45-a58e-7ddbff8c546b 3720695fd95044ce807ea9c490159ae9 78b2bae6e6dd444db2c0b28995f01e46 - - -] Communication error sending http PUT request to the server 192.168.37.99.
Can you please check whether the ESXi host esx-tst-4.ethz.ch is reachable?
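Note that a successful ping does not prove the host is reachable for the image upload: the ETIMEDOUT above comes from an HTTP PUT, so it is the TCP port that matters. A minimal sketch of such a check, to be run from a controller node, is below. The port number 443 is an assumption (the usual HTTPS port on ESXi), and the helper name tcp_reachable is made up for illustration.

```python
import socket


def tcp_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Returns False on timeout, connection refused, or name resolution failure,
    all of which are raised as subclasses of OSError.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example call (host name taken from the log above; port 443 is an assumption):
#   tcp_reachable("esx-tst-4.ethz.ch", 443)
```

If this returns False while ping works, something between the controller and the ESXi host (for example a firewall rule) is probably dropping the TCP traffic.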
Hello Arvind,
Thank you.
I have two datastores with 1 TB each. All three hosts in the Management Cluster have read/write access to both datastores.
I have now tried separating the nova and glance datastores, but with the same result. The wizard deploys 15 VMs, does some other work, and then stops with the error.
My ESXi hosts are on 6.0, but that should be OK.
Is there a log file with more info about the deployment process?
Daniel
That was helpful, thank you!
All the ESX hosts are reachable, but they had local firewall rules configured.
So I disabled the local firewalls completely, and the deployment now works.
Thank you all.