I tried to do a fresh install of VIO 4.0.
The deployment process stops with the error: "Task execution failed: /var/lib/vio/ansible/site.yml failed because the following nodes were unreachable: [u'172.31.93.18']."
"172.31.93.18" is the Loadbalancer0 IP. I tried to ping the the IP from the OMS with success.
/var/log/jarvis/jarvis.log:
2017-11-13 12:40:34,368 INFO [pecan.commands.serve][MainThread] "GET /tasks/task-7f30071a-e419-4eac-bd42-6a504a9023b3 HTTP/1.1" 200 302
2017-11-13 12:40:35,417 ERROR [jarvis.ans.manager][Thread-5] Error running playbook /var/lib/vio/ansible/site.yml
2017-11-13 12:40:35,418 ERROR [jarvis.ans.task][Thread-5] task-7f30071a-e419-4eac-bd42-6a504a9023b3 Failed. Reason=/var/lib/vio/ansible/site.yml failed because the following nodes were unreachable: [u'172.31.93.18'].
I tried a "retry" and a fresh deploy, but without success.
Are there any better logs to check? What else can I do?
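For reference, the failing check can be reproduced by hand from the OMS with an ad-hoc Ansible ping and a verbose playbook run (a sketch; the inventory path is an assumption, adjust it to wherever VIO keeps its inventory):

# Ad-hoc reachability check against the failing node (inventory path assumed):
ansible 172.31.93.18 -i /var/lib/vio/ansible/inventory -m ping
# Re-run the playbook against only that node, with verbose SSH errors:
ansible-playbook /var/lib/vio/ansible/site.yml -i /var/lib/vio/ansible/inventory --limit 172.31.93.18 -vvv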
I have now tried a compact mode installation: same issue. But if I move all VMs to one ESXi host, the installation works, so maybe this is related to the VDS link between the ESXi hosts?
I also created a new VDS and tried again, but with the same result.
I see that VIO creates a new portgroup on the VDS with the DHCP server VMs attached. What kind of traffic goes over this port: tagged or untagged? Maybe the physical switch rejects this kind of traffic?
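One way to check this is to capture at the uplink of the ESXi host carrying the DHCP server VM and look for 802.1Q tags (a sketch; vmnic0 and the output path are examples):

# On the ESXi host: capture transmitted frames at the uplink (--dir 1 = Tx):
pktcap-uw --uplink vmnic0 --dir 1 -o /tmp/uplink.pcap
# Inspect the capture elsewhere; -e prints link-level headers, so tagged
# frames show their VLAN ID. No output means the frames left untagged:
tcpdump -r /tmp/uplink.pcap -e vlan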
It looks like a configuration problem.
Do you use an NSX deployment or a VDS deployment?
It's a VDS deployment.
I have now tried to collect the logs, but even this fails:
# viocli deployment getlogs
[100 %] [###################################################]
/var/lib/vio/ansible/support-db.yml failed on the following nodes: [u'172.31.93.11', u'172.31.93.12', u'172.31.93.13']
Warning: Could not load logs for role: db
but an SSH connection works:
root@ids-vio-1:~# ssh 172.31.93.11
The authenticity of host '172.31.93.11 (172.31.93.11)' can't be established.
ECDSA key fingerprint is SHA256:XUKtm5vpf3UlJ+Vrc2u5XvEiMGbLrndu4hy6fddLGO0.
Are you sure you want to continue connecting (yes/no)? ^C
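For completeness, a quick non-interactive check of all three db nodes (a sketch; VIO's Ansible may use a different user and key than my interactive root session):

for ip in 172.31.93.11 172.31.93.12 172.31.93.13; do
  ssh -o BatchMode=yes -o StrictHostKeyChecking=no -o ConnectTimeout=5 root@"$ip" true \
    && echo "$ip reachable" || echo "$ip unreachable"
done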
Any ideas?
After digging deeper into this, I found the issue: we had to set the management VLAN to "traditional forwarding" on the Cisco switch.
With this, the connection from the management VM was always available, not just after an initial packet from the source VM.
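For anyone hitting the same symptom: what "traditional forwarding" looks like depends on the switch platform. As a hedged sketch only, on an NX-OS switch running FabricPath the equivalent is putting the VLAN back into classical Ethernet mode (the VLAN ID is an example; check your platform's documentation):

! Sketch, assuming NX-OS with FabricPath; VLAN 93 is an example ID.
! "mode ce" returns the VLAN to classical (traditional) Ethernet forwarding.
vlan 93
  mode ce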
And because no one could tell me this, here are some interesting logs on the management host:
/var/log/osvmw/osvmw.log
/var/log/oms/oms.log
/var/log/jarvis/jarvis.log
/var/log/column/ansible.log