I tried to do a fresh install of VIO 4.0.
The deployment process stops with the error: "Task execution failed: /var/lib/vio/ansible/site.yml failed because the following nodes were unreachable: [u'172.31.93.18']."
"172.31.93.18" is the Loadbalancer0 IP. I tried to ping the the IP from the OMS with success.
/var/log/jarvis/jarvis.log:
2017-11-13 12:40:34,368 INFO [pecan.commands.serve][MainThread] "GET /tasks/task-7f30071a-e419-4eac-bd42-6a504a9023b3 HTTP/1.1" 200 302
2017-11-13 12:40:35,417 ERROR [jarvis.ans.manager][Thread-5] Error running playbook /var/lib/vio/ansible/site.yml
2017-11-13 12:40:35,418 ERROR [jarvis.ans.task][Thread-5] task-7f30071a-e419-4eac-bd42-6a504a9023b3 Failed. Reason=/var/lib/vio/ansible/site.yml failed because the following nodes were unreachable: [u'172.31.93.18'].
I tried a "retry" and a fresh deploy, but without success.
Are there any better logs to check? What else can I do?
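For reference, the failing check can be reproduced by hand from the OMS with an ad-hoc Ansible ping and a verbose playbook run (a sketch; the inventory path is an assumption, adjust it to wherever VIO keeps its inventory):

# Ad-hoc reachability check against the failing node (inventory path assumed):
ansible 172.31.93.18 -i /var/lib/vio/ansible/inventory -m ping
# Re-run the playbook against only that node, with verbose SSH errors:
ansible-playbook /var/lib/vio/ansible/site.yml -i /var/lib/vio/ansible/inventory --limit 172.31.93.18 -vvv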
I have now tried a compact mode installation: same issue. But if I move all VMs to one ESXi host, the installation works, so maybe this is related to the VDS link between the ESXi hosts?
I also created a new VDS and tried again, but with the same result.
I see that VIO creates a new portgroup on the VDS with the DHCP server VMs attached. What kind of traffic goes over this port: tagged or untagged? Maybe the physical switch rejects this kind of traffic?
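One way to check this is to capture at the uplink of the ESXi host carrying the DHCP server VM and look for 802.1Q tags (a sketch; vmnic0 and the output path are examples):

# On the ESXi host: capture transmitted frames at the uplink (--dir 1 = Tx):
pktcap-uw --uplink vmnic0 --dir 1 -o /tmp/uplink.pcap
# Inspect the capture elsewhere; -e prints link-level headers, so tagged
# frames show their VLAN ID. No output means the frames left untagged:
tcpdump -r /tmp/uplink.pcap -e vlan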
It looks like a configuration problem.
Do you use an NSX deployment or a VDS deployment?
It's a VDS deployment.
I have now tried to collect the logs, but even this fails:
# viocli deployment getlogs
[100 %] [###################################################]
/var/lib/vio/ansible/support-db.yml failed on the following nodes: [u'172.31.93.11', u'172.31.93.12', u'172.31.93.13']
Warning: Could not load logs for role: db
but an SSH connection works:
root@ids-vio-1:~# ssh 172.31.93.11
The authenticity of host '172.31.93.11 (172.31.93.11)' can't be established.
ECDSA key fingerprint is SHA256:XUKtm5vpf3UlJ+Vrc2u5XvEiMGbLrndu4hy6fddLGO0.
Are you sure you want to continue connecting (yes/no)? ^C
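For completeness, a quick non-interactive check of all three db nodes (a sketch; VIO's Ansible may use a different user and key than my interactive root session):

for ip in 172.31.93.11 172.31.93.12 172.31.93.13; do
  ssh -o BatchMode=yes -o StrictHostKeyChecking=no -o ConnectTimeout=5 root@"$ip" true \
    && echo "$ip reachable" || echo "$ip unreachable"
done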
Any ideas?
After digging deeper into this, I found the issue: we had to set the management VLAN to "traditional forwarding" on the Cisco switch.
With this, the connection from the management VM was always available, not just after an initial packet from the source VM.
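For anyone hitting the same symptom: what "traditional forwarding" looks like depends on the switch platform. As a hedged sketch only, on an NX-OS switch running FabricPath the equivalent is putting the VLAN back into classical Ethernet mode (the VLAN ID is an example; check your platform's documentation):

! Sketch, assuming NX-OS with FabricPath; VLAN 93 is an example ID.
! "mode ce" returns the VLAN to classical (traditional) Ethernet forwarding.
vlan 93
  mode ce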
And because no one could tell me this, here are some interesting logs on the management host:
/var/log/osvmw/osvmw.log
/var/log/oms/oms.log
/var/log/jarvis/jarvis.log
/var/log/column/ansible.log