On running deploy.sh on vRA 8.1, I get the output below:
vco-app-66dc7fdc98-72q8s 2/3 CrashLoopBackOff 3 8m3s
vco-app-66dc7fdc98-7nbjz 3/3 Running 0 8m3s
vco-app-66dc7fdc98-qdb2p 2/3 PostStartHookError: command 'sh -c sh /opt/preload-images.sh || true' exited with 137: + cd /opt/base-images
+ ls photon-3.0.tar.gz shared.tar.gz vco-polyglot-node-12.12.0.tar.gz vco-polyglot-powercli-11.5.0-powershell-6.2.3.tar.gz vco-polyglot-python-3.7.3.tar.gz
+ xargs -n 1 -I '{}' sh -c 'basename {} .tar.gz | xargs mkdir'
+ + lsxargs photon-3.0.tar.gz -n shared.tar.gz 1 vco-polyglot-node-12.12.0.tar.gz -I vco-polyglot-powercli-11.5.0-powershell-6.2.3.tar.gz '{}' vco-polyglot-python-3.7.3.tar.gz sh
-c 'basename {} .tar.gz | xargs tar zxvf {} -C'
Please help with this.
This has been fixed in 8.1 P1; please consider upgrading. As an immediate workaround, you can SSH to the node on which the pod is crashing and execute:
rm -rf /data/vco/var/run/vco-polyglot-runner-sock/docker*
rm -rf /data/vco/var/run/vco-polyglot-runner-sock/xtables.lock
On the next pod start, everything should be OK.
I tried the steps you mentioned (without upgrading), but I am still getting the error below:
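For anyone repeating this on several nodes, the two rm commands above can be wrapped in a small function. This is only a sketch: the function name clean_polyglot_sock is made up for illustration, and the default path is simply the one quoted in the commands above. It removes only the docker* entries and xtables.lock and leaves anything else in the directory alone.

```shell
#!/bin/sh
# Hypothetical helper (the name is illustrative, not part of vRA):
# removes the stale docker* files and xtables.lock from the polyglot
# runner socket directory, as in the two rm commands above.
clean_polyglot_sock() {
    # Default path is the one quoted in this thread; pass another as $1.
    sock_dir="${1:-/data/vco/var/run/vco-polyglot-runner-sock}"

    if [ ! -d "$sock_dir" ]; then
        echo "Directory not found: $sock_dir" >&2
        return 1
    fi

    # -f makes rm ignore entries that do not exist.
    rm -rf "$sock_dir"/docker* "$sock_dir"/xtables.lock
    echo "Cleaned $sock_dir"
}
```

Run clean_polyglot_sock (with no arguments) over SSH on each node where a vco-app pod is crashing, then let the pod restart.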
vco-app-5f55f796-6drzw 1/3 PostStartHookError: command 'sh -c sh /opt/preload-images.sh || true' exited with 137: + cd /opt/base-images
+ ls photon-3.0.tar.gz shared.tar.gz vco-polyglot-node-12.12.0.tar.gz vco-polyglot-powercli-11.5.0-powershell-6.2.3.tar.gz vco-polyglot-python-3.7.3.tar.gz
+ xargs -n 1 -I '{}' sh -c 'basename {} .tar.gz | xargs mkdir'
+ xargs -n 1+ -I '{}' sh -c 'basename {} .tar.gz | xargs tar zxvf {} -C'
ls photon-3.0.tar.gz shared.tar.gz vco-polyglot-node-12.12.0.tar.gz vco-polyglot-powercli-11.5.0-powershell-6.2.3.tar.gz vco-polyglot-python-3.7.3.tar.gz
+ rm photon-3.0.tar.gz shared.tar.gz vco-polyglot-node-12.12.0.tar.gz vco-polyglot-powercli-11.5.0-powershell-6.2.3.tar.gz vco-polyglot-python-3.7.3.tar.gz
+ i=0
+ '[' 0 -lt 30 ]
+ docker ps
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
+ echo 'Waiting for Docker to be ready'
+ sleep 2
+ expr 0 + 1
+ i=1
+ '[' 1 -lt 30 ]
+ docker ps
Can you double-check that you did it on the correct nodes? Based on what I see, I believe you should do it on 2 of the 3. After that, you can delete the problematic vco pods so that they can be redeployed. If the issue persists, I would suggest opening an SR.
Yes, I did it on the correct nodes, and I also deleted the pods, but they keep failing again and again. I'm not sure why this is happening on the 2nd and 3rd nodes; the primary node always succeeds. I will check with GSS.
Have you opened the SR yet?
By chance, are you using a 172.x.x.x network for the deployment?