vco-app-XXXX 8.1 pods - CrashLoopBackOff

Magicmanashu · ‎08-04-2020

When I run deploy.sh on vRA 8.1, the vco-app pods end up in the following state:

vco-app-66dc7fdc98-72q8s 2/3 CrashLoopBackOff 3 8m3s

vco-app-66dc7fdc98-7nbjz 3/3 Running 0 8m3s

vco-app-66dc7fdc98-qdb2p 2/3 PostStartHookError: command 'sh -c sh /opt/preload-images.sh || true' exited with 137: + cd /opt/base-images

+ ls photon-3.0.tar.gz shared.tar.gz vco-polyglot-node-12.12.0.tar.gz vco-polyglot-powercli-11.5.0-powershell-6.2.3.tar.gz vco-polyglot-python-3.7.3.tar.gz

+ xargs -n 1 -I '{}' sh -c 'basename {} .tar.gz | xargs mkdir'

+ + lsxargs photon-3.0.tar.gz -n shared.tar.gz 1 vco-polyglot-node-12.12.0.tar.gz -I vco-polyglot-powercli-11.5.0-powershell-6.2.3.tar.gz '{}' vco-polyglot-python-3.7.3.tar.gz sh

-c 'basename {} .tar.gz | xargs tar zxvf {} -C'

Please help with the same.

maverix7 · ‎08-05-2020

This has been fixed in 8.1 P1, so please consider upgrading. As an immediate workaround, you can SSH to the node on which the pod is crashing and run:

rm -rf /data/vco/var/run/vco-polyglot-runner-sock/docker*

rm -rf /data/vco/var/run/vco-polyglot-runner-sock/xtables.lock

On the next pod start, everything should be OK.
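For anyone who has to repeat this on more than one node, the two rm commands above can be wrapped in a small function. This is only a sketch: the function name clean_polyglot_sock and the optional directory argument are made up for illustration, and the default path is the one quoted in this reply.

```shell
# Hypothetical helper (not shipped with vRA): removes the stale docker* files
# and xtables.lock from the polyglot runner socket directory.
# The directory argument is an illustrative addition; the default is the
# path given in the reply above.
clean_polyglot_sock() {
  dir="${1:-/data/vco/var/run/vco-polyglot-runner-sock}"
  if [ ! -d "$dir" ]; then
    echo "directory not found: $dir" >&2
    return 1
  fi
  # -f keeps rm quiet when nothing matches
  rm -rf "$dir"/docker* "$dir"/xtables.lock
}

# Usage on the affected node:
#   clean_polyglot_sock
```

Run it on each node where a vco-app pod is failing, then let the pod restart.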

Magicmanashu · ‎08-05-2020

I tried the steps you mentioned (without upgrading), but I still get the error below:

vco-app-5f55f796-6drzw 1/3 PostStartHookError: command 'sh -c sh /opt/preload-images.sh || true' exited with 137: + cd /opt/base-images

+ ls photon-3.0.tar.gz shared.tar.gz vco-polyglot-node-12.12.0.tar.gz vco-polyglot-powercli-11.5.0-powershell-6.2.3.tar.gz vco-polyglot-python-3.7.3.tar.gz

+ xargs -n 1 -I '{}' sh -c 'basename {} .tar.gz | xargs mkdir'

+ xargs -n 1+ -I '{}' sh -c 'basename {} .tar.gz | xargs tar zxvf {} -C'

ls photon-3.0.tar.gz shared.tar.gz vco-polyglot-node-12.12.0.tar.gz vco-polyglot-powercli-11.5.0-powershell-6.2.3.tar.gz vco-polyglot-python-3.7.3.tar.gz

+ rm photon-3.0.tar.gz shared.tar.gz vco-polyglot-node-12.12.0.tar.gz vco-polyglot-powercli-11.5.0-powershell-6.2.3.tar.gz vco-polyglot-python-3.7.3.tar.gz

+ i=0

+ '[' 0 -lt 30 ]

+ docker ps

Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

+ echo 'Waiting for Docker to be ready'

+ sleep 2

+ expr 0 + 1

+ i=1

+ '[' 1 -lt 30 ]

+ docker ps
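Reading the trace above, the post-start hook appears to poll `docker ps` up to 30 times with a 2-second pause between attempts. Exit code 137 is 128 + 9, which means the hook process received SIGKILL, so it was most likely killed while still waiting for the Docker daemon. The sketch below reconstructs that kind of retry loop from the trace alone; it is not the actual content of /opt/preload-images.sh, and the function name wait_for and its arguments are hypothetical.

```shell
# Hypothetical reconstruction of the retry pattern visible in the trace.
# wait_for MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
wait_for() {
  max="$1"
  delay="$2"
  shift 2
  i=0
  while [ "$i" -lt "$max" ]; do
    # Succeed as soon as the probe command works
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    echo 'Waiting for Docker to be ready'
    sleep "$delay"
    i=$(expr "$i" + 1)
  done
  # Probe never succeeded within max tries
  return 1
}

# Usage matching the values seen in the trace:
#   wait_for 30 2 docker ps
```

If the daemon never comes up, a loop like this keeps printing the waiting message until it runs out of tries or the container is killed.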

maverix7 · ‎08-05-2020

Can you double-check that you did it on the correct nodes? Based on what I see, I believe you need to do it on 2 of the 3. After that, you can delete the problematic vco pods so that they can be redeployed. If the issue persists, I would suggest opening an SR.
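To check which nodes are affected, one option is to filter the pod listing for vco-app pods whose READY count is incomplete. This is a rough sketch: the function name unhealthy_vco_pods is hypothetical, and the usage lines assume kubectl is available on the appliance and that the pods live in the prelude namespace.

```shell
# Hypothetical filter: reads `kubectl get pods` output on stdin and prints
# the vco-app lines whose READY column (e.g. 2/3) is not fully ready.
unhealthy_vco_pods() {
  awk '$1 ~ /^vco-app-/ { split($2, r, "/"); if (r[1] != r[2]) print }'
}

# Usage (assumes kubectl and the prelude namespace):
#   kubectl -n prelude get pods -o wide | unhealthy_vco_pods
#   kubectl -n prelude delete pod <pod-name>
```

With `-o wide` the printed lines also include the node each pod is scheduled on, which shows where to run the cleanup before deleting the pod.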

Magicmanashu · ‎08-05-2020

Yes, I did it on the correct nodes, and I also deleted the pods, but they keep failing. I'm not sure why this is happening on the 2nd and 3rd nodes; the primary node always succeeds. I will check with GSS.

gradinka · ‎08-11-2020

Have you opened the SR yet?

By any chance, are you using a 172.x.x.x network for the deployment?
