Hello,
We have a cluster of 3 nodes on the same VLAN.
We are trying to remove one node from this cluster and add a new node from another VLAN in another datacenter.
We have opened these flows between the two VLANs:
Steps we followed:
- Update the custom certificate with a new SAN for the future node: vracli certificate ingress --sha256 7431e5f4c3xxxxxxxx --set vrodit_my_certificate.pem
- Update the deployment: /opt/scripts/deploy.sh
- Remove node 3 from the cluster: vracli cluster leave
- Add the new node to the cluster: vracli cluster join primary_node_hostname_or_IP
- Update the deployment: /opt/scripts/deploy.sh
Below is our error:
+ timeout 300s bash -c wait_noop_pods
+ kubectl patch vaconfig prelude-vaconfig --type json -p '[{"op": "add", "path": "/spec/deploy/ready", "value": false}]'
vaconfig.prelude.vmware.com/prelude-vaconfig patched
+ vracli db pause-failover
2021-12-29 08:16:40,443 [ERROR] Error pausing failover agent on pod postgres-3: DEBUG: connecting to: "user=repmgr-db passfile=/run/repmgr-db.cred connect_timeout=10 dbname=repmgr-db host=postgres-3.postgres.prelude.svc.cluster.local keepalives=1 fallback_application_name=repmgr options=-csearch_path="
ERROR: connection to database failed
DETAIL:
could not connect to server: Connection refused
Is the server running on host "postgres-3.postgres.prelude.svc.cluster.local" (10.244.3.6) and accepting
TCP/IP connections on port 5432?
DETAIL: attempted to connect using:
user=repmgr-db passfile=/run/repmgr-db.cred connect_timeout=10 dbname=repmgr-db host=postgres-3.postgres.prelude.svc.cluster.local keepalives=1 fallback_application_name=repmgr options=-csearch_path=
command terminated with exit code 6
NoneType: None
Error pausing failover agent on pod postgres-3: DEBUG: connecting to: "user=repmgr-db passfile=/run/repmgr-db.cred connect_timeout=10 dbname=repmgr-db host=postgres-3.postgres.prelude.svc.cluster.local keepalives=1 fallback_application_name=repmgr options=-csearch_path="
ERROR: connection to database failed
DETAIL:
could not connect to server: Connection refused
Is the server running on host "postgres-3.postgres.prelude.svc.cluster.local" (10.244.3.6) and accepting
TCP/IP connections on port 5432?
DETAIL: attempted to connect using:
user=repmgr-db passfile=/run/repmgr-db.cred connect_timeout=10 dbname=repmgr-db host=postgres-3.postgres.prelude.svc.cluster.local keepalives=1 fallback_application_name=repmgr options=-csearch_path=
command terminated with exit code 6
++ vracli load-balancer
+ FQDN=vrodit.mydomaine.com
+ '[' vrodit.mydomaine.com == '' ']'
+ '[' true = true ']'
+ INGRESS_URL=https://vrodit.mydomaine.com
+ vracli service status --set-config service.status.cache.lifetime=3600
+ log_stage 'Tear down existing deployment'
+ set +x
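Since the log shows "Connection refused" for postgres-3 on port 5432, a basic TCP check might help tell a closed flow from a database that is simply not running. Below is a minimal sketch in Python, not something taken from the vRO tooling: the function name can_connect and the 5-second timeout are our own assumptions, and the service name from the log only resolves from inside the cluster's pod network.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper for illustration only; it is not part of vracli.
    """
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False
```

For example, can_connect("postgres-3.postgres.prelude.svc.cluster.local", 5432) could be run from a place where that name resolves. A False result does not say which of the three failure cases occurred, so it only narrows the problem down.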
If you have any ideas, please let us know 😉
Has anyone on this forum deployed a vRO cluster across two datacenters?
Any feedback would be very much appreciated.
😉