Re: [vRO 8.6.x] - vRO cluster on 2 datacenters

ymichalak · ‎12-29-2021

Hello,

We have a cluster of 3 nodes on the same VLAN.

We are trying to remove one node from this cluster and add a new node from another VLAN in another datacenter.

We have opened these flows between the two VLANs:

Steps we followed:

- Update the custom certificate with a new SAN for the future node: vracli certificate ingress --sha256 7431e5f4c3xxxxxxxx --set vrodit_my_certificate.pem

- Update the deployment: /opt/scripts/deploy.sh

- Remove node 3 from the cluster: vracli cluster leave

- Add the new node to the cluster: vracli cluster join primary_node_hostname_or_IP

- Update the deployment: /opt/scripts/deploy.sh
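
For reference, the steps above can be written as a small dry-run helper. This is only a sketch: the commands are the ones listed above, but the function names (run, replace_node_step), the DRY_RUN switch, and the comments about which node each command runs on are my assumptions, not something taken from the product documentation.

```shell
#!/usr/bin/env bash
# Sketch of the node replacement sequence listed above.
# Hypothetical helpers: run, replace_node_step. DRY_RUN is an assumed switch.
# The values below are the placeholders from the post, not real ones.

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the command, 0 = execute it
PRIMARY="${PRIMARY:-primary_node_hostname_or_IP}"
CERT_SHA256="${CERT_SHA256:-7431e5f4c3xxxxxxxx}"
CERT_FILE="${CERT_FILE:-vrodit_my_certificate.pem}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# One step per call, because the steps do not all run on the same node.
replace_node_step() {
  case "$1" in
    cert)   # assumed to run on the primary node
      run vracli certificate ingress --sha256 "$CERT_SHA256" --set "$CERT_FILE" ;;
    deploy) # assumed to run on the primary node
      run /opt/scripts/deploy.sh ;;
    leave)  # assumed to run on the node being removed (node 3)
      run vracli cluster leave ;;
    join)   # assumed to run on the new node
      run vracli cluster join "$PRIMARY" ;;
    *)
      echo "unknown step: $1" >&2
      return 2 ;;
  esac
}
```

With the default DRY_RUN=1, `replace_node_step leave` only prints `+ vracli cluster leave`, so the order (cert, deploy, leave, join, deploy) can be reviewed before anything is executed.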

Here is the error we get:

+ timeout 300s bash -c wait_noop_pods
+ kubectl patch vaconfig prelude-vaconfig --type json -p '[{"op": "add", "path": "/spec/deploy/ready", "value": false}]'
vaconfig.prelude.vmware.com/prelude-vaconfig patched
+ vracli db pause-failover
2021-12-29 08:16:40,443 [ERROR] Error pausing failover agent on pod postgres-3: DEBUG: connecting to: "user=repmgr-db passfile=/run/repmgr-db.cred connect_timeout=10 dbname=repmgr-db host=postgres-3.postgres.prelude.svc.cluster.local keepalives=1 fallback_application_name=repmgr options=-csearch_path="
ERROR: connection to database failed
DETAIL:
could not connect to server: Connection refused
Is the server running on host "postgres-3.postgres.prelude.svc.cluster.local" (10.244.3.6) and accepting
TCP/IP connections on port 5432?

DETAIL: attempted to connect using:
user=repmgr-db passfile=/run/repmgr-db.cred connect_timeout=10 dbname=repmgr-db host=postgres-3.postgres.prelude.svc.cluster.local keepalives=1 fallback_application_name=repmgr options=-csearch_path=
command terminated with exit code 6
NoneType: None
Error pausing failover agent on pod postgres-3: DEBUG: connecting to: "user=repmgr-db passfile=/run/repmgr-db.cred connect_timeout=10 dbname=repmgr-db host=postgres-3.postgres.prelude.svc.cluster.local keepalives=1 fallback_application_name=repmgr options=-csearch_path="
ERROR: connection to database failed
DETAIL:
could not connect to server: Connection refused
Is the server running on host "postgres-3.postgres.prelude.svc.cluster.local" (10.244.3.6) and accepting
TCP/IP connections on port 5432?

DETAIL: attempted to connect using:
user=repmgr-db passfile=/run/repmgr-db.cred connect_timeout=10 dbname=repmgr-db host=postgres-3.postgres.prelude.svc.cluster.local keepalives=1 fallback_application_name=repmgr options=-csearch_path=
command terminated with exit code 6

++ vracli load-balancer
+ FQDN=vrodit.mydomaine.com
+ '[' vrodit.mydomaine.com == '' ']'
+ '[' true = true ']'
+ INGRESS_URL=https://vrodit.mydomaine.com
+ vracli service status --set-config service.status.cache.lifetime=3600
+ log_stage 'Tear down existing deployment'
+ set +x
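
One check that can be run from a cluster node is whether the address and port from the error message answer at all. A minimal sketch, assuming bash with /dev/tcp support and the coreutils timeout command are available on the appliance, and that the pod IP is routable from the node where it is run; check_port is a hypothetical helper name, not a vracli or kubectl command.

```shell
#!/usr/bin/env bash
# check_port is a hypothetical helper, not part of vracli or kubectl.
# It uses bash's /dev/tcp redirection and the coreutils `timeout` command.
check_port() {
  local host="$1" port="$2" secs="${3:-5}" rc=0
  # Try to open a TCP connection; the child shell exits non-zero on failure.
  timeout "$secs" bash -c ': >/dev/tcp/"$0"/"$1"' "$host" "$port" 2>/dev/null || rc=$?
  case "$rc" in
    0)   echo "open" ;;     # TCP handshake completed
    124) echo "timeout" ;;  # no answer within $secs seconds
    *)   echo "closed" ;;   # connection refused or host unreachable
  esac
}

# Example with the pod IP and port from the error message:
# check_port 10.244.3.6 5432
```

A result of "closed" would be consistent with the "Connection refused" in the log, where the address answers but nothing accepts connections on 5432. A result of "timeout" would point more toward traffic being dropped between the two VLANs, although a firewall that rejects connections can also produce "closed".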

If you have any ideas, please let me know 😉

ymichalak · ‎01-26-2022

Has anyone on this forum deployed a vRO cluster across two datacenters?

Any feedback would be much appreciated.

😉
