VMware Cloud Community
ymichalak
Hot Shot

[vRO 8.6.x] - vRO cluster on 2 datacenters

Hello,

We have a 3-node cluster with all nodes on the same VLAN.

We are trying to remove one node from this cluster and add a new node that sits on a different VLAN in another datacenter.

We have opened these flows between the two VLANs:

[Image: ymichalak_0-1640772272458.png (screenshot of the opened network flows)]

Steps we followed:

- Update the custom certificate with a new SAN for the future node: vracli certificate ingress --sha256 7431e5f4c3xxxxxxxx --set vrodit_my_certificate.pem

- Update the deployment: /opt/scripts/deploy.sh

- Remove node 3 from the cluster: vracli cluster leave

- Add the new node to the cluster: vracli cluster join primary_node_hostname_or_IP

- Update the deployment: /opt/scripts/deploy.sh
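To make the order of operations easier to review, here is a minimal dry-run sketch of the sequence above. It only prints the commands and never runs them. The variable names (CERT_SHA256, CERT_FILE, PRIMARY) and the helper functions are hypothetical placeholders of mine; the commands themselves are exactly the ones listed in the steps.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the node replacement sequence, executes nothing.
# CERT_SHA256, CERT_FILE and PRIMARY are illustrative placeholders.
set -eu

CERT_SHA256="${CERT_SHA256:-CERT_SHA256_FINGERPRINT}"
CERT_FILE="${CERT_FILE:-vrodit_my_certificate.pem}"
PRIMARY="${PRIMARY:-primary_node_hostname_or_IP}"

STEP=0

plan_step() {
  # Print one numbered step; all arguments form the command line.
  STEP=$((STEP + 1))
  printf 'step %d: %s\n' "$STEP" "$*"
}

print_plan() {
  STEP=0
  plan_step vracli certificate ingress --sha256 "$CERT_SHA256" --set "$CERT_FILE"
  plan_step /opt/scripts/deploy.sh
  plan_step vracli cluster leave
  plan_step vracli cluster join "$PRIMARY"
  plan_step /opt/scripts/deploy.sh
}

print_plan
```

Running it with PRIMARY=vro-node1.example.com (a made-up hostname) prints the five steps with that value substituted, which is handy for pasting into a change request before touching the cluster.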

Below is the error we get:

+ timeout 300s bash -c wait_noop_pods
+ kubectl patch vaconfig prelude-vaconfig --type json -p '[{"op": "add", "path": "/spec/deploy/ready", "value": false}]'
vaconfig.prelude.vmware.com/prelude-vaconfig patched
+ vracli db pause-failover
2021-12-29 08:16:40,443 [ERROR] Error pausing failover agent on pod postgres-3: DEBUG: connecting to: "user=repmgr-db passfile=/run/repmgr-db.cred connect_timeout=10 dbname=repmgr-db host=postgres-3.postgres.prelude.svc.cluster.local keepalives=1 fallback_application_name=repmgr options=-csearch_path="
ERROR: connection to database failed
DETAIL:
could not connect to server: Connection refused
Is the server running on host "postgres-3.postgres.prelude.svc.cluster.local" (10.244.3.6) and accepting
TCP/IP connections on port 5432?

DETAIL: attempted to connect using:
user=repmgr-db passfile=/run/repmgr-db.cred connect_timeout=10 dbname=repmgr-db host=postgres-3.postgres.prelude.svc.cluster.local keepalives=1 fallback_application_name=repmgr options=-csearch_path=
command terminated with exit code 6
NoneType: None
Error pausing failover agent on pod postgres-3: DEBUG: connecting to: "user=repmgr-db passfile=/run/repmgr-db.cred connect_timeout=10 dbname=repmgr-db host=postgres-3.postgres.prelude.svc.cluster.local keepalives=1 fallback_application_name=repmgr options=-csearch_path="
ERROR: connection to database failed
DETAIL:
could not connect to server: Connection refused
Is the server running on host "postgres-3.postgres.prelude.svc.cluster.local" (10.244.3.6) and accepting
TCP/IP connections on port 5432?

DETAIL: attempted to connect using:
user=repmgr-db passfile=/run/repmgr-db.cred connect_timeout=10 dbname=repmgr-db host=postgres-3.postgres.prelude.svc.cluster.local keepalives=1 fallback_application_name=repmgr options=-csearch_path=
command terminated with exit code 6

++ vracli load-balancer
+ FQDN=vrodit.mydomaine.com
+ '[' vrodit.mydomaine.com == '' ']'
+ '[' true = true ']'
+ INGRESS_URL=https://vrodit.mydomaine.com
+ vracli service status --set-config service.status.cache.lifetime=3600
+ log_stage 'Tear down existing deployment'
+ set +x
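Since the trace shows "Connection refused" on port 5432, a quick TCP probe can help tell a refused connection from one that silently times out (a timeout usually points to a filtered flow, a refusal to nothing listening on that port). The sketch below is only an illustration: probe_port is a hypothetical helper, and it assumes bash with /dev/tcp support and the timeout command, both of which appear in the trace above. It has to be run from somewhere that can route to the pod network.

```shell
#!/usr/bin/env bash
# probe_port HOST PORT [TIMEOUT_SECONDS]
# Tries to open a TCP connection and prints a one-line verdict.
# Hypothetical helper; relies on bash's /dev/tcp and coreutils timeout.
probe_port() {
  host="$1"
  port="$2"
  wait_s="${3:-3}"
  rc=0
  # In the child bash, $0 is the host and $1 is the port.
  timeout "$wait_s" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null || rc=$?
  case "$rc" in
    0)   echo "$host:$port open" ;;
    124) echo "$host:$port timed out" ;;   # timeout(1) exits 124 when it kills the command
    *)   echo "$host:$port refused or unreachable" ;;
  esac
}
```

For example, probe_port 10.244.3.6 5432 checks the pod IP reported in the error, and the same function can be pointed at each flow opened between the two VLANs.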

If you have any ideas, please share them 😉

1 Reply
ymichalak
Hot Shot

Has anyone on this forum deployed a vRO cluster across two datacenters?

Any feedback would be much appreciated 😉
