VMware Cloud Community
MeenaJ
Contributor

Considering breaking ELM

We have two sites connected via NSX, with a vCenter at each site joined in Enhanced Linked Mode (ELM). If one site goes down, my understanding is that the vCenter at the other site should keep functioning properly. Is that correct? What about a prolonged outage of a week or more: will the vCenter at the surviving site start having issues because it cannot find its partner?

These are the things we worry about, and they are why we want to break ELM.

A follow-up question: we have used Advanced Cross vCenter vMotion (XVM) to move VMs between vCenters that were not linked and were in different SSO domains. We assume this would still work if we broke ELM and needed to move a VM on an NSX segment from one site to the other using XVM.

Thank you for any comments.


Accepted Solutions
GeoPerkins
Enthusiast

You are right to be worried about ELM and what may happen. We ran into this and found VMware's built-in resiliency for ELM to be severely lacking. In theory you can stop one vCenter in an ELM group, and when it comes back it will catch up. The problems start when more than one partner is down, when there are complex network problems, or when you restore a vCenter from backup (using either a snapshot or the built-in VCSA backup tool). In those cases ELM replication can get confused.

We learned that there is a procedure in an internal KB (available only to VMware support teams) that provides a scripted tool to re-sync broken ELM replication. You must open a case with VMware, and then they (if you get the right technical specialist) can direct you to download and run the script. It is unfortunate that this is not a published KB for general use. They do not like to air their dirty laundry. 🙂

How do you know if ELM replication is working?

1. Use SSH to connect to each VCSA in the ELM replication domain.

2. Verify that the normal VCSA services are running. If any are stopped, start them; that alone could fix the problem.

service-control --status

3. Display the ELM replication status: 

/usr/lib/vmware-vmdir/bin/vdcrepadmin -f showservers -h localhost -u administrator

/usr/lib/vmware-vmdir/bin/vdcrepadmin -f showpartners -h localhost -u administrator

/usr/lib/vmware-vmdir/bin/vdcrepadmin -f showpartnerstatus -h localhost -u administrator

FYI, administrator is the admin account of the ELM (SSO) domain, usually administrator@vsphere.local, and you'll need its password.
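If you want to script the check in step 3, here is a rough sketch rather than anything official. It assumes the showpartnerstatus output contains lines shaped like "Partner: <hostname>", "Host available: Yes" and "Partner is 0 changes behind." (that is my recollection of the format, so compare it with what your own appliance prints). The function name summarize_partner_status is made up for this example.

```shell
#!/bin/sh
# Summarize `vdcrepadmin -f showpartnerstatus` output read from stdin.
# ASSUMPTION: the output contains lines shaped like
#   Partner: vcsa2.example.com
#   Host available:   Yes
#   Partner is 0 changes behind.
# Lines that do not match these shapes are ignored.
summarize_partner_status() {
    awk '
        # Print one summary line for the partner collected so far.
        function report() {
            if (partner == "") return
            if (host_ok && behind == 0)
                print "OK " partner
            else
                print "WARN " partner " host_available=" (host_ok ? "Yes" : "No") " behind=" behind
        }
        { sub(/^[ \t]+/, "") }                      # tolerate indented output
        /^Partner:/ { report(); partner = $2; host_ok = 0; behind = "unknown" }
        /^Host available:/ { host_ok = ($NF == "Yes") }
        /^Partner is .* changes behind/ { behind = $3 + 0 }
        END { report() }
    '
}
```

On an appliance you would pipe the command from step 3 into it, for example: /usr/lib/vmware-vmdir/bin/vdcrepadmin -f showpartnerstatus -h localhost -u administrator | summarize_partner_status. Any WARN line means a partner is unreachable or has not caught up, which is a reason to look closer, not proof that replication is broken.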

 

Please also note https://kb.vmware.com/s/article/85662?lang=en_us, which only barely covers the limitations of backing up vCenters that are in ELM.

 

GeoPerkins
Enthusiast

On your other question about cross-vCenter migration: as far as I am aware, XVM has no dependency on ELM. (I do not use NSX, so if NSX has ELM dependencies, that is another matter.) You can XVM to any other vCenter 6.0+ (ELM or not) as long as both sides of the migration are licensed at the Enterprise Plus level, which is the minimum requirement for XVM.

MeenaJ
Contributor

Thank you, this is great information. I am sorry it took so long to get back to you; my notifications were not working. I think VMware markets ELM as a great single pane of glass and neglects to inform people of these possible issues.
