VMware Cloud Community
gfricke
Contributor

How to use VMKernel vSphere Replication traffic

I am trying to set up my hosts and replication appliance with a standalone VMkernel port that has vSphere Replication traffic enabled, on its own VLAN.

After I configured this on two hosts (each with local datastores) and tried to replicate a VM from one host to the host running the VR appliance, the traffic left on the management VLAN/NIC of the host with the VM and arrived at the host with the appliance (where I am storing the data) on the incoming VM traffic NIC/VLAN.

I want 100% of my replication traffic off the management and VM networks, just like I have done with vMotion...

31 Replies
iangelov
VMware Employee

Sounds strange. Could you please check whether the VR appliance VM is connected to a non-management network?

mvalkanov
VMware Employee

Hi,

With the currently released versions of vSphere Replication, the only way to isolate the replication traffic is to set up different routing/shaping/etc. for ports 31031 and 44046. The source ESXi host sends the replication traffic over these ports to the VR server at the target site. The VR server then opens the .vmdks over port 902 on a host that has access to the target datastore.
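If you want to sanity-check that path, a plain TCP connect to those ports from a machine on the replication network goes a long way. Below is a hypothetical helper sketch (the function names and host arguments are made up; it only tests TCP reachability, not vSphere Replication itself):

```python
import socket

# Ports used by vSphere Replication, per this thread:
#   31031 - initial sync, source ESXi -> target VR server
#   44046 - ongoing replication, source ESXi -> target VR server
#   902   - NFC, VR server -> ESXi host with access to the target datastore
VR_PORTS = {"initial_sync": 31031, "ongoing_sync": 44046, "nfc": 902}

def port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def check_vr_path(target_vr_server: str, target_esxi_host: str) -> dict:
    """Check reachability of each VR-related port; keys name the hop."""
    return {
        "initial_sync": port_reachable(target_vr_server, VR_PORTS["initial_sync"]),
        "ongoing_sync": port_reachable(target_vr_server, VR_PORTS["ongoing_sync"]),
        "nfc": port_reachable(target_esxi_host, VR_PORTS["nfc"]),
    }
```

If the 31031/44046 checks only succeed from the management subnet and not from the replication VLAN, that is a strong hint the replication traffic is not where you want it.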

Regards,

Martin

gfricke
Contributor

So are you saying the option in the VMkernel creation dialog to tag replication traffic does nothing?

iangelov: Where should the VR appliance reside? I have VMs on VLAN 100, and my management network is also currently on VLAN 100. The VR appliance is on VLAN 100. The network I set up for replication traffic is on VLAN 200. What do I need to do to ensure all VR traffic flows over VLAN 200?

iangelov
VMware Employee

My mistake, please follow mvalkanov's advice.

gfricke
Contributor

Then what is the purpose of the new VMkernel option for tagging replication traffic? It doesn't work like vMotion tagging, which lets us route traffic over separate NICs based on the tag.

replication.JPG

mvalkanov
VMware Employee

Hi,

The option to tag replication traffic during VMkernel port creation in ESXi 5.0.x and 5.1.x is an experimental feature and is not officially supported yet.

You still need to route the replication traffic ports yourself somehow. Some people are using WAN optimizer solutions (Riverbed or similar).

The VR appliance needs to have access to vCenter Server and to the hosts that can write to the target datastores. A link to the KB article with the exact port numbers should be in the admin guide.

Regards,

Martin

gfricke
Contributor

Ok, so VMware released an unsupported feature in production software... brilliant. Can I have back the 4 hours of my life I spent trying to get this to work properly?


Gentlemen, thanks for the help.

iforbes
Hot Shot

I agree 100%. WTF!!! I thought the same thing about the new replication traffic tag. I'm using ESXi 5.5. Is this replication tag still just window dressing, or does it actually work? I have a question that I need clarity on. With SRM, vSphere Replication comes with a vSphere Replication management appliance (VRMA) AND vSphere Replication servers (VRS). I've read so many docs on the functionality of both of these, and there's ambiguity everywhere.

It's my impression that the protected-site ESXi servers send replication traffic to the VRS and NOT the VRMA (over port 31031 initially and port 44046 afterwards). From there it's the VRS, not the VRMA, that connects to the ESXi servers and sends the replicated blocks via NFC.

I also want to isolate replication traffic from the protected site to the recovery site. I assume the protected-site ESXi servers have to be able to route to the recovery-side VRS (NOT the VRMA) over the replication VLAN. This will likely mean I have to configure a static route on the protected-site ESXi servers to achieve this.

I'm hoping someone can 100% confirm for me that it's the VRS, and NOT the VRMA, that is responsible for receiving replication traffic from the protected-site ESXi servers. Documentation everywhere is ambiguous and frequently only discusses the VRMA.
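To illustrate why that static route matters: a host sends to a destination via a directly connected subnet when one matches, otherwise it falls back to the default gateway, which sits on the management network. A toy first-match sketch (the vmk names and subnets here are invented, and real ESXi routing is of course done by the kernel, not this function):

```python
import ipaddress

def egress_interface(vmk_networks: dict, destination: str, default_vmk: str = "vmk0"):
    """Pick the vmkernel interface whose subnet contains the destination.
    Falls back to default_vmk (the management default-gateway path) when no
    directly connected subnet matches - exactly the case a static route fixes."""
    dest = ipaddress.ip_address(destination)
    for vmk, network in vmk_networks.items():
        if dest in ipaddress.ip_network(network):
            return vmk
    return default_vmk

# Hypothetical layout: vmk0 = management, vmk2 = replication VLAN
vmks = {"vmk0": "10.0.100.0/24", "vmk2": "10.0.200.0/24"}

# VRS on the replication subnet: traffic leaves via vmk2
assert egress_interface(vmks, "10.0.200.50") == "vmk2"
# VRS on a remote subnet: without a static route it falls back to management
assert egress_interface(vmks, "192.168.50.10") == "vmk0"
```

So if the recovery-side VRS is not on the local replication subnet, a static route for the VRS address (pointing out the replication VLAN's gateway) is what keeps the traffic off vmk0.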

mvalkanov
VMware Employee

Hi iforbes,

About the replication traffic - you are correct:

- the source ESXi host sends replication data to the VR server (over port 31031 initially and port 44046 after that).

- the VR server connects to the ESXi host and sends the replicated blocks to the target datastores over NFC (port 902).

- the VRMS (VR management server) is not involved in the actual replication traffic processing.

The VRMS is responsible for replication management, including monitoring the source VMs for hardware/config changes and providing VR servers with access to the ESXi hosts in the same vCenter inventory.

About the appliances: the combined VR appliance contains the VRMS + an embedded DB for the VRMS + a VR server. The VR-server-only appliance has the same .vmdk bits, but lower memory requirements and only the VR server enabled in it.

Regards,

Martin

iforbes
Hot Shot

Hi Martin. Thanks for confirming. I just wish the documentation out there were a little more detailed in identifying what each appliance is responsible for :-). One other question: with vMotion I can set up multi-NIC vMotion to increase the throughput of my vMotions. Is it possible to do the same thing with vSphere Replication traffic? Currently, I have replication traffic being routed over a VMkernel port group specifically created for replication traffic (a different subnet than the hypervisor management VMkernel). The traffic gets routed to the target VRS, which resides in a replication traffic network port group. This all works fine.

This design only has one replication VMkernel port group at the source, so replication traffic can only go out a single interface. It doesn't matter if I dedicate multiple NICs to that replication VMkernel traffic; the traffic will only get pinned to a single physical NIC. As with multi-NIC vMotion, I would need multiple VMkernel ports dedicated to replication, with the NICs set up the same way as for multi-NIC vMotion (i.e. active/standby on one vmk, standby/active on the other).

Anyways, I'm wondering if anyone has been able to push replication traffic over more than a single interface?

mikez2
VMware Employee

mvalkanov wrote:

Hi iforbes,

About the replication traffic - you are correct:

- the source ESXi host sends replication data to the VR server (over port 31031 initially and port 44046 after that).

- the VR server connects to the ESXi host and sends the replicated blocks to the target datastores over NFC (port 902).

Don't forget port 80, which VR uses to perform other important operations on the replica-side hosts.

crumpuppet
Contributor

Ugggh, same here. I wasted a couple of hours wondering why the traffic was still going out through the management ports.

Surely there are plans to isolate this traffic, otherwise why would they put that option in the web client? Does anyone know when they plan on implementing this properly?

Smoggy
VMware Employee

I'm sure you understand we cannot give out dates for features, for all the usual legal reasons, but network traffic isolation improvements are on the to-do (or doing?) list. I appreciate that this does not give you a concrete date, but hopefully it gives you some comfort that we are listening and know this needs to be improved and enhanced.

crumpuppet
Contributor

I can understand that. Will the replication traffic isolation work the way the posters in this thread are expecting? I.e., when the feature is implemented, will replication traffic go exclusively over VMkernel ports tagged for replication traffic?

If so, I'm going to leave my configuration as-is, and just let replication traffic go over our data network for now (unfortunately VLAN tagging at switch level is not an option for me in this scenario).

Thanks for replying.

Smoggy
VMware Employee

I'd say you've got the idea... that's as much of a hint as I can give :)

VirtuallyMikeB

Have you implemented a dedicated vSphere Replication vmkernel port in 5.5 with static routes?  If so, have you tested to see if replication traffic is actually using it instead of the management vmk0?

-----------------------------------------
Please consider marking this answer "correct" or "helpful" if you found it useful (you'll get points too).
Mike Brown
VMware, Cisco Data Center, and NetApp dude
Sr. Systems Engineer
michael.b.brown3@gmail.com
Twitter: @VirtuallyMikeB
Blog: http://VirtuallyMikeBrown.com
LinkedIn: http://LinkedIn.com/in/michaelbbrown
sclarkenetcraft
Enthusiast

Did this get sorted in 5.5 U2 and associated VR/SRM updates?

Bleeder
Hot Shot

It doesn't look like 5.5 Update 2 fixed it. Here's hoping that Update 3 will finally make vSphere Replication usable.

sclarkenetcraft
Enthusiast

It's usable - just limited in documented functionality...

So for now we need to do some manual routing outside of vSphere to get replication traffic onto a separate network, or separate our management networks out even further than they already are.

At this stage I'm setting up management redundancy anyway, so I'll leave VR using the "primary" management network as it is and redirect everything else to use the "secondary" management network to get around the problem.
