Cloud Pod Architecture and Cisco Nexus 1000v Bug

I worked through this with VMware, and they are going to create a KB article on it, but I wanted to get it into the communities in case someone else is banging their head on the same thing.

We own two Vblocks across two datacenters.  They run the Nexus 1000v for the virtual networking component.

We are deploying VDI, and when we enabled Cloud Pod Architecture the global data replication worked great. However, all of the connection servers in the remote pod showed as red or offline.

We found that we could not telnet to the connection servers in the internal pod or the remote pod on port 8472.  All other ports were fine.
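For anyone who wants to repeat the telnet test across several connection servers at once, here is a minimal Python sketch. It only attempts a TCP connection to port 8472 and reports whether it succeeded. The function names are made up for illustration, and the timeout value is an arbitrary assumption.

```python
import socket


def tcp_port_open(host, port=8472, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and completes the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures
        return False


def check_servers(hosts, port=8472, timeout=3.0):
    """Map each host name to the result of the port check (hypothetical helper)."""
    return {host: tcp_port_open(host, port, timeout) for host in hosts}
```

You would call check_servers() with your own connection server names from a machine in each pod. A result of False only says the connection did not complete; it does not tell you where the packet was dropped.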

Finally, VMware View Support said they had seen one other instance of this with a customer running Nexus 1000v, and had found that there was a bug in the N1kV involving TCP checksum offload.

Cisco has a bug report posted about 8472 being dropped at the VEM for N1kV:

The bug report names TCP checksum offload as the root cause and says that only port 8472 packets are affected.

We didn’t want to rip and replace the N1kV with a vDS, so we dug into the OS to disable TCP offload instead.

We found this article:

So we disabled the following settings:

  • IPv4 Checksum Offload
  • Large Receive Offload (was not present for our vmxnet3 advanced configuration)
  • Large Send Offload
  • TCP Checksum Offload

We did this on each of the vmxnet3 adapters on every connection server in both of our datacenters.
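If there are a lot of adapters to touch, the change could probably be scripted rather than clicked through. The sketch below is only an illustration and not necessarily how the change was made here. It builds PowerShell command strings and does not run anything. It assumes Windows Server 2012 or later, where the NetAdapter module provides Set-NetAdapterAdvancedProperty. The adapter name is hypothetical, and the display names are copied from the list above, so they may not match what your vmxnet3 driver version shows. Check the real names with Get-NetAdapterAdvancedProperty first.

```python
# Display names taken from the list in this post. They are assumptions and
# may differ by vmxnet3 driver version; verify them on the server first.
OFFLOAD_SETTINGS = [
    "IPv4 Checksum Offload",
    "Large Send Offload",
    "TCP Checksum Offload",
]


def build_disable_commands(adapter_name, settings=OFFLOAD_SETTINGS):
    """Return one PowerShell command string per offload setting.

    adapter_name is whatever name Windows shows for the vmxnet3 adapter.
    Nothing is executed here; the strings are meant to be reviewed first.
    """
    return [
        f'Set-NetAdapterAdvancedProperty -Name "{adapter_name}" '
        f'-DisplayName "{setting}" -DisplayValue "Disabled"'
        for setting in settings
    ]
```

Calling build_disable_commands("Ethernet0") returns three command strings that you can read over and then paste into an elevated PowerShell session on each connection server. Expect the same brief network interruption as with the manual change.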

Once these were disabled (the change did cause the NIC to blip), we were able to telnet between the datacenters on port 8472 again.

We logged into the View Admin portal (this is Horizon 6.1) and saw all green for the remote connection servers.

We have tested this and validated that it now works as intended, so we are off to the races. (Don't forget that Home Sites are not built into the GUI in 6.1 for Cloud Pod.)

I hope this helps others identify and resolve this issue. I will be watching for that KB article.
