Duplicate rules when using ESG and DFW

bruno_infotrad · ‎12-15-2016

Hello,

Our team has started using NSX recently. After reading on it, here is our understanding of the high level architecture

ESGs are to be used for North-South communications. We assume North-South can be taken to be the same as External-Internal
DLRs are to be used for East-West communications. We assume East-West can be taken to be the same as Internal1-Internal2
Distributed Firewall (DFW) rules are applied at the VM (or the other object the rules apply to)

In our setup, we have 2 ESGs for two separate tenants. We have been experimenting with configuring firewall rules on both ESGs and on the DFW and we were able to get the various flows we needed to work.

In our first configuration, all the rules were implemented at the ESGs and we got flows to work by setting the last default rule on the DFW to be "allow from any to any". Apart from the fact that this does not look very good for security conscious organizations, it turns the DFW into a black list engine - weird thing for a firewall
In our second configuration, we had no (user) firewall rules implemented at the ESGs. Every rule was configured on the DFW and we played with the "Applied to" column to apply the rules we needed to the respective ESG. However, we did have to implement NAT rules on the ESGs, which makes for a weird mix (firewall and NAT rules are usually on the same firewall).
Finally, we tried to modify the first configuration by moving the rules back to their respective ESGs and changing the last default DFW rule to be "deny from any to any" (white list). Unfortunately, this does not work: the packets seem to be blocked by this default rule even though the ESG rule should allow them. We can't seem to get it to work without duplicating the ESG rule on the DFW, which is totally counterproductive as we end up with the same rule in two places (as seen in this example A Multi-Tenant Topology in VMware NSX - ipcraft.net)

Questions:

Anything we did wrong 🙂
Assuming we did not, which configuration is recommended by VMware?

hansroeder · ‎12-15-2016

The Distributed Firewall works on the VM vNIC level, whereas the firewall on the ESG works at the ESG itself. It makes total sense that if you allow traffic on the ESG firewall, that it will still be blocked by the Distributed Firewall, since this is then blocked at the vNIC level. The traffic will thus never reach the ESG. Of course we would need some more information regarding the setup and firewall rules to be sure. But keep in mind that there is a difference between both firewalls.
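To make the point concrete, here is a toy model (not NSX code) of two independent first-match firewalls that a flow has to clear one after the other. The `Rule` class, the rule names, and the three DFW rule sets are hypothetical and only mirror the configurations described in the first post.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Rule:
    name: str
    dst_port: Optional[int]  # None matches any destination port
    action: str              # "allow" or "deny"


def evaluate(rules: List[Rule], dst_port: int) -> str:
    """Return the action of the first rule that matches (first-match semantics)."""
    for rule in rules:
        if rule.dst_port is None or rule.dst_port == dst_port:
            return rule.action
    return "deny"  # nothing matched: treat as an implicit deny


def delivered(esg_rules: List[Rule], dfw_rules: List[Rule], dst_port: int) -> bool:
    """A flow gets through only if BOTH firewalls allow it; neither consults the other."""
    return evaluate(esg_rules, dst_port) == "allow" and evaluate(dfw_rules, dst_port) == "allow"


# Hypothetical ESG rule set: allow HTTP, then default deny.
esg = [Rule("allow-http", 80, "allow"), Rule("default", None, "deny")]

# Three hypothetical DFW rule sets matching the three situations in the thread.
dfw_default_allow = [Rule("default", None, "allow")]
dfw_default_deny = [Rule("default", None, "deny")]
dfw_duplicated = [Rule("allow-http", 80, "allow"), Rule("default", None, "deny")]

if __name__ == "__main__":
    print(delivered(esg, dfw_default_allow, 80))  # True
    print(delivered(esg, dfw_default_deny, 80))   # False
    print(delivered(esg, dfw_duplicated, 80))     # True
```

In this model the ESG "allow" never changes the DFW verdict, which matches the observation that a default "deny from any to any" on the DFW blocks HTTP until an allow rule also exists there.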

Regarding the correct way to do it, I would say do everything you can with the Distributed Firewall, so you can block traffic as close to the source as possible. You could, however, apply some rules to the ESG firewall, but I think it should make sense. For instance, in our Cloud environment, customers will set up an IPSec connection to their ESG. Then at the ESG firewall we will allow or block some traffic coming from their internal network. I think this makes sense, since this is North-South traffic, which will always go through the ESG, so we block it there (again, as close to the source as possible). Firewalling between the VMs of our customers is done at the Distributed Firewall level. However, should you choose to do all firewalling using the Distributed Firewall (if that's possible and desirable), I don't think you can really go wrong with that...

bruno_infotrad · ‎12-16-2016

Hello Hans,

Thank you for your reply. It confirms what we were thinking. This being said, let me point out a few items

...so you can block traffic as close to the source as possible

I agree with this statement but then, should the traffic not be blocked at the perimeter, i.e. the ESG?

I think this makes sense, since this is North-South traffic, which will always go through the ESG, so we block it there (again, as close to the source as possible)

It is the allow that causes the problem. I have the following setup:

ESG firewall rule

Last rule on ESG is

ESG NAT rule

As mentioned in my third bullet, with these two rules and the default DFW last rule of "deny from any to any" (screenshot below), the incoming HTTP traffic does not reach web servers on the Ubuntu VMs. And it is the rule below that blocks the traffic according to the flow monitoring tool:

In order for the traffic to be allowed, I have to add a rule on the DFW, which duplicates the rule on the ESG as seen below:

So, unless I am doing something wrong, we have to have the same rule in two places - hardly a manageable solution. And yes, I agree that I can do everything from the DFW but

it contradicts all the documentation (including VMware) we have seen on using the ESG firewall for managing North-South traffic;
it goes against the very idea of stopping the traffic as close to the source as possible;
It makes NAT management difficult (NAT rules are on the ESGs but firewall rules are on the DFW)

hansroeder · ‎12-16-2016

Yes, this makes sense. The Distributed Firewall doesn't care about what the ESG will allow, since it's a different firewall. (This can be the only logical explanation, since it is indeed what we observe). In that case, I would do all firewalling on the Distributed Firewall and only apply firewall rules to the ESG if this is possible (and logical!) or when you really need to.

I think you have a good point regarding firewalling and NAT, where you have to configure firewalling on the Distributed Firewall and NAT on the ESG. However, this doesn't have to be a bad thing. The ESG is for North-South services (VPN, NAT, Load Balancing, etc.). You can use it as a firewall, but you don't HAVE to. What you really don't want, and I know for sure you will agree with me, is to apply the same firewall rules in two locations.

Now I must confess that I've yet to really dive into everything related to firewalling in NSX, but I did check and found the following option for the "Applied To" field: "Apply this rule on all the Edge gateways". If you check this option (and remove the firewall rule on the ESG), does it work? You then only have to configure the rule once, but it will be enforced on both firewalls. If it works, of course. Please let me/us know...

bruno_infotrad · ‎12-16-2016

I think you have a good point regarding firewalling and NAT, where you have to configure firewalling on the Distributed Firewall and NAT on the ESG. However, this doesn't have to be a bad thing. The ESG is for North-South services (VPN, NAT, Load Balancing, etc.). You can use it as a firewall, but you don't HAVE to.

OK but then, I do not understand why VMware needed to make it so complicated with ESGs, DLRs and the distributed firewall. Pure switching and the distributed firewall would have been enough. In any case, they should review their recommended architecture.

What you really don't want, and I know for sure you will agree with me, is to apply the same firewall rules in two locations.

You are absolutely right: I do not want this

Now I must confess that I've yet to really dive into everything related to firewalling in NSX, but I did check and found the following option for the "Applied To" field: "Apply this rule on all the Edge gateways". If you check this option (and remove the firewall rule on the ESG), does it work? You then only have to configure the rule once, but it will be enforced on both firewalls. If it works, of course. Please let me/us know...

Yes, we played with that too. It does work when slightly modifying the distributed firewall rule to include the external IPs associated with the NAT rules as seen below.

Then, rule #1 on the ESG (as seen above) can be disabled/deleted and the traffic still flows to the web servers.
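A rough way to picture why the external IPs had to be added is sketched below. It is a toy model, not NSX code, and it rests on an assumption that is not confirmed in this thread: that the Edge matches the rule against the pre-NAT (external) destination while the DFW at the vNIC sees the post-NAT (internal) one. All addresses, the `Rule` class, and the "edge"/"dfw" labels are hypothetical.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class Rule:
    name: str
    destinations: Tuple[str, ...]  # CIDR strings the rule allows as destination
    dst_port: int
    applied_to: FrozenSet[str]     # enforcement points: "edge" and/or "dfw"


def permits(rule: Rule, point: str, dst_ip: str, dst_port: int) -> bool:
    """True if the rule is enforced at `point` and matches the destination."""
    if point not in rule.applied_to or dst_port != rule.dst_port:
        return False
    return any(ip_address(dst_ip) in ip_network(net) for net in rule.destinations)


# Hypothetical DNAT mapping on the ESG: external address -> internal web server.
EXTERNAL_IP = "203.0.113.10"
INTERNAL_IP = "10.0.1.10"


def inbound_http_delivered(rule: Rule) -> bool:
    """Assumption: the Edge sees the pre-NAT address, the DFW sees the post-NAT one."""
    return permits(rule, "edge", EXTERNAL_IP, 80) and permits(rule, "dfw", INTERNAL_IP, 80)


both_points = frozenset({"edge", "dfw"})

# Single rule listing only the internal subnet: matches at the vNIC, not at the Edge.
internal_only = Rule("allow-http", ("10.0.1.0/24",), 80, both_points)

# Same rule with the NAT external IP added to the destinations.
with_external = Rule("allow-http", ("10.0.1.0/24", "203.0.113.10/32"), 80, both_points)

if __name__ == "__main__":
    print(inbound_http_delivered(internal_only))  # False
    print(inbound_http_delivered(with_external))  # True
```

Under that assumption, one rule applied to both enforcement points is enough, provided its destination covers both the external and the internal addresses.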

hansroeder · ‎12-16-2016

Personally, I don't really think it's that complicated. Every NSX component performs a certain task, or multiple tasks. You have the ESG for North-South routing and stateful services (unless you're using ECMP, then it's just North-South routing). You can also apply some firewall rules on the ESG, if needed (some implementations may not even use the Distributed Firewall functionality, so in those cases edge firewalling will become more important). Then you have the DLR, which is used to optimize East-West routing. The DLR also allows for firewalling, but it's not really a best practice to do so. In short, you should simply not perform any firewalling on the DLR (and why would you?). Last but certainly not least you have the Distributed Firewall, which you can configure directly, or populate dynamically using Service Composer defined rulesets (or both).

However, I do partially agree with you in that it's maybe a bit too much and that it could be simpler. Keep in mind, however, that NSX is Nicira + vShield, so it's a combination of two separate things. I firmly believe that it will become simpler and more transparent in the future. One day (and that day will come faster than you might imagine), NSX-v will be replaced by what is now called NSX Transformers, which should be more straightforward (from what I've heard about it at least).

Also, keep in mind that what VMware says does not always represent that which you need for your specific requirements. Just for the fun of it, I worked out some possible NSX scenarios for different customers (which do not have NSX at the moment), and they are all quite different from each other. There is of course a lot of overlap, but every customer (and therefore every environment) is different, so the requirements are different as well. In my opinion, you should use whatever VMware gives you as a guideline, from which you can create your own implementation.

In your case, I would suggest just using the Distributed Firewall. I agree with keeping the firewall rules as close to the source (or destination, for that matter) as possible, but only if it works (duh!) and/or makes sense. Looking at your issue, I would say that there really is no need to do any firewalling on the ESG.

Could you point me to the documentation that you've used so far? Maybe this can clarify things (for me, at least).

bruno_infotrad · ‎12-16-2016

Thanks, Hans,

Here are some links we have used:

A Multi-Tenant Topology in VMware NSX - ipcraft.net very clear and explains things well - but implements a duplicate firewall rule for HTTP access 🙂

VMware NSX 6.2 Beginners Guide – From Zero to Full Deployment for Labs | Virten.net for installation

http://www.icc-usa.com/resources/vmw-nsx-network-virtualization-design-guide.pdf VMware guide. Page 15 contains the following statement,

"The NSX Edge supports stateful firewalling capabilities, which complement the Distributed Firewall (DFW) enabled in the Kernel of the ESXi hosts. While the DFW is mostly utilized to enforce security policies for communication between workloads connected to logical networks (the so called east-west communication), the firewall on the NSX Edge is usually filtering communications between the logical space and the external physical network (north-south traffic flows)"

which clearly contradicts what we will be doing by implementing every rule on the distributed firewall.

http://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/partners/nsx/dell-vmwarensx-referen... reference architecture from VMware/Dell

VMware Hands-on Labs - HOL-SDC-1625 VMware hands-on labs

JJBN · ‎12-24-2016

The "apply to all ESGs" option is a great feature, but it would be even better if it could apply to all ESGs attached to the selected Distributed Logical Switch. The issue in our setup is that we use Universal Logical Switches, so we don't have this option. In a scenario with hardware VTEPs, where physical and virtual servers hang off the same logical switch, the DFW can only be applied to the VMs, while the ESG firewall covers the physical servers (and also the VMs for North-South traffic). What we have done is apply negative security on the DFW (deny the undesired East-West VM-to-VM traffic inside the tenant and allow everything else) and positive security on the ESG (only allow what is known to be good) for the North-South traffic that affects both virtual and physical servers. With this setup we have avoided duplicate firewall rules on the ESG and DFW. But we still have to apply the rules twice, this time on the ESGs: since we have ESGs on the primary and secondary sites, the same rules must be configured on the ESG in the primary site and on the ESG in the secondary site.

Is there any roadmap to allow the "apply to all ESGs" option to be supported for universal DFW?

Is there any roadmap to support dynamic tags using universal DFW?

How could we automate the rules applied on primary ESG to be replicated to the secondary ESG?

Thanks.

JJBN

hansroeder · ‎12-24-2016

All things "Universal" are still, unfortunately, a bit tricky to do right, since not all information is replicated between PSCs. I firmly believe that this will improve within the next couple of NSX releases.

Regarding automation, you could use the NSX Manager REST API. And even better, use PowerNSX. I'm in the process of deploying a Cross-vCenter NSX implementation (5 vCenters in total) for a customer and will be using PowerNSX a lot, since I don't really like doing the same thing over and over and over.
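As a rough idea of what the REST route could look like, here is a minimal Python sketch (Python rather than PowerNSX, purely for illustration) that reads the firewall configuration of one Edge and writes it to another. The resource path `/api/4.0/edges/{edgeId}/firewall/config` is an assumption to verify against the NSX API guide for your version, and the function names, manager host names, and Edge IDs are hypothetical.

```python
import base64
import urllib.request
from typing import Callable, Dict, Optional

# Assumption: NSX-v Edge firewall configuration resource; check your API guide.
EDGE_FW_PATH = "/api/4.0/edges/{edge_id}/firewall/config"


def edge_firewall_url(manager: str, edge_id: str) -> str:
    """Build the URL of an Edge's firewall configuration on a given NSX Manager."""
    return f"https://{manager}{EDGE_FW_PATH.format(edge_id=edge_id)}"


def basic_auth_header(user: str, password: str) -> Dict[str, str]:
    """HTTP Basic authentication header."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return {"Authorization": f"Basic {token}"}


def http_call(method: str, url: str, headers: Dict[str, str],
              body: Optional[bytes] = None) -> bytes:
    """Send one HTTP request and return the response body."""
    request = urllib.request.Request(url, data=body, method=method, headers=headers)
    with urllib.request.urlopen(request) as response:
        return response.read()


def replicate_edge_firewall(src_url: str, dst_url: str, headers: Dict[str, str],
                            call: Callable[..., bytes] = http_call) -> bytes:
    """GET the firewall config from the source Edge and PUT it unchanged to the target."""
    config_xml = call("GET", src_url, headers)
    call("PUT", dst_url, {**headers, "Content-Type": "application/xml"}, config_xml)
    return config_xml


if __name__ == "__main__":
    # Hypothetical managers and Edge IDs; replace with your own before running.
    headers = basic_auth_header("admin", "changeme")
    src = edge_firewall_url("nsxmgr-primary.example.com", "edge-1")
    dst = edge_firewall_url("nsxmgr-secondary.example.com", "edge-2")
    replicate_edge_firewall(src, dst, headers)
```

A verbatim copy like this is only a starting point: rules that reference objects local to one NSX Manager (IP sets, security groups, vNIC names) would probably need to be rewritten before the PUT, and a manager with a self-signed certificate needs extra TLS handling that is left out here.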

JJBN · ‎12-24-2016

Thanks Hans for the answer! I will look into using PowerNSX to automate the implementation of firewall rules on the ESGs.

Regards,

JJBN