5 Replies Latest reply on Feb 25, 2020 11:11 PM by HassanAlKak88

    NSX environment performance issue

    Rahul418282 Lurker

      Hello Experts,

       

      The customer is facing performance issues in a VMware SDDC environment and is blaming NSX for them. I can see that the ESXi hosts are healthy, the rules are pushed to the hypervisors, and there are no errors on the NSX dashboard, yet they still see latency when traffic passes through VXLAN and the DFW rules.

       

      Can anyone guide me on what additional parameters I should check? Any suggestions are welcome.

       

      Thank you!!

      Rahul Kumar

        • 1. Re: NSX environment performance issue
          Sreec Master
          Community Warriors, vExpert

          I'm sorry to say this, but you need to do your homework to isolate this performance issue.

           

          1. What latency are you seeing when users report the issue? How did you test the latency? Are there any strict latency requirements for those apps?

          2. How is this setup designed?

          3. Which kinds of workloads have performance issues?

          4. What type of traffic is showing performance issues?

          5. Have these issues been present from the beginning?

          6. Was there any recent change in the setup?

          7. Do the issues occur in a specific time frame, or are they intermittent?

          8. Do you have any performance monitoring tools or software in this setup?

           

          Please watch the VMworld 2017 US session NET1343BU, "NSX Performance Deep Dive" (available on YouTube), and never ignore the vSphere design; it can be a contributing factor as well.

          • 2. Re: NSX environment performance issue
            Rahul418282 Lurker

            Hello Sreec,

             

            My answers to your questions are inline below.

             

            1. What latency are you seeing when users report the issue? How did you test the latency? Are there any strict latency requirements for those apps? Answer: We test with the httperf tool, with rate = 10000, installed on the source VM in the LAN cluster.

             

            2. How is this setup designed? Answer: Three ESGs in ECMP mode connect down to one DLR. Separate ESGs in one-arm mode are used as load balancers for the backend servers.

            There are only two clusters, both under the same datacenter at the vCenter level: one LAN cluster (VXLAN not configured) and one VXLAN cluster (VXLAN configured). The source VM is in the LAN cluster and the target VMs are in the VXLAN cluster (micro-segmentation is in place, with DFW rules allowing the traffic; the target VMs are behind separate ESGs in one-arm mode).

             

            3. Which kinds of workloads have performance issues? Answer: All applications hosted in the VXLAN cluster.

             

            4. What type of traffic is showing performance issues? Answer: TCP traffic, most of the time.

             

            5. Have these issues been present from the beginning? Answer: Not from the beginning. We upgraded NSX from 6.3.4 to 6.4.5 in Oct-Nov 2019, and after that the customer started reporting these issues on the platform. I can't find any related bug reported by VMware on the internet.

             

            6. Was there any recent change in the setup? Answer: No, except for the NSX upgrade in Oct-Nov 2019.

             

            7. Do the issues occur in a specific time frame, or are they intermittent? Answer: It happens in every test they run across the platform.

             

            8. Do you have any performance monitoring tools or software in this setup? Answer: Apart from httperf, no other tool is being used to monitor the latency. Any advice?
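One low-cost complement to httperf is to record TCP connect times from the source VM and look at percentiles instead of a single average. The following is only a minimal Python sketch under stated assumptions: the function names (`tcp_connect_ms`, `summarize`, `probe`) and any host, port, or sample count you pass in are illustrative placeholders, not values from this environment.

```python
import socket
import statistics
import time


def tcp_connect_ms(host, port, timeout=2.0):
    """Return the time in milliseconds to complete one TCP handshake."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000.0


def summarize(samples_ms):
    """Summarize latency samples; percentiles expose the tail that a mean hides."""
    if len(samples_ms) < 2:
        raise ValueError("need at least two samples")
    # quantiles(n=100) returns 99 cut points; index 94 is p95, index 98 is p99.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {
        "min": min(samples_ms),
        "p50": statistics.median(samples_ms),
        "p95": cuts[94],
        "p99": cuts[98],
        "max": max(samples_ms),
    }


def probe(host, port, count=100, pause_s=0.05):
    """Collect `count` connect-time samples against host:port and summarize them."""
    samples = []
    for _ in range(count):
        samples.append(tcp_connect_ms(host, port))
        time.sleep(pause_s)
    return summarize(samples)
```

Calling `probe(...)` once against a target in the LAN cluster and once against a target in the VXLAN cluster would give two summaries to compare, which may help show whether the extra latency appears only on the VXLAN/DFW path.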

             

             

            • 3. Re: NSX environment performance issue
              HassanAlKak88 Expert
              vExpert

              Hello,

               

              A quick hint: are they using the Applied To option under NSX DFW, or did they keep the default?

               

              Applied To defines the scope in which a rule is enforced, which decreases the number of rules applied per VM network adapter.

               

              Check the following: https://www.esvr.cloud/2017/08/10/the-importance-of-nsx-distributed-firewall-applied-to/
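To illustrate why the Applied To scope matters, here is a deliberately simplified Python model of how many rules land on each vNIC. It is a sketch, not NSX code: the `Rule` class, the `"DFW"` label for the default scope, and the group names are all assumptions made for illustration, and real per-vNIC counts depend on how NSX programs the rules on each host.

```python
from dataclasses import dataclass

ALL_VNICS = "DFW"  # stands in for the default scope: every protected vNIC


@dataclass(frozen=True)
class Rule:
    name: str
    # Set of group names the rule is scoped to, or the default scope.
    applied_to: frozenset = frozenset({ALL_VNICS})


def rules_per_vnic(rules, vnic_membership):
    """Count how many rules land on each vNIC.

    vnic_membership maps a vNIC name to the set of groups it belongs to.
    """
    counts = {}
    for vnic, groups in vnic_membership.items():
        counts[vnic] = sum(
            1 for r in rules
            if ALL_VNICS in r.applied_to or r.applied_to & groups
        )
    return counts


vnics = {"web-01.eth0": {"web"}, "db-01.eth0": {"db"}}

# Default scope: every rule is pushed to every vNIC.
default_rules = [Rule("web-http"), Rule("web-https"), Rule("db-sql"), Rule("db-backup")]

# Scoped: each rule is pushed only to the vNICs that need it.
scoped_rules = [
    Rule("web-http", frozenset({"web"})),
    Rule("web-https", frozenset({"web"})),
    Rule("db-sql", frozenset({"db"})),
    Rule("db-backup", frozenset({"db"})),
]

default_counts = rules_per_vnic(default_rules, vnics)
scoped_counts = rules_per_vnic(scoped_rules, vnics)
```

In this toy example each vNIC carries all four rules with the default scope but only two once the rules are scoped, which is the effect described above.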

              • 4. Re: NSX environment performance issue
                Rahul418282 Lurker

                Hello HassanAlKak88

                 

                Yes. The problem is that the customer created all the DFW rules with "Applied To" set to DFW, so every rule has been applied to every vNIC of the VMs hosted on the platform. Although there are only around 1500-1700 firewall rules, the per-vNIC count has exceeded the supported number (3500 max as per VMware); in my case it's over 5700. This is what the VMware support team concluded after I raised the case with them, and they identified it as the root cause of the performance issues.

                 

                I don't have visibility into which rule is used for what. Has anyone faced this situation before, and if so, what was done to rewrite the existing rules?
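As a starting point for that kind of cleanup, a first triage pass could list the rules still left at the default scope and show how far each vNIC is over the limit. The Python sketch below works on a hypothetical export: the dictionary field names (`id`, `name`, `applied_to`), the `"DFW"` scope label, and the vNIC names are illustrative assumptions, not an NSX schema. The 3500 figure is the limit quoted in this thread and should be verified for your NSX version.

```python
# Limit quoted in this thread (attributed to VMware); verify it for your version.
PER_VNIC_RULE_LIMIT = 3500

GLOBAL_SCOPE = "DFW"  # label used here for rules left at the default Applied To


def globally_scoped(rules):
    """Return the rules whose Applied To is still the default (whole DFW)."""
    return [r for r in rules if GLOBAL_SCOPE in r["applied_to"]]


def over_limit(per_vnic_counts, limit=PER_VNIC_RULE_LIMIT):
    """Map each vNIC above the limit to the number of rules it must shed."""
    return {
        vnic: count - limit
        for vnic, count in sorted(per_vnic_counts.items())
        if count > limit
    }


# Hypothetical export: the field names are illustrative, not an NSX schema.
rules = [
    {"id": 1001, "name": "web-to-app", "applied_to": ["DFW"]},
    {"id": 1002, "name": "app-to-db", "applied_to": ["sg-app", "sg-db"]},
    {"id": 1003, "name": "mgmt-ssh", "applied_to": ["DFW"]},
]

candidates = globally_scoped(rules)  # rules to review for a narrower scope
excess = over_limit({"vm-a.eth0": 5700, "vm-b.eth0": 1200})
```

Here `candidates` holds the rules worth reviewing first, and `excess` shows that a vNIC carrying 5700 rules would need to shed 2200 to get back under the quoted limit.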

                • 5. Re: NSX environment performance issue
                  HassanAlKak88 Expert
                  vExpert

                  Hello Dear,

                   

                  To handle this kind of problem, you have to make a global assessment of all your firewall rules and try the following: