2 Replies Latest reply on Jan 8, 2019 8:23 AM by vswitchzero

    NSX controller v6.3.5 logrotate high cpu

    jedijeff Hot Shot

      Hi. We may have hit the known high-CPU bug in NSX Controller v6.3.5.

      We have vRealize Network Insight and have turned off controller polling.

      We also followed this KB article:


      https://kb.vmware.com/s/article/56811

       

      We don't really see any log files, yet within about 10 minutes of the controller starting there are about 15 logrotate processes, which max out all 4 CPUs. Eventually the controller disconnects from the cluster.

      Does anyone know what else to try? I spent a good hour going through it all with a very good VMware tech, and the conclusion was pretty much to upgrade off of v6.3.5.

      Oddly, it is only happening to one controller. I was told we can't really delete just one controller and that we need to delete all three and re-create them. I am just curious why.

      Thanks,

        • 1. Re: NSX controller v6.3.5 logrotate high cpu
          RShankar22 Lurker
          VMware Employees

          You can try deleting the impacted controller.

          • 2. Re: NSX controller v6.3.5 logrotate high cpu
            vswitchzero Enthusiast
            vExpert

            In the past, VMware always recommended deleting the entire control cluster as a precaution when things went bad - even with a single controller. There were some 'quirks' in older builds of NSX that necessitated this. I don't recall the specifics, but I believe the replacement of the entire cluster prevented cluster election issues. In my personal experience, newer builds - 6.3.5 included - should be fine. That said, it's critical that you confirm your control cluster health before deleting a controller (show control-cluster status). Cluster majority is required for the control plane to function in a read/write capacity (i.e. 2 of 3 controllers need to be up). If the other two are completely healthy, you should be able to delete and re-create a single controller node without any noticeable control plane impact. I'd definitely do it during a maintenance window though to be safe.
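
To make the majority rule concrete, here is a rough Python sketch of the arithmetic. It is only an illustration, not anything NSX runs: the function names `majority` and `has_quorum` are made up for this example, and it assumes quorum is simply a strict majority of the configured cluster size.

```python
def majority(cluster_size: int) -> int:
    """Smallest node count that is a strict majority of the cluster."""
    return cluster_size // 2 + 1


def has_quorum(cluster_size: int, healthy_nodes: int) -> bool:
    """True if enough healthy nodes remain to form a majority.

    Assumption for illustration: quorum is measured against the
    configured cluster size, not the number of nodes currently up.
    """
    return healthy_nodes >= majority(cluster_size)


if __name__ == "__main__":
    # Three-node control cluster, as in this thread.
    print(majority(3))       # nodes needed for majority
    print(has_quorum(3, 2))  # one controller removed, two healthy
    print(has_quorum(3, 1))  # a second controller fails during the work
```

With three controllers, two healthy nodes still form a majority, which is why removing the one bad node should be tolerable. If a second node dropped out while the replacement was being deployed, the majority would be lost, hence the maintenance window.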

             

            Hope this helps.

            My blog: https://vswitchzero.com
            Follow me on Twitter: @vswitchzero