Hi, I am just getting started with NSX-T. I have a greenfield deployment of an NSX-T Manager 2.5 node.
I booted this and connected it to vCenter to deploy two extra nodes, for a total of three manager nodes.
All of these nodes run at 100% CPU, using all 4 cores.
When logging into the Linux root shell, I see in "top" that java is using all the CPU.
Any suggestions for this behaviour? It's been a few hours and they are still hogging all the CPU.
In nested environments I have always noticed this high CPU usage. I run on an E5-2640 v2 and see the same behavior.
I have not observed the same in non-nested environments with newer CPUs in customer implementations.
Based on the information that it is only using 4 cores, it seems you are using the Small appliance. Please note that the Small VM is only suitable for PoC.
I have seen this behavior even on a Medium NSX Manager if the processors are not that fast. For customers I have always used the Large Manager and have not observed high CPU usage. You can also increase the CPU core count of the Manager VM, as they are a little CPU hungry.
mauricioamorim
You're right, this is only a PoC.
I am currently running 4 vCPUs, yes. Even if I go up to 8 cores, they still use an insane amount of CPU.
With 4 vCPUs they are still using about 7 GHz of CPU in total.
Also, when running tcpdump on one of the manager hosts, I am seeing about 1,200 packets per second from the two other hosts on port 9000. I guess this is cluster sync and API calls, but 1,200 packets every second, is that normal?
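For anyone who wants to reproduce that measurement: a rough way to get a packets-per-second figure is to capture on the management interface for a fixed window and divide by the window length. The interface name (eth0) and the 10-second window below are assumptions; adjust for your setup.

```shell
# Count packets on the cluster port for 10 seconds, then derive packets/sec.
# eth0 and port 9000 are assumptions; change them to match your environment.
count=$(timeout 10 tcpdump -nn -i eth0 'tcp port 9000' 2>/dev/null | wc -l)
echo "$(( count / 10 )) packets/sec"
```

This counts printed packet lines rather than using tcpdump's own capture statistics, which is close enough for a sanity check like this.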
Even with 8 cores, the java processes are still using all the CPU. This host is running:
top - 22:05:08 up 16 min, 1 user, load average: 8.16, 8.14, 6.12
Tasks: 228 total, 1 running, 137 sleeping, 0 stopped, 0 zombie
%Cpu(s): 15.6 us, 33.3 sy, 0.0 ni, 47.2 id, 0.7 wa, 0.0 hi, 3.2 si, 0.0 st
KiB Mem : 20545804 total, 1977508 free, 10460276 used, 8108020 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 9658316 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
12927 rabbitmq 20 0 4107308 29748 5084 S 68.5 0.1 0:05.14 beam.smp
6254 uproton 20 0 7402400 1.368g 18020 S 49.5 7.0 11:35.48 java
5401 corfu 10 -10 8576900 725512 20600 S 44.9 3.5 7:03.95 java
4535 nsx-cbm 10 -10 6163244 422036 22068 S 43.6 2.1 4:40.20 java
5685 uproton 20 0 9.888g 2.503g 16856 S 37.7 12.8 14:28.49 java
1184 nsx 10 -10 9.863g 1.149g 32884 S 32.1 5.9 5:17.04 java
7 root 20 0 0 0 0 S 6.6 0.0 0:38.63 ksoftirqd/0
2057 rabbitmq 20 0 5952776 133252 7280 S 4.6 0.6 1:49.24 beam.smp
4536 ucminv 20 0 5911800 756924 17688 S 4.3 3.7 2:11.75 java
6218 uphc 20 0 5874616 385424 16044 S 4.3 1.9 1:37.05 java
4583 elastic+ 20 0 6716980 1.189g 22572 S 3.9 6.1 2:59.54 java
4857 uuc 20 0 5971452 765740 17628 S 3.3 3.7 2:18.02 java
5123 uproxy 20 0 6039064 527028 18524 S 2.6 2.6 1:49.00 java
8 root 20 0 0 0 0 I 2.0 0.0 0:21.45 rcu_preempt
593 root 20 0 0 0 0 S 1.3 0.0 0:15.33 jbd2/dm-1-8
892 syslog 20 0 404412 7092 3536 S 1.3 0.0 0:09.30 rsyslogd
9 root 20 0 0 0 0 I 0.7 0.0 0:05.05 rcu_sched
1030 root 20 0 9512 136 12 S 0.7 0.0 0:07.46 rngd
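As a side note, you can total the java share of a snapshot like the one above with a short awk pipeline. A sketch, using a few of the lines above as a captured sample (in live use, feed it `top -bn1` instead):

```shell
# Sum the %CPU column (field 9) for all java processes in a top snapshot.
# A captured sample stands in here for live `top -bn1` output.
sample='6254 uproton  20   0 7402400 1.368g 18020 S 49.5  7.0 11:35.48 java
5401 corfu    10 -10 8576900 725512 20600 S 44.9  3.5  7:03.95 java
4535 nsx-cbm  10 -10 6163244 422036 22068 S 43.6  2.1  4:40.20 java'
total=$(echo "$sample" | awk '$NF=="java" {s+=$9} END {print s}')
echo "java total: ${total}% CPU"
```

Against the full snapshot above, the java processes alone add up to well over two cores' worth of CPU.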
So even with 32 GB RAM and 8 vCPUs, which is more than the "Medium" size, I am still seeing huge CPU usage.
Looking at the packet count and the picture in the last post, you can see it has used 600 MB of data in only ~20 minutes of uptime. And I have no VMs or networks connected to these...
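Those two observations are at least consistent with each other. A back-of-envelope check (treating 600 MB as 600 MiB) shows the traffic volume matches the observed packet rate at a plausible average packet size:

```shell
# Rough cross-check: 600 MB over ~20 minutes at ~1200 packets/sec
# implies an average packet size of a few hundred bytes.
bytes=$((600 * 1024 * 1024))   # ~600 MB of data observed
secs=$((20 * 60))              # ~20 minutes of uptime
pps=1200                       # packets/sec seen in tcpdump
avg=$(( bytes / (secs * pps) ))
echo "avg bytes/packet: $avg"
```

Roughly 440 bytes per packet, which is believable for steady cluster-sync chatter rather than bulk transfer.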
So you're running 2 managers (not 3) and you're doing this in a nested vSphere setup?
daphnissov, click the picture; it's three managers.
Yes, it's a nested environment, with hardware-assisted virtualization exposed to the guest CPUs. Isn't that what VMware themselves use for their labs in their online classrooms?
I know it's not the most powerful CPU. But still, should an idling NSX-T Manager really eat that much CPU?
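For what it's worth, you can confirm from inside a Linux guest running on the nested ESXi host whether hardware-assisted virtualization is actually exposed, by looking for the Intel VT-x (vmx) or AMD-V (svm) CPU flags:

```shell
# Check whether VT-x/AMD-V is exposed to this guest via the CPU feature flags.
# Run inside a Linux VM on the nested host; ESXi itself has no /proc/cpuinfo.
if grep -qE 'vmx|svm' /proc/cpuinfo 2>/dev/null; then
  hwvirt=yes
else
  hwvirt=no
fi
echo "hardware virtualization exposed: $hwvirt"
```

If this reports no, the nested managers fall back to software-assisted virtualization, which would explain a lot of extra CPU burn.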
Manufacturer HP
Model ProLiant DL380p Gen8
CPU
Logical processors 32
Processor type Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
Sockets 2
Cores per socket 8
Hyperthreading Enabled
In nested environments I have always noticed this high CPU usage. I run on an E5-2640 v2 and see the same behavior.
I have not observed the same in non-nested environments with newer CPUs in customer implementations.
You're right. I never got it working properly nested on those E5-2650 CPUs. So I deployed another physical ESXi host for the management cluster, and now I'm all fine.
We saw the same issue in a nested NSX-T lab I set up for a friend. We solved it by removing the CPU reservations on the NSX Managers. That fixed our problem, and CPU usage became similar to a non-nested NSX Manager.
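If anyone wants to script that change, something like the following govc sketch should work. The VM names are placeholders, and I believe `-cpu.reservation` is the right `govc vm.change` flag, but verify with `govc vm.change -h` for your version before running it for real:

```shell
# Dry run: print the govc commands that would clear the CPU reservation
# on each NSX Manager VM. VM names are hypothetical placeholders.
cmds=$(for vm in nsx-mgr-01 nsx-mgr-02 nsx-mgr-03; do
  echo "govc vm.change -vm $vm -cpu.reservation 0"
done)
echo "$cmds"
```

To apply, set GOVC_URL/GOVC_USERNAME/GOVC_PASSWORD for your vCenter and run the printed commands (or pipe them to `sh`). The same change can of course be made per VM in the vSphere Client under Edit Settings > CPU > Reservation.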