Hi, I am just getting started with NSX-T. I have a greenfield deployment of an NSX-T Manager 2.5 node.
I booted this and connected it to vCenter to deploy two extra nodes, for a total of three manager nodes.
All of these nodes run at 100% CPU, using all 4 cores.
When logging into the Linux root shell, I see in "top" that java is using all the CPU.
Any suggestions for this behaviour? It's been a few hours and they are still hogging all the CPU.
In nested environments I have always noticed this high CPU usage. I run on an E5-2640 v2 and see the same behavior.
I have not observed the same in non-nested environments with newer CPUs in customer implementations.
Based on the information that it is only using 4 cores, it seems you are using the Small appliance. Please note that the Small VM is only suitable for PoC.
I have seen this behavior even on a Medium NSX Manager if the processors are not that fast. For customers I have always used the Large Manager and have not observed high CPU usage. You can also increase the CPU core count of the Manager VM, as they are a little CPU hungry.
mauricioamorim
You're right, this is only a PoC.
I am currently running 4 vCPUs, yes. Even if I go up to 8 cores, they still use an insane amount of CPU.
With 4 vCPUs they are still using about 7 GHz of CPU in total.
Also, when running tcpdump on one of the manager hosts, I am seeing about 1,200 packets per second from the two other hosts on port 9000. I guess this is cluster sync and API calls, but 1,200 packets every second, is that normal?
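For anyone who wants to reproduce that measurement: a rough way to get a packets-per-second figure is to capture on the management interface for a fixed window and divide by the window length. The interface name (eth0) and the 10-second window below are assumptions; adjust for your setup.

```shell
# Count packets on the cluster port for 10 seconds, then derive packets/sec.
# eth0 and port 9000 are assumptions; change them to match your environment.
count=$(timeout 10 tcpdump -nn -i eth0 'tcp port 9000' 2>/dev/null | wc -l)
echo "$(( count / 10 )) packets/sec"
```

This counts printed packet lines rather than using tcpdump's own capture statistics, which is close enough for a sanity check like this.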
Even with 8 cores, the java processes are still using all the CPU. This host is running:
top - 22:05:08 up 16 min, 1 user, load average: 8.16, 8.14, 6.12
Tasks: 228 total, 1 running, 137 sleeping, 0 stopped, 0 zombie
%Cpu(s): 15.6 us, 33.3 sy, 0.0 ni, 47.2 id, 0.7 wa, 0.0 hi, 3.2 si, 0.0 st
KiB Mem : 20545804 total, 1977508 free, 10460276 used, 8108020 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 9658316 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
12927 rabbitmq 20 0 4107308 29748 5084 S 68.5 0.1 0:05.14 beam.smp
6254 uproton 20 0 7402400 1.368g 18020 S 49.5 7.0 11:35.48 java
5401 corfu 10 -10 8576900 725512 20600 S 44.9 3.5 7:03.95 java
4535 nsx-cbm 10 -10 6163244 422036 22068 S 43.6 2.1 4:40.20 java
5685 uproton 20 0 9.888g 2.503g 16856 S 37.7 12.8 14:28.49 java
1184 nsx 10 -10 9.863g 1.149g 32884 S 32.1 5.9 5:17.04 java
7 root 20 0 0 0 0 S 6.6 0.0 0:38.63 ksoftirqd/0
2057 rabbitmq 20 0 5952776 133252 7280 S 4.6 0.6 1:49.24 beam.smp
4536 ucminv 20 0 5911800 756924 17688 S 4.3 3.7 2:11.75 java
6218 uphc 20 0 5874616 385424 16044 S 4.3 1.9 1:37.05 java
4583 elastic+ 20 0 6716980 1.189g 22572 S 3.9 6.1 2:59.54 java
4857 uuc 20 0 5971452 765740 17628 S 3.3 3.7 2:18.02 java
5123 uproxy 20 0 6039064 527028 18524 S 2.6 2.6 1:49.00 java
8 root 20 0 0 0 0 I 2.0 0.0 0:21.45 rcu_preempt
593 root 20 0 0 0 0 S 1.3 0.0 0:15.33 jbd2/dm-1-8
892 syslog 20 0 404412 7092 3536 S 1.3 0.0 0:09.30 rsyslogd
9 root 20 0 0 0 0 I 0.7 0.0 0:05.05 rcu_sched
1030 root 20 0 9512 136 12 S 0.7 0.0 0:07.46 rngd
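As a side note, you can total the java share of a snapshot like the one above with a short awk pipeline. A sketch, using a few of the lines above as a captured sample (in live use, feed it `top -bn1` instead):

```shell
# Sum the %CPU column (field 9) for all java processes in a top snapshot.
# A captured sample stands in here for live `top -bn1` output.
sample='6254 uproton  20   0 7402400 1.368g 18020 S 49.5  7.0 11:35.48 java
5401 corfu    10 -10 8576900 725512 20600 S 44.9  3.5  7:03.95 java
4535 nsx-cbm  10 -10 6163244 422036 22068 S 43.6  2.1  4:40.20 java'
total=$(echo "$sample" | awk '$NF=="java" {s+=$9} END {print s}')
echo "java total: ${total}% CPU"
```

Against the full snapshot above, the java processes alone add up to well over two cores' worth of CPU.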
So even with 32 GB RAM and 8 vCPUs, which is more than the "Medium" size, I am still seeing huge CPU usage.
Looking at the packet count and the picture in the last post, you can see it has used 600 MB of data in only ~20 minutes of uptime. And I have no VMs or networks connected to these...
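Those two observations are at least consistent with each other. A back-of-envelope check (treating 600 MB as 600 MiB) shows the traffic volume matches the observed packet rate at a plausible average packet size:

```shell
# Rough cross-check: 600 MB over ~20 minutes at ~1200 packets/sec
# implies an average packet size of a few hundred bytes.
bytes=$((600 * 1024 * 1024))   # ~600 MB of data observed
secs=$((20 * 60))              # ~20 minutes of uptime
pps=1200                       # packets/sec seen in tcpdump
avg=$(( bytes / (secs * pps) ))
echo "avg bytes/packet: $avg"
```

Roughly 440 bytes per packet, which is believable for steady cluster-sync chatter rather than bulk transfer.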
So you're running 2 managers (not 3) and you're doing this in a nested vSphere setup?
daphnissov, click the picture; it's three managers.
Yes, it's a nested environment, with hardware-assisted virtualization exposed to the guest CPUs. Isn't that what VMware themselves use for their labs in their online classrooms?
I know it's not the most powerful CPU. But still, should an idling NSX-T Manager really eat that much CPU?
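For what it's worth, you can confirm from inside a Linux guest running on the nested ESXi host whether hardware-assisted virtualization is actually exposed, by looking for the Intel VT-x (vmx) or AMD-V (svm) CPU flags:

```shell
# Check whether VT-x/AMD-V is exposed to this guest via the CPU feature flags.
# Run inside a Linux VM on the nested host; ESXi itself has no /proc/cpuinfo.
if grep -qE 'vmx|svm' /proc/cpuinfo 2>/dev/null; then
  hwvirt=yes
else
  hwvirt=no
fi
echo "hardware virtualization exposed: $hwvirt"
```

If this reports no, the nested managers fall back to software-assisted virtualization, which would explain a lot of extra CPU burn.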
Manufacturer HP
Model ProLiant DL380p Gen8
CPU
Logical processors 32
Processor type Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
Sockets 2
Cores per socket 8
Hyperthreading Enabled
In nested environments I have always noticed this high CPU usage. I run on an E5-2640 v2 and see the same behavior.
I have not observed the same in non-nested environments with newer CPUs in customer implementations.
You're right. I never got it working properly nested on those E5-2650 CPUs. So I deployed another physical ESXi host for the management cluster, and now I'm all fine.
We saw the same issue in a nested NSX-T lab I set up for a friend. We solved it by removing the CPU reservations on the NSX Managers. That fixed our problem, and CPU usage became similar to a non-nested NSX Manager.
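If anyone wants to script that change, something like the following govc sketch should work. The VM names are placeholders, and I believe `-cpu.reservation` is the right `govc vm.change` flag, but verify with `govc vm.change -h` for your version before running it for real:

```shell
# Dry run: print the govc commands that would clear the CPU reservation
# on each NSX Manager VM. VM names are hypothetical placeholders.
cmds=$(for vm in nsx-mgr-01 nsx-mgr-02 nsx-mgr-03; do
  echo "govc vm.change -vm $vm -cpu.reservation 0"
done)
echo "$cmds"
```

To apply, set GOVC_URL/GOVC_USERNAME/GOVC_PASSWORD for your vCenter and run the printed commands (or pipe them to `sh`). The same change can of course be made per VM in the vSphere Client under Edit Settings > CPU > Reservation.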