hardikpithadia
Contributor

Control plane agent(netcpa) is down - NSX

Hello Everyone,

I am new to VMware NSX and am setting up an NSX lab for learning purposes. I have deployed the components NSX requires, including the Manager and a Controller, and completed host preparation. Afterwards I checked the communication health status in the Host Preparation tab and found that the control plane agent (netcpa) is down.

I have followed all the troubleshooting steps in the VMware NSX KB article, but no luck. Can someone help identify the issue, or tell me whether I am missing something in the initial configuration?

Attaching screenshot for your reference.

pastedImage_0.png

Regards,

Hardik.

16 Replies
mdac
Enthusiast

Hi Hardik,

The control plane agent on the ESXi hosts must be able to communicate with the control cluster on TCP port 1234. Can you ping the controller IPs from the ESXi SSH shell? Also, check to ensure there is a valid TCP connection to the controller nodes with esxcli network ip connection list |grep 1234
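For anyone following along, the connection check above can be narrowed to established controller sessions only. This is a rough sketch, not an official tool: the function name `controller_sessions` is made up for illustration, and it assumes the `esxcli network ip connection list` output has the foreign address in the fifth column and the state in the sixth.

```shell
# Hypothetical helper: reads connection-list text on stdin and prints only
# lines whose foreign address ends in :1234 and whose state is ESTABLISHED.
# Assumed column layout: proto, recv-q, send-q, local, foreign, state, ...
controller_sessions() {
    awk '$5 ~ /:1234$/ && $6 == "ESTABLISHED"'
}

# Assumed usage on an ESXi host:
#   esxcli network ip connection list | controller_sessions
```

If this prints nothing, the host has no established session to any controller on port 1234, which matches the symptom of netcpa being reported as down.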

You can also try to restart netcpa using the /etc/init.d/netcpad restart command to see if that helps.

Regards,

Mike

My blog: https://vswitchzero.com Follow me on Twitter: @vswitchzero
hardikpithadia
Contributor

Hello Mike,

Thanks for your prompt response.

I checked connectivity by pinging from the host to the controller and vice versa, and the pings succeed. However, when I ran the command it showed no output; see the screenshot below.

pastedImage_0.png

I have also restarted the service manually, but the result was the same.

Just out of curiosity, do I need to connect NSX to the physical network just to get this service started?

Regards,

Hardik.

mdac
Enthusiast

Hi Hardik,

It doesn't look like the ESXi host has even attempted to connect to the control cluster for some reason. Can you check to ensure your controller IPs are being received by the host via the management plane? You can check the IPs in the /etc/vmware/netcpa/config-by-vsm.xml file.

Regards,

Mike

hardikpithadia
Contributor

Hello Mike,

Sorry for the late reply.

I ran the command below on the host but got a "permission denied" message, even though I have root access on the host.

pastedImage_0.png

Just for your information, I have not connected any physical infra with NSX. Is this due to that?

Regards,

Hardik.

sk84
Expert

"Just for your information, I have not connected any physical infra with NSX. Is this due to that?"

What do you mean by "not connected any physical infra with NSX"?

All controller nodes must be able to reach the management vmkernel port on each ESXi host and also the NSX manager must be reachable via network.

For example:

nsx-controller-A on esxi-host-A must be able to communicate with vmkX of esxi-host-B and must also be reachable from the NSX manager on esxi-host-C.

Is that possible?

The easiest way to accomplish this would be to use one VLAN and one subnet shared by the ESXi hosts, the NSX controllers and the NSX Manager.

esxi-host-A Management: VLAN 123 - 192.168.10.10

esxi-host-B Management: VLAN 123 - 192.168.10.11

esxi-host-C Management: VLAN 123 - 192.168.10.12

nsx-controller-A: Portgroup with VLAN 123 - 192.168.10.20

nsx-controller-B: Portgroup with VLAN 123 - 192.168.10.21

nsx-controller-C: Portgroup with VLAN 123 - 192.168.10.22

nsx-manager: Portgroup with VLAN 123 - 192.168.10.30

--- Regards, Sebastian VCP6.5-DCV // VCP7-CMA // vSAN 2017 Specialist Please mark this answer as 'helpful' or 'correct' if you think your question has been answered correctly.
mdac
Enthusiast

Hi Hardik - it looks like in that screenshot the shell is attempting to execute the XML file rather than output the text contained within. Can you run the following command instead?

cat /etc/vmware/netcpa/config-by-vsm.xml
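To pull just the controller addresses out of that file, something like the following sketch may help. It assumes each controller entry appears on its own line in the form `<server>IP</server>`, which fits the "server" field discussed in this thread but is not verified against every NSX version; the function name `controller_ips` is hypothetical.

```shell
# Hypothetical helper: print the value of every <server> element in the
# file passed as the first argument (assumes one element per line).
controller_ips() {
    sed -n 's:.*<server>\([^<]*\)</server>.*:\1:p' "$1"
}

# Assumed usage on the host:
#   controller_ips /etc/vmware/netcpa/config-by-vsm.xml
```

The printed addresses can then be compared with the controller IPs shown in the NSX Manager UI.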

Regards,

Mike

hardikpithadia
Contributor

Hello Mike,

I tried the above command on the host and got the error below. It seems the controller has not initiated any communication with the host, although I can ping the controller and Manager IP addresses from the host.

pastedImage_0.png

Regards,

Hardik.

hardikpithadia
Contributor

Hello sk84,

Thanks for your reply.

I have set up a lab in a nested environment. As far as connectivity is concerned, all the required NSX components can communicate with each other:

Manager ---- Vcenter ICMP OK

Manager ---- Controller ICMP OK

Host ----- Manager ICMP OK

Host ----- Controller ICMP OK

There are no firewall filters in between.

pastedImage_0.png

By "not connected any physical infra" I meant that I have not connected any uplink from the edge gateway.

Let me know if more information is required.

Regards,
Hardik.

sk84
Expert

Have you prepared the clusters or hosts for NSX?

(vSphere Client -> Networking & Security -> Installation & Upgrade -> Host preparation)

Is every status green there?

Regards, Sebastian
hardikpithadia
Contributor

Hello,

Yes, everything looks green here. I checked twice by forcing a re-sync.

pastedImage_0.png

sk84
Expert

Okay. So, the VIBs should be installed correctly.

Going back to your previous post with the output of the config-by-vsm.xml file: there are several spelling mistakes in the path you typed.

Can you please execute the following command again and share the output with us:

cat /etc/vmware/netcpa/config-by-vsm.xml

Regards, Sebastian
hardikpithadia
Contributor

Hello,

Apologies, I had not noticed the typo. Here is the output.

I noticed one discrepancy in the output: the server (controller) IP address is shown as 10.35.195.185, but the controller configuration shows 10.131.222.216, and that is the correct address, not the one in the file.

I cross-checked by powering off the controller while running a continuous ping: when the controller is down I get no response from 10.131.222.216, and when it is up I get replies from it.

I also tried reaching 10.35.195.185, but it is not reachable under any condition.

It's a lab setup, hence I have deployed only a single controller.

Is there a way to manually change the IP address in the file below and test?

pastedImage_2.png

Controller configuration for your reference.

pastedImage_3.png

pastedImage_4.png

Let me know if more information is required.

Regards,

Hardik.

sk84
Expert

Okay. The "config-by-vsm.xml" file is pushed from the NSX manager to the hosts via vsfwd.

Can you please try to restart this process and netcpad again on the ESXi hosts?

/etc/init.d/netcpad stop

/etc/init.d/vShield-Stateful-Firewall stop

/etc/init.d/vShield-Stateful-Firewall start

/etc/init.d/netcpad start

And after the restarts, check whether these processes are really running:

ps | grep vsfwd

ps | grep netcpa

After that, please check the content of the "config-by-vsm.xml" file again to see if the controller ip is correct.

If that doesn't solve the problem, please parse through the following log files looking for errors:

/var/log/vsfwd.log

/var/log/netcpa.log
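A quick way to skim both logs for problems is sketched below. The helper name `scan_logs` and the search terms ("error", "fail") are illustrative guesses rather than known NSX log keywords, so adjust them to whatever the logs actually contain.

```shell
# Hypothetical helper: for each log file given as an argument, print up to
# the last 20 lines that mention "error" or "fail" (case-insensitive).
scan_logs() {
    for f in "$@"; do
        if [ -r "$f" ]; then
            echo "== $f"
            grep -i -E 'error|fail' "$f" | tail -n 20
        else
            echo "== $f (missing or unreadable)"
        fi
    done
}

# Assumed usage on the host:
#   scan_logs /var/log/vsfwd.log /var/log/netcpa.log
```

A file header with no lines under it means nothing matched, which is also what an empty log would produce.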

Regards, Sebastian
hardikpithadia
Contributor

Hello,

As suggested, I restarted both processes and checked their status afterwards. vsfwd started without any problem, but netcpa is not running. I then checked the controller IP address, and it is still the same.

I tried to view the netcpa log using the path above, but got a "permission denied" error.

cat /var/run/log/vmkwarning.log | grep NETCPA

With the above command I can see that the status of netcpa is failed.

Please find below screenshot for reference.

pastedImage_2.png

sk84
Expert

You can't display a file by typing its path on the command line; the shell then tries to execute it. If you want to see the content, you need one of the following commands:

cat /path/to/file

less /path/to/file

more /path/to/file

tail /path/to/file

But it seems that the netcpa.log is empty. Are there any errors in the vsfwd.log?

And did you try rebooting the ESXi host?
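To illustrate the point about executing versus reading a file, here is a small self-contained demo that can be run on any POSIX shell. It uses a throwaway temp file rather than the real log, and the variable names are arbitrary. Typing a path alone makes the shell try to run the file, and a file without the execute bit is refused with "Permission denied" (exit status 126), even for root.

```shell
# Create a throwaway file without the execute bit, like a log or XML file.
demo=$(mktemp)
echo 'sample log line' > "$demo"
chmod 644 "$demo"

# Typing the path alone asks the shell to run the file as a program.
"$demo" 2>/dev/null
run_status=$?

# Reading it with cat only needs read permission, so this works.
content=$(cat "$demo")

echo "exit status when executed: $run_status"
echo "content via cat: $content"
rm -f "$demo"
```

That is why the earlier attempts returned "permission denied" despite root access: the file was never meant to be run, only read.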

Regards, Sebastian
hardikpithadia
Contributor

Hello,

Yes, I have rebooted the host many times. I am not seeing any errors for the vsfwd process; see the screenshot below.

pastedImage_0.png
