Hello Everyone,
I am new to VMware NSX and am setting up an NSX lab for learning purposes. I have deployed the components NSX requires, including the Manager and Controller, and completed host preparation. Afterwards I checked the communication health status in the Host Preparation tab and found that the control plane agent (netcpa) is down.
I have followed all the troubleshooting steps in the VMware NSX KB article, but no luck. Can someone help identify the issue, or tell me whether I am missing something in the initial configuration?
Attaching screenshot for your reference.
Regards,
Hardik.
Hi Hardik,
The control plane agent on the ESXi hosts must be able to communicate with the control cluster on TCP port 1234. Can you ping the controller IPs from the ESXi SSH shell? Also, check that there is a valid TCP connection to the controller nodes with:
esxcli network ip connection list | grep 1234
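The port check above can be made a little stricter, since a plain grep for 1234 also matches unrelated ports such as 51234 or a world ID containing those digits. The following is only a sketch: the function name count_established is made up for illustration, and it assumes that the foreign address is the fifth column and the TCP state the sixth column of the connection list output.

```shell
#!/bin/sh
# Sketch only: count established TCP sessions to a given remote port.
# Assumption: each data row read from stdin looks like
#   proto recvq sendq local_addr:port foreign_addr:port STATE ...
# so the foreign address is field 5 and the state is field 6.
count_established() {
    awk -v port="$1" '
        $5 ~ (":" port "$") && $6 == "ESTABLISHED" { n++ }
        END { print n + 0 }
    '
}

# On an ESXi host (not executed here):
#   esxcli network ip connection list | count_established 1234
```

A result of 0 would suggest the host has no established session to any controller on that port, which matches the symptom of netcpa being reported as down.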
You can also try restarting netcpa with the /etc/init.d/netcpad restart command to see if that helps.
Regards,
Mike
Hello Mike,
Thanks for your prompt response.
I checked connectivity by pinging from the host to the controller and vice versa, and the pings succeed. However, when I ran the command it showed no output; the screenshot below is for your reference.
I also restarted the service manually, but the result was the same.
Just out of curiosity, do I need to connect NSX to the physical network just to get this service started?
Regards,
Hardik.
Hi Hardik,
It doesn't look like the ESXi host has even attempted to connect to the control cluster for some reason. Can you check to ensure your controller IPs are being received by the host via the management plane? You can check the IPs in the /etc/vmware/netcpa/config-by-vsm.xml file.
Regards,
Mike
Hello Mike,
Sorry for the late reply.
I checked on the host by running the command below, but I get a permission denied message even though I have root access on the host.
Just for your information, I have not connected any physical infra with NSX. Is it due to that?
Regards,
Hardik.
Just for your information, I have not connected any physical infra with NSX. Is it due to that?
What do you mean by "not connected any physical infra with NSX"?
All controller nodes must be able to reach the management vmkernel port on each ESXi host and also the NSX manager must be reachable via network.
For example:
nsx-controller-A on esxi-host-A must be able to communicate with vmkX of esxi-host-B and must also be reachable from the NSX manager on esxi-host-C.
Is that possible?
The easiest way to accomplish this would be one VLAN and one subnet shared by the ESXi hosts, the NSX controllers, and the NSX Manager:
esxi-host-A Management: VLAN 123 - 192.168.10.10
esxi-host-B Management: VLAN 123 - 192.168.10.11
esxi-host-C Management: VLAN 123 - 192.168.10.12
nsx-controller-A: Portgroup with VLAN 123 - 192.168.10.20
nsx-controller-B: Portgroup with VLAN 123 - 192.168.10.21
nsx-controller-C: Portgroup with VLAN 123 - 192.168.10.22
nsx-manager: Portgroup with VLAN 123 - 192.168.10.30
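As a quick sanity check on a plan like the one above, the addresses can be compared by their network prefix. This is only a sketch: the helper names prefix24 and all_same_subnet are made up for illustration, and it assumes a 255.255.255.0 mask, which the example plan implies but does not state.

```shell
#!/bin/sh
# Sketch only: confirm that every address shares the /24 of the first one.
# Assumption: a 255.255.255.0 netmask on the management subnet.
prefix24() {
    echo "${1%.*}"   # strip the last octet: 192.168.10.20 -> 192.168.10
}

all_same_subnet() {
    first=$(prefix24 "$1")
    for ip in "$@"; do
        [ "$(prefix24 "$ip")" = "$first" ] || return 1
    done
    return 0
}

# Addresses taken from the example plan above.
if all_same_subnet 192.168.10.10 192.168.10.11 192.168.10.12 \
                   192.168.10.20 192.168.10.21 192.168.10.22 192.168.10.30; then
    echo "all nodes share one /24"
else
    echo "nodes are spread over different /24 networks"
fi
```

If the hosts, controllers, and manager end up in different networks, routing between them has to work as well, so a check like this only rules out the simplest layout mistake.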
Hi Hardik - it looks like in that screenshot the shell is attempting to execute the XML file rather than output the text it contains. Can you run the following command instead:
cat /etc/vmware/netcpa/config-by-vsm.xml
Regards,
Mike
Hello Mike,
I tried running the above command on the host and got the error below. It seems that the controller has not initiated any communication with the host, but I can ping the controller and Manager IP addresses from the host.
Regards,
Hardik.
Hello sk84,
Thanks for your reply.
I have set up a lab in a nested environment. As far as connectivity is concerned, all the required NSX components can communicate with each other.
Manager ---- vCenter ICMP OK
Manager ---- Controller ICMP OK
Host ----- Manager ICMP OK
Host ----- Controller ICMP OK
There are no firewall filters in between.
By my earlier sentence I meant that I have not connected any uplink from the edge gateway.
Let me know if more information is required.
Regards,
Hardik.
Have you prepared the clusters or hosts for NSX?
(vSphere Client -> Networking & Security -> Installation & Upgrade -> Host preparation)
Is every status green there?
Hello,
Yes, everything looks green here. I checked twice with a force re-sync.
Okay. So, the VIBs should be installed correctly.
Going back to your previous post with the output of the config-by-vsm.xml file: there are several spelling mistakes in the path.
Can you please execute the following command again and share the output with us:
cat /etc/vmware/netcpa/config-by-vsm.xml
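Once the file prints correctly, the controller addresses in it can be pulled out and compared with the IP you expect. This is only a sketch: the function names list_servers and has_server are made up for illustration, and it assumes that each controller address appears on a single line inside a server element.

```shell
#!/bin/sh
# Sketch only: print the text of every <server> element read from stdin.
# Assumption: each address sits on one line as <server>IP</server>.
list_servers() {
    sed -n 's:.*<server>\([^<]*\)</server>.*:\1:p'
}

# Succeeds only if the expected controller IP ($1) is among those servers.
has_server() {
    list_servers | grep -xF "$1" >/dev/null
}

# On an ESXi host (not executed here), with your own controller IP:
#   cat /etc/vmware/netcpa/config-by-vsm.xml | list_servers
#   cat /etc/vmware/netcpa/config-by-vsm.xml | has_server <controller-ip> \
#       && echo "controller IP was pushed" || echo "controller IP is missing"
```

If the expected address is missing, the host is being told to contact a controller that does not exist, and the focus shifts to what the NSX Manager is pushing.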
Hello,
Apologies, I had not noticed the typo. Here is the output.
I noticed one discrepancy in the output: the server (controller) IP address is shown as 10.35.195.185, but the controller configuration shows 10.131.222.216. The latter is the correct IP address, not the one shown below.
I cross-checked by powering off the controller while running a continuous ping: when the controller is down I get no response, and when it is up I get replies from 10.131.222.216.
I also tried reaching 10.35.195.185, but it is not reachable under any condition.
It's a lab setup, hence I have deployed only a single controller.
Is there a way to manually change the IP address in the file below and test?
Controller configuration for your reference.
Let me know if more information is required.
Regards,
Hardik.
Okay. The "config-by-vsm.xml" file is pushed from the NSX manager to the hosts via vsfwd.
Can you please try to restart this process and netcpad again on the ESXi hosts?
/etc/init.d/netcpad stop
/etc/init.d/vShield-Stateful-Firewall stop
/etc/init.d/vShield-Stateful-Firewall start
/etc/init.d/netcpad start
And after the restarts, check whether these processes are really running:
ps | grep vsfwd
ps | grep netcpa
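On many systems a plain ps piped into grep can also list the grep command itself, which makes a stopped process look alive. The following sketch filters that out. The function name proc_listed is made up for illustration, and the check is a simple substring match on each line of the ps output.

```shell
#!/bin/sh
# Sketch only: succeed if a process name appears in `ps` output on stdin,
# ignoring any line that belongs to a grep command.
proc_listed() {
    awk -v name="$1" '
        $0 !~ /grep/ && index($0, name) { found = 1 }
        END { exit !found }
    '
}

# On an ESXi host (not executed here):
#   for p in vsfwd netcpa; do
#       ps | proc_listed "$p" && echo "$p running" || echo "$p NOT running"
#   done
```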
After that, please check the content of the "config-by-vsm.xml" file again to see if the controller ip is correct.
If that doesn't solve the problem, please parse through the following log files looking for errors:
/var/log/vsfwd.log
/var/log/netcpa.log
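To narrow those logs down, something like the following sketch can help. The function name recent_errors is made up for illustration, and it assumes that problems are logged with words such as "error" or "fail"; the real wording in these logs may differ.

```shell
#!/bin/sh
# Sketch only: print the last N lines on stdin that mention an error or
# a failure (case-insensitive). N defaults to 20.
# Assumption: relevant entries contain "error" or "fail".
recent_errors() {
    grep -i -E 'error|fail' | tail -n "${1:-20}"
}

# On an ESXi host (not executed here):
#   recent_errors 20 < /var/log/vsfwd.log
#   recent_errors 20 < /var/log/netcpa.log
```

Empty output only means that no line matched the pattern, so it is still worth reading the end of each log with tail.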
Hello,
As suggested, I restarted both processes and checked their status afterwards. vsfwd started without any problem, but netcpa is not running. I then checked the controller IP address and it is still the same.
I tried to view the netcpa logs by executing the above command but got a permission denied error.
cat /var/run/log/vmkwarning.log | grep NETCPA
With the above command I am able to see that the status of netcpa is failed.
Please find below screenshot for reference.
You can't execute a log file on the command line. If you want to see its content, you need one of the following commands:
cat /path/to/file
less /path/to/file
more /path/to/file
tail /path/to/file
But it seems that the netcpa.log is empty. Are there any errors in the vsfwd.log?
And did you try rebooting the ESXi host?
Hello,
Yes, I have rebooted the host many times. I am not seeing any errors for the vsfwd process; the screenshot below is for your reference.