VMware Networking Community
James__M
Contributor

NSX-v 6.1.2 > controller issues > out of disk space

After shutting down the entire infrastructure for hardware maintenance, it looks like the only things that won't come back online are the NSX controllers. They power on, but after troubleshooting and some CLI digging, all of my controllers show as "out of disk space". I've tried the clear commands (excluding the commands that would wipe configuration), but the output shows the clear commands can't delete non-essential files because there isn't enough room left. I issued "show status" and I get 0% available disk space. Thoughts?
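For reference, this is the check I'm describing, run straight from the controller console (output omitted; "show status" is the command whose disk-space line reads 0% on all three controllers):

    # on the NSX controller CLI: reports version, uptime, disk usage and mounted partitions
    show status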



3 Replies
James__M
Contributor

By the way, I've already tried rebooting the controllers from the vSphere Client/Web Client, restarting the controllers, restarting the server from the CLI, and the resync option under the [Installation] section of Networking & Security in vSphere.

grosas
Community Manager

Deleting and re-adding a controller (although not time friendly) should get you back online. Once you have a healthy controller, you could pull a "good" and a "bad" tech support log for comparison, to help identify what is filling up the disk.
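If it saves a step, the controller tech support bundle can also be pulled through the NSX Manager API rather than the UI. Something along these lines should work on 6.x; the controller ID, manager address and credentials below are placeholders and the endpoint is from memory, so check the API guide for your exact release:

    # download the tech support log bundle for one controller (e.g. controller-2)
    curl -k -u admin:'PASSWORD' -o controller-2-techsupport.tgz \
      "https://NSX-MANAGER-IP/api/2.0/vdn/controller/controller-2/techsupportlogs"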

_____________________________________
Gabe Rosas (VMware HCX team at VMware)
Blog: hcx.design
LinkedIn: /in/gaberosas
Twitter: gabe_rosas
James__M
Contributor
Accepted Solution

Update:

Luckily, one of my controllers was still in a good "joined" status. I eventually fixed the problem by deleting the other two controllers and redeploying them. I noticed that when I deployed a controller to one or two of the hosts, it kept reporting that there was no cluster to join. So I redeployed the two remaining controllers to a single ESXi host that wasn't giving that error. The problem is now fixed and all of my controllers show "connected".
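For anyone hitting the same thing, the join/connected state can also be checked from each controller's console. These are the standard NSX-v controller CLI commands as I remember them, so treat this as a sketch and confirm the syntax against the docs for your version:

    # overall cluster membership and whether this node has joined the majority
    show control-cluster status

    # the list of nodes this controller was told to join at startup
    show control-cluster startup-nodes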

Worth mentioning: on the original controllers that were showing as disconnected, I noticed during troubleshooting that an error message, [host SMBus controller not enabled], appeared during the boot-up process whenever I rebooted the controller virtual appliances. After issuing the [show status] command, I could see that the system partitions (mount points) were not present. After redeploying the controllers a second time, everything seemed to be working.

*****To clarify, I redeployed the controllers three times. My original problem was that, upon powering the NSX infrastructure back on, the controllers ran out of disk space for some reason and couldn't start up properly. After rebuilding the controllers, I ran into the issue with the host SMBus controller not loading, which led to the virtual disks not being loaded (hence the mount points not showing up, since the VM couldn't connect to them). After a second redeploy, I ran into the "no cluster to join" error, which was resolved by redeploying the controllers on a known-good host. Now that my controllers are all deployed and synchronized with each other, I'm able to vMotion them across all the hosts.

Message was edited by: James__M *added the clarification dialogue at the bottom of the post