bobbyccie
Contributor

Nested NSX-V 6.4.6 lab - controller VMs failing to boot (Photon OS boot screen followed by black screen)

Hi all,

I am in the process of building an NSX-V home lab (based on ESXi 6.7U3 + vCenter 6.7U3a + NSX 6.4.6). Everything is running on a single bare metal server (Intel i7, 32GB RAM) with EVE-NG as the underlying hypervisor (as per: https://www.eve-ng.net/index.php/documentation/howtos/ - see VMware ESXi/vCenter/NSX links for details).

Everything works fine regarding the nested ESXi hosts, vCenter, and initial NSX manager deployment. I have VMs running fine on the nested ESXi hosts with Distributed vSwitches etc.

But I hit problems when deploying the NSX controller VMs. It basically just sits at “Deploying” and eventually the controller VM gets deleted.

I think I have enough resources allocated to the nested ESXi host where the NSX controller is getting deployed to (4 vCPU, 8 GB RAM). I briefly see the Photon OS boot screen on the console of the controller VM but then it changes to black with no response on the keyboard.

Any hints on why a controller VM would fail to boot? Screenshots attached.

Thanks,

Bob

Stays on "Deploying":

Screenshot 2019-11-29 at 20.48.51.png

Controller VM gets deployed but fails to fully boot. After the Photon OS boot screen all I see is a black screen with no response on the keyboard. The VM gets deleted after ~10 minutes:

Screenshot 2019-11-29 at 20.48.12.png

3 Replies
Sreec
VMware Employee

Where are you deploying the NSX controllers? I'm hoping they are on the nested ESXi hosts and that NSX Manager is on the bare metal server? One of the key reasons controllers get deleted immediately after deployment is a network connectivity issue. As this is a nested lab, can you confirm that connectivity is in place? Is MAC learning working (promiscuous mode)? Please follow the steps in "NSX Controller Deployment Issues".

Cheers,
Sree | CKA|CKAD|VCIX-3X| VCAP-4X| VExpert 5x
bobbyccie
Contributor

Thanks for the reply but I am still facing issues with the NSX controller VM starting up properly.

The NSX manager and vCenter are running directly on EVE-NG (not inside a nested ESXi host).

Other VMs run fine on the nested ESXi hosts.

I have enabled promiscuous mode on the vSS used for management connectivity but it didn’t help.

It seems to be an issue with the NSX controller VM booting up properly. It gets deployed and started OK but doesn’t progress past the Photon boot screen (console screen stays black) and eventually the VM gets deleted.

As I mentioned, my lab is running on EVE-NG, which uses KVM under the hood. Other nested VMs work fine. I guess most people run nested labs using ESXi on ESXi, but I would like to stick with EVE-NG.

I have checked the deployment issues link but didn’t see anything that could be causing it.

Any other hints?

bobbyccie
Contributor

After some more testing I think my issue is affecting ANY 64-bit VMs on the nested ESXi hosts (32-bit VMs work OK but as the NSX controller VMs are 64-bit I think that's my problem).

Apparently I need to set 'options kvm ignore_msrs=1' (a KVM kernel module option on the EVE-NG host), but for some reason the setting does not survive a reboot.
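
For anyone wanting to check the same thing on their own KVM host, here is a rough Python sketch that compares the value the running kvm module reports with what the modprobe config files ask for. The paths /sys/module/kvm/parameters/ignore_msrs and /etc/modprobe.d are the usual Linux locations but are assumptions about the EVE-NG host, and the function names are made up for illustration. It only reports the values; it does not change anything.

```python
from pathlib import Path


def parse_ignore_msrs(conf_text):
    """Return the last ignore_msrs value set for the kvm module in the
    given modprobe config text, or None if it is not set there."""
    value = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comment lines are ignored
        parts = line.split()
        # Expected form: options kvm ignore_msrs=1 [other_option=value ...]
        if len(parts) >= 3 and parts[0] == "options" and parts[1] == "kvm":
            for opt in parts[2:]:
                key, _, val = opt.partition("=")
                if key == "ignore_msrs":
                    value = val
    return value


def configured_ignore_msrs(conf_dir="/etc/modprobe.d"):
    """Map each *.conf file that sets ignore_msrs for kvm to its value.
    The directory is an assumed default; adjust it for your host."""
    found = {}
    directory = Path(conf_dir)
    if not directory.is_dir():
        return found
    for conf in sorted(directory.glob("*.conf")):
        val = parse_ignore_msrs(conf.read_text(errors="replace"))
        if val is not None:
            found[str(conf)] = val
    return found


def runtime_ignore_msrs(param_path="/sys/module/kvm/parameters/ignore_msrs"):
    """Return what the loaded kvm module reports (typically Y or N),
    or None if the parameter file is not present."""
    path = Path(param_path)
    return path.read_text().strip() if path.exists() else None


if __name__ == "__main__":
    print("running kvm module reports:", runtime_ignore_msrs())
    configured = configured_ignore_msrs()
    if not configured:
        print("no 'options kvm ignore_msrs=...' line found in modprobe config")
    for conf_file, val in configured.items():
        print(f"{conf_file}: ignore_msrs={val}")
```

Running it once right after setting the option and again after a reboot should show whether the config line is still on disk and whether the module actually picked it up.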

I have posted a question here: https://communities.vmware.com/message/2902276#2902276 for help with that.

Thanks all.
