VMware Networking Community
avarcher
Commander

NSX-T 3.2 on vSphere 7u3c Hugepage issues on Edge VMs.

I'm building (not upgrading) a nested lab environment as follows:

NSX-T Datacenter 3.2
vSphere 7 update 3c

However all Edge VM deployments fail:
Host configuration: Failed to send the HostConfig message. [TN=TransportNode/29bb8107-35ce-441c-aa4c-e60a25784b40]. Reason: Mac address for a vnic fp-eth0 is not found on edge node /infra/sites/default/enforcement-points/default/edge-transport-node/29bb8107-35ce-441c-aa4c-e60a25784b40.

And the Edge reports:
ERROR: NSX Edge configuration has failed. 1G hugepage support required

The VM does not register with the uplink port group (The one showing Link Up is a VM I moved there to check out the port group.)

avarcher_0-1645118655474.png

The issue applies to both VLAN and Overlay, so it appears to be related to FastPath/DPDK/HugePages.

The hosting environment uses Intel(R) Xeon(R) Gold 5120 CPU @ 2.20GHz with EVC set to Intel "Sandy Bridge" Generation. This environment has successfully run NSX 3.0 on vSphere 7 GA; Edge deployments with those versions succeed.

Any help would be appreciated, thanks.

21 Replies
salarmehdizadeh
Contributor

I have the same issue when upgrading NSX from version 3.1.3.1 to 3.2.2, at the Edge node upgrade stage.

I set the Edge VM advanced parameter featMask.vm.cpuid.pdpe1gb to Val:1, based on the VMware KB, but because EVC is enabled on the cluster it does not take effect for me.
The suggestion was to disable EVC, but in a production environment, will disabling EVC have a negative impact?

mackov83
Enthusiast

Like you, I have a lab environment and had the same or similar issues during the upgrade to 3.2. I also faced it when deploying new Edge transport nodes.

Give this a try:

  1. Ensure the VM is powered off
  2. Edit Settings
  3. VM Options
  4. Advanced
  5. Edit Configuration
  6. Add configuration params
  7. Add the following
    • In name column: featMask.vm.cpuid.pdpe1gb
    • In value column: Val:1

This worked for me. After the OVF was deployed and failed to boot, I had to shut the VM down, add this parameter, and power it on again.
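The vSphere Client steps above amount to adding one setting to the VM's configuration. As a minimal sketch, assuming the setting ends up as a `key = "value"` line in the VM's .vmx file and that the VM is powered off, the same edit could be expressed in Python as below. The helper name `set_vmx_option` is hypothetical and not part of any VMware tooling; it only transforms text and does not talk to vCenter or ESXi.

```python
import re


def set_vmx_option(vmx_text: str, key: str, value: str) -> str:
    """Return vmx_text with `key = "value"` set exactly once.

    Assumption: .vmx settings are `key = "value"` lines. Keys are matched
    case-insensitively here so a differently-cased duplicate is not left behind.
    """
    new_line = f'{key} = "{value}"'
    pattern = re.compile(rf'^\s*{re.escape(key)}\s*=', re.IGNORECASE)
    out = []
    replaced = False
    for line in vmx_text.splitlines():
        if pattern.match(line):
            if not replaced:
                out.append(new_line)  # replace the first existing entry
                replaced = True
            # any further entries for the same key are dropped
        else:
            out.append(line)
    if not replaced:
        out.append(new_line)  # key was absent, so append it
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = 'displayName = "edge01"\nnumvcpus = "4"\n'
    print(set_vmx_option(sample, "featMask.vm.cpuid.pdpe1gb", "Val:1"))
```

Running the function twice gives the same result as running it once, so it is safe to apply to a file that already contains the parameter.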

Dr_Virt
Hot Shot

This has taken down one of our large hosting environments.

davidr78
Enthusiast

Hi mackov83, I had the same problem in my lab. This is actually a fresh 3.2 install, deploying new Edge nodes.

The whole NSX-T lab is nested; the underlying hosts are Xeon E5-2670 (Sandy Bridge).

When I add the advanced config parameter and try to start the VM, I get the following error:

Feature '1 GB pages (PDPE1GB)' was absent, but must be present. Failed to start the virtual machine. Module FeatureCompatLate power on failed.

Reckon my CPUs just won't support it?
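One way to narrow this down is to look at what a guest actually sees. The error names the PDPE1GB CPU feature, which Linux reports as the `pdpe1gb` flag in /proc/cpuinfo. The sketch below is only a diagnostic aid under a stated assumption: that you can run Python inside a Linux guest placed on the same host or cluster, with the same EVC and featMask settings as the Edge VM. The function name `has_cpu_flag` is made up for this example.

```python
def has_cpu_flag(cpuinfo_text: str, flag: str = "pdpe1gb") -> bool:
    """True if every 'flags' line in /proc/cpuinfo-style text lists the flag."""
    flag_lines = [
        line.split(":", 1)[1].split()
        for line in cpuinfo_text.splitlines()
        if line.startswith("flags") and ":" in line
    ]
    # No flags lines at all means we cannot confirm support.
    return bool(flag_lines) and all(flag in flags for flags in flag_lines)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print("pdpe1gb visible to this guest:", has_cpu_flag(f.read()))
    except OSError:
        print("Could not read /proc/cpuinfo; run this inside a Linux guest.")
```

If the flag is missing in the guest, the feature is not being exposed at that layer, whether or not the physical CPU has it.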

mackov83
Enthusiast

@davidr78 - My CPUs are also Sandy Bridge (E5-2690). From what I can tell, the only difference is a higher clock speed.

However, if your ESXi is nested that may be contributing to the issue. In your case you would have:

ESXi host (physical) > ESXi host (virtual) > VM (Edge Transport Node)

In my case:

ESXi host (physical) > VM (Edge Transport Node)

FYI - I also had to disable EVC mode at the cluster level as it would not accept the setting otherwise. I saw this recommendation on other posts.

bape1892
Contributor

Hi there - I'm having the same issue as you when trying to power on the NSX Edge. Did you manage to resolve it?

@davidr78 

tckoon
Enthusiast

My nested ESXi lab:

NSX-T Datacenter 3.2
vSphere 7 update 3d

I faced the same issue: deployment of the Edge transport nodes failed.
I spent 3 days trying every solution found here and on Google, but none of them worked.


The good news is that I managed to bring up both of my Edge transport nodes.
I followed mackov83's suggestion, but the first try did not work.
Then I realised we are setting this up in a nested environment.

We need to add featMask.vm.cpuid.pdpe1gb with the value Val:1 at both VM levels (the Edge VM and the nested ESXi host VM).

An additional step is to enable "Expose hardware assisted virtualization to the guest OS" on the Edge VM and the nested ESXi host VM.
I did that and it worked!

Here the steps given by mackov83:

1) Ensure the VM is powered off
2) Edit Settings
3) VM Options
4) Advanced
5) Edit Configuration
6) Add configuration params
7) Add the following
8) In name column: featMask.vm.cpuid.pdpe1gb
9) In value column: Val:1

A) Open the Edge node VM auto-deployed by NSX-T Manager and perform the above steps.
B) Perform the above steps on the nested ESXi host VM (where the Edge node VM resides).
C) Lastly, make sure "Expose hardware assisted virtualization to the guest OS" is checked under the Edge node VM's CPU settings.
D) Reboot the Edge node VM and the nested ESXi host VM.
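Since the fix depends on the same settings being present at two levels, a quick check of both VM configurations can catch the layer that was missed. The following is a rough sketch, assuming you have the text of each VM's .vmx file at hand. Note that `vhv.enable` is an assumption here: it is the .vmx key commonly associated with the "Expose hardware assisted virtualization to the guest OS" checkbox, while the thread itself only names the checkbox. The function names are hypothetical.

```python
import re

# Settings taken from the steps above. "vhv.enable" is an assumed key name
# for the hardware-assisted-virtualization checkbox (not stated in the thread).
REQUIRED = {
    "featMask.vm.cpuid.pdpe1gb": "Val:1",
    "vhv.enable": "TRUE",
}


def parse_vmx(vmx_text: str) -> dict:
    """Parse `key = "value"` lines into a dict with lower-cased keys."""
    options = {}
    for line in vmx_text.splitlines():
        match = re.match(r'^\s*([^#=\s]+)\s*=\s*"?(.*?)"?\s*$', line)
        if match:
            options[match.group(1).lower()] = match.group(2)
    return options


def missing_settings(vmx_text: str, required: dict = REQUIRED) -> list:
    """Return the required keys that are absent or set to a different value."""
    options = parse_vmx(vmx_text)
    return [
        key
        for key, want in required.items()
        if options.get(key.lower(), "").lower() != want.lower()
    ]


if __name__ == "__main__":
    edge_vmx = 'featMask.vm.cpuid.pdpe1gb = "Val:1"\nvhv.enable = "TRUE"\n'
    nested_esxi_vmx = 'vhv.enable = "TRUE"\n'
    print("Edge VM missing:", missing_settings(edge_vmx))
    print("Nested ESXi VM missing:", missing_settings(nested_esxi_vmx))
```

Checking the Edge VM and the nested ESXi host VM separately mirrors steps A and B: the parameter has to be present on both before the Edge can see 1 GB page support.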

Important note: I found that the Edge node form factor "small" did not work; when I deployed with the medium form factor, it worked.

Hope it helps you.

thanks

 

xabiermatos
Contributor

I had this same problem in my homelab, with nested ESXi 7.0U3 and NSX-T 3.2.

This solved it. In my case, enabling "Expose hardware assisted virtualization to the guest OS" on the Edge node VM was not necessary.

tsommer
VMware Employee

I was able to get around the issue when the EDGE starts up by adding these advanced settings on the VM

monitor_control.enable_fullcpuid = TRUE 
featMask.vm.cpuid.pdpe1gb = "Val:1" 

That lets the EDGE start up and it's fine.  However, if I try to edit that same EDGE later to change something in 3.2 NSX Manager I get this.

Has anyone got around this?  

 

xabiermatos
Contributor

I had the same issue in a nested environment. I also had to add the parameter featMask.vm.cpuid.pdpe1gb = "Val:1" on the ESXi node.

jaschluc73
Contributor

I was looking for where to add the same value on the ESXi node. Can you please share where you set featMask.vm.cpuid.pdpe1gb = "Val:1" on the ESXi node? Thanks!

mackov83
Enthusiast

@jaschluc73 - On a physical ESXi node or a nested ESXi VM?

xabiermatos
Contributor

These are my settings.

  • EVC mode needs to be disabled
  • And the following Advanced Feature needs to be added to the VM:
    • featMask.vm.cpuid.pdpe1gb: Val:1

If the above doesn't work, add the same advanced parameter to the ESXi VM as well.

All my ESXi 7.x hosts are nested.

drogozinskiy
Contributor

tckoon

Does it work with VMware EVC enabled? You didn't specify.

DanialZafar1
Contributor

I cannot deploy an NSX Edge node on my nested ESXi hosts.
It says the VM is not compatible with any host.
I have done all the necessary steps explained above.

Hardware: Dell PowerEdge R720 server with Intel Xeon E5-2600 series (Ivy Bridge) processors
NSX-T version 3.2.1
Nested ESXi host 7.0.3
vCenter Server 7

Can you help me out?

 

Danial Zafar

drogozinskiy
Contributor

See "Install an NSX Edge on ESXi Using the vSphere GUI" and the steps from tckoon.

So you can only install the Edge from the vSphere GUI with the OVA/OVF file. You can't install the Edge from the NSX-T Manager interface because it doesn't set the "Expose hardware assisted virtualization to the guest OS" flag.

 
salarmehdizadeh
Contributor

Hello,

Based on VMware support advice, we migrated the Edge VMs to another cluster where EVC was disabled. That resolved the issue, and we could proceed with the upgrade.

I hope that it works for you.

vgrdy
Contributor

I have the same issue on my side

I deployed a new edge node as medium form factor and the VM is not able to start.

I added the featMask.vm.cpuid.pdpe1gb=1 parameter and after that the VM can boot

-> same on the nested ESX
-> EVC is enabled on the cluster

Unfortunately, it seems that NSX Manager does not detect that the Edge node VM is up and running, and it does not continue the creation process. Is there a way to force it to continue, now that the VM is running?

Thanks

TryllZ
Expert

Hi All,

I'm facing a similar issue, the CPU is Xeon E5-2650 v2.

ESXi - 7.0.3, 21424296

vCenter - 7.0.3, 21477706

NSX - 4.0.0.1.0.20159694

NSX Edge - 4.0.0.1.0.20159697

When the Edge is installed on ESXi running on bare metal, it installs and boots fine. In the nested environment it does not boot and shows the error below. EVC is disabled on the cluster.

TryllZ_0-1690119337232.png

Have added all the 3 settings below to the nested ESXi VMs as well.

featMask.vm.cpuid.PDPE1GB = Val:1
sched.mem.lpage.enable1GPage = "TRUE"
monitor_control.enable_fullcpuid = "TRUE"

Have attempted all the settings mentioned here, on both Edge VM and Nested ESXi VM without success.

Any thoughts ?
