Bug: NMI incorrectly blocked in guest if host blocks NMI and injects interrupt to guest
Hi,
I have recently been experimenting with how VMware handles NMIs in nested virtualization, and I found that VMware's behavior is incorrect in a corner case.
The version of VMware and my host OS are:
- Product: VMware(R) Workstation 17 Pro
- Version: 17.0.1 build-21139696
- Host OS: Debian 11, kernel version 5.10.0-21-amd64
- Host CPU: Intel(R) Core(TM) i5-7600 CPU @ 3.50GHz
To reproduce this bug:
- Create a new virtual machine, choose "Other" as guest OS
- Open "Edit Virtual Machine Settings"
- Remove the default disk and add the attached VMDK file (d.vmdk) instead
- Change number of processors from 1 to 2
- Enable "Virtualize Intel VT-x/EPT or AMD-V/RVI"
- Add a serial port to the virtual machine and make sure you can read its output (e.g. use an output file)
- Keep other configurations as default (for me, it is 256 MB memory, ...)
- Start the virtual machine
- Observe the serial port output
The source code of the VMDK file can be found at: https://github.com/lxylxy123456/uberxmhf/blob/9eb50d71910b2586c11f92462f37096a7066502b/xmhf/src/xmhf...
Actual output (on VMware):
...
Detecting environment
End detecting environment
Experiment: 17
Enter host, exp=17, state=0
hlt_wait() begin, source = EXIT_NMI_H (5)
Inject NMI
Interrupt recorded: EXIT_NMI_H (5)
At instruction: 0x00000000
VM-exit reason: 0x00000012
hlt_wait() end
hlt_wait() begin, source = EXIT_TIMER_H (6)
Inject interrupt
Interrupt recorded: EXIT_TIMER_H (6)
hlt_wait() end
hlt_wait() begin, source = EXIT_TIMER_H (6)
Inject NMI
Inject interrupt
Interrupt recorded: EXIT_TIMER_H (6)
hlt_wait() end
Leave host
Interrupt recorded: EXIT_NMI_H (5)
At instruction: 0x00000000
VM-exit reason: 0x0000000a
CPU(0x02): key press: 65, guest=1
source: EXIT_VMEXIT (7)
exit_source: EXIT_NMI_H (5)
rip: 0x082013df
exit_rip: 0x08208488
TEST_ASSERT '0 && (exit_source == source)' failed, line 372, file lhv-guest.c
Expected output (reproducible on real Intel hardware):
...
Detecting environment
End detecting environment
Experiment: 17
Enter host, exp=17, state=0
hlt_wait() begin, source = EXIT_NMI_H (5)
Inject NMI
Interrupt recorded: EXIT_NMI_H (5)
At instruction: 0x00000000
VM-exit reason: 0x00000012
hlt_wait() end
hlt_wait() begin, source = EXIT_TIMER_H (6)
Inject interrupt
Interrupt recorded: EXIT_TIMER_H (6)
hlt_wait() end
hlt_wait() begin, source = EXIT_TIMER_H (6)
Inject NMI
Inject interrupt
Interrupt recorded: EXIT_TIMER_H (6)
hlt_wait() end
Leave host
Interrupt recorded: EXIT_VMEXIT (7)
CPU(0x01): key press: 250, guest=1
Enter host, exp=17, state=1
iret_wait() begin, source = EXIT_MEASURE (1)
iret_wait() end
Leave host
Experiment: 1
... (endless)
Explanation:
The VMDK file (d.vmdk) contains a micro-hypervisor called LHV. Assume VMware runs in L0, LHV runs in L1, and the guest of LHV runs in L2.
The code in LHV performs an experiment (called "Experiment 17" in serial output) on CPU 0 to test the behavior of NMI blocking. The experiment steps are:
- 1. Prepare state such that the CPU is currently in L1 (LHV), and NMI is blocked
- 2. An NMI interrupt arrives at the CPU. However, since NMI is blocked, the NMI interrupt handler of L1 is not invoked
- 3. Modify VMCS to make sure that L2 has virtual NMIs enabled (NMI exiting = 1, Virtual NMIs = 1), and L2 blocks NMI (Blocking by NMI = 1)
- 4. Modify VMCS to inject a normal interrupt (vector 0x21) to L2 at VM entry
- 5. VM entry to L2
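As a rough illustration of the two VMCS modification steps above (enabling virtual NMIs with NMI blocking, then queuing vector 0x21), here is a minimal sketch of the field values involved. The bit positions are the ones documented in the Intel SDM; the struct and function names are hypothetical, not taken from LHV, and treating the "normal interrupt" as interruption type 0 (external interrupt) is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions as documented in the Intel SDM. */
#define PIN_NMI_EXITING        (1u << 3)  /* pin-based controls: NMI exiting */
#define PIN_VIRTUAL_NMIS       (1u << 5)  /* pin-based controls: virtual NMIs */
#define INTR_STATE_BLOCK_NMI   (1u << 3)  /* interruptibility state: blocking by NMI */
#define ENTRY_INTR_TYPE_EXTINT (0u << 8)  /* interruption type 0: external interrupt (assumed) */
#define ENTRY_INTR_VALID       (1u << 31) /* VM-entry interruption information: valid */

/* Hypothetical container for the three VMCS fields touched by the experiment. */
struct exp17_vmcs {
    uint32_t pin_based_controls;
    uint32_t guest_interruptibility;
    uint32_t entry_intr_info;
};

/* Enable NMI exiting and virtual NMIs, and mark NMIs as blocked in L2. */
static void exp17_setup_nmi(struct exp17_vmcs *v)
{
    assert(v);
    v->pin_based_controls |= PIN_NMI_EXITING | PIN_VIRTUAL_NMIS;
    v->guest_interruptibility |= INTR_STATE_BLOCK_NMI;
}

/* Request injection of an interrupt with the given vector at the next VM entry. */
static void exp17_inject_interrupt(struct exp17_vmcs *v, uint8_t vector)
{
    assert(v);
    v->entry_intr_info = ENTRY_INTR_VALID | ENTRY_INTR_TYPE_EXTINT | vector;
}
```

Starting from a zeroed struct, calling `exp17_setup_nmi()` and then `exp17_inject_interrupt(&v, 0x21)` leaves 0x28 in the pin-based controls, 0x8 in the interruptibility state, and 0x80000021 in the VM-entry interruption-information field. In a real hypervisor these values would be written with VMWRITE and merged with the bits the capability MSRs require.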
The expected behavior is:
- 6. Immediately after VM entry, L2's interrupt 0x21 handler is invoked
- 7. A VM exit happens immediately because of the NMI that arrived in step 2
However, on VMware, the behavior appears to be:
- 6. Immediately after VM entry, L2's interrupt 0x21 handler is invoked
- 7. After executing some instructions, L2 executes the CPUID instruction, which causes a VM exit to L1
- 8. Immediately after the VM exit, L1's NMI interrupt handler is executed
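The exit reasons printed in the serial log are consistent with this description: 0x00000012 is basic exit reason 18 (VMCALL) and 0x0000000a is basic exit reason 10 (CPUID). A small decoder along these lines can help when reading the log. It is only a sketch covering a handful of basic exit reasons from the Intel SDM, and the function name is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Map the basic exit reason (bits 15:0 of the VM-exit reason field)
 * to a short name. Only a few reasons are listed in this sketch.
 */
static const char *vmx_basic_exit_reason_name(uint32_t exit_reason)
{
    /* This sketch does not handle VM-entry failures (bit 31 set). */
    assert((exit_reason & 0x80000000u) == 0);

    switch (exit_reason & 0xffffu) {
    case 0:  return "exception or NMI";
    case 1:  return "external interrupt";
    case 7:  return "interrupt window";
    case 8:  return "NMI window";
    case 10: return "CPUID";
    case 12: return "HLT";
    case 18: return "VMCALL";
    default: return "not listed in this sketch";
    }
}
```

For example, `vmx_basic_exit_reason_name(0x0000000a)` returns "CPUID", which is the exit seen on VMware in place of the expected NMI-induced exit.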
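It appears that VMware's implementation is incorrect: it keeps the NMI blocked in L2, whereas Intel's SDM says that when the "virtual NMIs" control is 1, NMIs are not blocked in VMX non-root operation (that is, in L2), apart from ordinary blocking for other reasons. Quote from Intel SDM: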
The following items describe the use of bit 3 (blocking by NMI) in the interruptibility-state field if the “virtual NMIs” VM-execution control is 1:
- The bit’s value does not affect the blocking of NMIs after VM entry. NMIs are not blocked in VMX non-root operation (except for ordinary blocking for other reasons, such as by the MOV SS instruction, the wait-for-SIPI state, etc.)
- ...
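To make the quoted rule concrete, here is a simplified model of whether a pending physical NMI should cause a VM exit right after VM entry. It is a sketch of my reading of the SDM, not VMware's or LHV's code: the function names are hypothetical, and "ordinary" blocking (MOV SS, wait-for-SIPI, etc.) is deliberately ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define INTR_STATE_BLOCK_NMI (1u << 3) /* interruptibility state, bit 3 */

/* Simplified model: are physical NMIs blocked right after VM entry? */
static bool nmi_blocked_after_entry(bool virtual_nmis, uint32_t interruptibility)
{
    if (virtual_nmis) {
        /* With virtual NMIs = 1, bit 3 does not affect blocking of NMIs. */
        return false;
    }
    /* With virtual NMIs = 0, bit 3 reflects blocking of physical NMIs. */
    return (interruptibility & INTR_STATE_BLOCK_NMI) != 0;
}

/* With NMI exiting = 1, an unblocked pending NMI causes a VM exit. */
static bool pending_nmi_causes_exit(bool nmi_exiting, bool virtual_nmis,
                                    uint32_t interruptibility, bool nmi_pending)
{
    /* The SDM requires NMI exiting = 1 whenever virtual NMIs = 1. */
    assert(!virtual_nmis || nmi_exiting);
    return nmi_pending && nmi_exiting &&
           !nmi_blocked_after_entry(virtual_nmis, interruptibility);
}
```

With the configuration from the experiment (NMI exiting = 1, Virtual NMIs = 1, Blocking by NMI = 1, one NMI pending), `pending_nmi_causes_exit()` returns true, which matches the behavior observed on real hardware.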
Could you please fix this implementation problem in VMware? Thank you.
It would be best to also specify your host CPU model.
Virtual-interrupt delivery and process posted interrupts are features that are available on Xeon processors but not on Intel desktop or mobile CPUs, so they might affect what you see in your experiments.
You can check vmware.log for the "Process posted interrupts" (Ivy Bridge and newer) and "Virtual-interrupt delivery" (Haswell and newer) entries. A value of { 0 } means the feature is not available on the CPU; {0,1} means it is.
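If you want to check many of these entries at once, the brace notation is easy to test mechanically. The helper below is a hypothetical sketch (not a VMware tool): it takes one line of the capability listing and reports whether the set in braces contains 1.

```c
#include <assert.h>
#include <string.h>

/*
 * Returns 1 if the set in braces contains 1 (the control may be enabled),
 * 0 if it does not, and -1 if the line has no "{...}" part.
 */
static int capability_available(const char *line)
{
    assert(line != NULL);

    const char *open = strchr(line, '{');
    if (open == NULL)
        return -1;
    const char *close = strchr(open, '}');
    if (close == NULL)
        return -1;

    /* Look for a '1' between the braces only, not in the control's name. */
    for (const char *p = open + 1; p < close; p++) {
        if (*p == '1')
            return 1;
    }
    return 0;
}
```

For instance, a line reading `Process posted interrupts { 0 }` yields 0, and `Virtual NMIs {0,1}` yields 1.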
Thanks for the reply. I have edited the bug report to include my host CPU model: Intel(R) Core(TM) i5-7600 CPU @ 3.50GHz
Since this bug concerns non-maskable interrupts (NMIs), I think virtual-interrupt delivery and process posted interrupts are likely unrelated. According to the SDM, these features only apply to maskable interrupts.
Just FYI, here are the guest VMX capabilities printed in vmware.log:
Guest VT-x Capabilities:
Basic VMX Information (0x00d8100000000001)
VMCS revision ID 1
VMCS region length 4096
VMX physical-address width natural
SMM dual-monitor mode no
VMCS memory type WB
Advanced INS/OUTS info yes
True VMX MSRs yes
Exception Injection ignores error code no
True Pin-Based VM-Execution Controls (0x0000003f00000016)
External-interrupt exiting {0,1}
NMI exiting {0,1}
Virtual NMIs {0,1}
Activate VMX-preemption timer { 0 }
Process posted interrupts { 0 }
True Primary Processor-Based VM-Execution Controls (0xfff9fffe04006172)
Interrupt-window exiting {0,1}
Use TSC offsetting {0,1}
HLT exiting {0,1}
INVLPG exiting {0,1}
MWAIT exiting {0,1}
RDPMC exiting {0,1}
RDTSC exiting {0,1}
CR3-load exiting {0,1}
CR3-store exiting {0,1}
Activate tertiary controls { 0 }
CR8-load exiting {0,1}
CR8-store exiting {0,1}
Use TPR shadow {0,1}
NMI-window exiting {0,1}
MOV-DR exiting {0,1}
Unconditional I/O exiting {0,1}
Use I/O bitmaps {0,1}
Monitor trap flag {0,1}
Use MSR bitmaps {0,1}
MONITOR exiting {0,1}
PAUSE exiting {0,1}
Activate secondary controls {0,1}
Secondary Processor-Based VM-Execution Controls (0x00553cfe00000000)
Virtualize APIC accesses { 0 }
Enable EPT {0,1}
Descriptor-table exiting {0,1}
Enable RDTSCP {0,1}
Virtualize x2APIC mode {0,1}
Enable VPID {0,1}
WBINVD exiting {0,1}
Unrestricted guest {0,1}
APIC-register virtualization { 0 }
Virtual-interrupt delivery { 0 }
PAUSE-loop exiting {0,1}
RDRAND exiting {0,1}
Enable INVPCID {0,1}
Enable VM Functions {0,1}
Use VMCS shadowing { 0 }
ENCLS exiting { 0 }
RDSEED exiting {0,1}
Enable PML { 0 }
EPT-violation #VE {0,1}
Conceal VMX from PT { 0 }
Enable XSAVES/XRSTORS {0,1}
PASID translation { 0 }
Mode-based execute control for EPT {0,1}
Sub-page write permissions for EPT { 0 }
PT uses guest physical addresses { 0 }
Use TSC scaling { 0 }
Enable UMWAIT and TPAUSE { 0 }
Enable ENCLV in VMX non-root mode { 0 }
Enable EPC Virtualization Extensions { 0 }
Bus lock exiting { 0 }
Notification VM exits { 0 }
Tertiary Processor-Based VM-Execution Controls (0x0000000000000000)
LOADIWKEY exiting no
Enable HLAT no
Enable Paging-Write no
Enable Guest Paging Verification no
Enable IPI Virtualization no
True VM-Exit Controls (0x003fefff00036dfb)
Save debug controls {0,1}
Host address-space size {0,1}
Load IA32_PERF_GLOBAL_CTRL { 0 }
Acknowledge interrupt on exit {0,1}
Save IA32_PAT {0,1}
Load IA32_PAT {0,1}
Save IA32_EFER {0,1}
Load IA32_EFER {0,1}
Save VMX-preemption timer { 0 }
Clear IA32_BNDCFGS { 0 }
Conceal VMX from processor trace { 0 }
Clear IA32_RTIT MSR { 0 }
Clear IA32_LBR_CTL MSR { 0 }
Clear user-interrupt notification vector { 0 }
Load CET state { 0 }
Load PKRS { 0 }
True VM-Entry Controls (0x0000d3ff000011fb)
Load debug controls {0,1}
IA-32e mode guest {0,1}
Entry to SMM { 0 }
Deactivate dual-monitor mode { 0 }
Load IA32_PERF_GLOBAL_CTRL { 0 }
Load IA32_PAT {0,1}
Load IA32_EFER {0,1}
Load IA32_BNDCFGS { 0 }
Conceal VMX from processor trace { 0 }
Load IA32_RTIT MSR { 0 }
Load user-interrupt notification vector { 0 }
Load CET state { 0 }
Load IA32_LBR_CTL MSR { 0 }
Load PKRS { 0 }
VPID and EPT Capabilities (0x00000f0106714141)
R=0/W=0/X=1 yes
Page-walk length 3 yes
EPT memory type WB yes
2MB super-page yes
1GB super-page no
INVEPT support yes
Access & Dirty Bits yes
Advanced VM exit information for EPT violations yes
Supervisor shadow-stack control no
Type 1 INVEPT yes
Type 2 INVEPT yes
INVVPID support yes
Type 0 INVVPID yes
Type 1 INVVPID yes
Type 2 INVVPID yes
Type 3 INVVPID yes
Miscellaneous VMX Data (0x00000000400401e0)
TSC to preemption timer ratio 0
VM-Exit saves EFER.LMA yes
Activity State HLT yes
Activity State shutdown yes
Activity State wait-for-SIPI yes
Processor trace in VMX no
RDMSR SMBASE MSR in SMM no
CR3 targets supported 4
Maximum MSR list size 512
VMXOFF holdoff of SMIs no
Allow all VMWRITEs no
Allow zero instruction length yes
MSEG revision ID 0
VMX-Fixed Bits in CR0 (0x0000000080000021/0x00000000ffffffff)
Fixed to 0 0xffffffff00000000
Fixed to 1 0x0000000080000021
Variable 0x000000007fffffde
VMX-Fixed Bits in CR4 (0x0000000000002000/0x00000000003727ff)
Fixed to 0 0xffffffffffc8d800
Fixed to 1 0x0000000000002000
Variable 0x00000000003707ff
VMCS Enumeration (0x000000000000005a)
Highest index 0x2d
VM Functions (0x0000000000000001)
Function 0 (EPTP-switching) supported.
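The { 0 } and {0,1} sets in this listing can be cross-checked against the raw MSR values printed in parentheses. Per the SDM, in the VMX control capability MSRs bits 31:0 report the allowed 0-settings (a set bit means the control must be 1) and bits 63:32 report the allowed 1-settings (a set bit means the control may be 1). The following is a sketch with hypothetical names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Which settings a VMX capability MSR allows for one control bit. */
struct vmx_ctl_allowed {
    bool zero_ok; /* the control may be 0 */
    bool one_ok;  /* the control may be 1 */
};

static struct vmx_ctl_allowed vmx_ctl_allowed_settings(uint64_t msr, unsigned bit)
{
    assert(bit < 32);

    uint32_t allowed0 = (uint32_t)msr;         /* bit set: control must be 1 */
    uint32_t allowed1 = (uint32_t)(msr >> 32); /* bit set: control may be 1 */

    struct vmx_ctl_allowed a;
    a.zero_ok = ((allowed0 >> bit) & 1u) == 0;
    a.one_ok  = ((allowed1 >> bit) & 1u) != 0;
    return a;
}
```

For the True Pin-Based value 0x0000003f00000016 above, bit 3 (NMI exiting) and bit 5 (Virtual NMIs) allow both settings, while bit 7 (Process posted interrupts) allows only 0. For the Secondary value 0x00553cfe00000000, bit 9 (Virtual-interrupt delivery) also allows only 0. Both results agree with the listing.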
I would suggest that you open a service request with VMware if you want to make sure VMware takes some action on this. This forum is not connected to VMware technical support or development engineers. Posting a bug report here does not guarantee that VMware will see it or act on it. VMware employees do not regularly monitor this forum, and if they do, it's on their own time.
Editor of the Unofficial Fusion Companion Guides
Thanks for letting me know. Interestingly, I posted this bug here because I saw https://www.vmware.com/support/policies/defect.html . Do I need something special (e.g. an active support agreement) to open a service request? When I try to open a service request, I do not see anything under "Supported Products".
If you don't have a service contract, you can purchase per-incident support from the VMware Online Store. That's essentially a "credit" or "voucher" for you to open a service request. Opening a service request online should then ask you for the "serial number" or something of that ilk that represents the per-incident support that you purchased in the store.
If you are within 30 days of purchasing a new or upgrade license, complimentary support is included, which lets you open a support request. The downside is that most folks don't find things that need attention until after those 30 days are up.
Editor of the Unofficial Fusion Companion Guides