VMware Communities
Strohmi
Contributor

Workstation 17.5.1 doesn't start, only a black screen

Hello, I hope you can help me.

Here are the main points. Host:

- W11 Enterprise

- 132GB RAM

- 1 socket, 20 cores, 28 logical processors (i7-13850HX)

Guest:

- W11 Pro

- 16GB of RAM

- 1 socket, tried with different numbers of logical processors

I only get a short loading screen and then a black screen, and my host also gets slow. After a few attempts the guest starts, but it only responds to input from time to time.

 

11 Replies
bluefirestorm
Champion

The VM is using the Intel UHD Graphics for rendering while there is a more powerful Nvidia Ada RTX 2000 GPU present.
2024-03-28T09:11:00.652Z Wa(03) mks mksSandboxLog: DX12Module: Selected adapter:
2024-03-28T09:11:00.652Z Wa(03) mks mksSandboxLog: DX12Module: Intel(R) UHD Graphics

I think there are problems with VMware Workstation 17.x using newer generations of Intel GPUs with DX12 rendering on Windows hosts, and this is likely the reason for the black screen.

2024-03-28T09:10:59.636Z In(05) vmx IOPL_Init: Hyper-V detected by CPUID
2024-03-28T09:10:59.837Z In(05) vmx hostCPUID name: 13th Gen Intel(R) Core(TM) i7-13850HX
2024-03-28T09:10:59.929Z In(05) vmx DICT numvcpus = "24"
2024-03-28T09:10:59.929Z In(05) vmx DICT cpuid.coresPerSocket = "24"
2024-03-28T09:10:59.931Z In(05) vmx Monitor Mode: ULM

24 vCPUs are assigned while the host CPU has 8 P-cores (16 threads) and 12 E-cores. There is a longstanding problem where only the E-cores get assigned, which results in very slow VMs. Since there are only 12 E-cores, the extra 12 vCPUs are useless. Even so, it is a bad idea to have a mix of P-cores and E-cores inside a VM, as very likely the P-cores and E-cores don't have instruction set/feature parity and the guest OS cannot choose whether to use a P-core or an E-core anyway.

Hyper-V is enabled, or a component that turns on Hyper-V is (such as WSL2, VBS, Defender AppGuard, Kernel DMA Protection, etc.; you can check using msinfo32). As a result the slower "ULM" hypervisor is used instead of the ring 0 kernel driver "CPL0" hypervisor.
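To pull the few values discussed above (Hyper-V detection, vCPU count, monitor mode, selected GPU) out of a vmware.log in one pass, a minimal Python sketch could look like the following. The function name summarize_log and the dictionary keys are made up for illustration. The matched strings are copied from the log excerpts in this post, so this assumes other logs word these lines the same way.

```python
import re


def summarize_log(lines):
    """Collect a few diagnostic values from vmware.log lines (illustrative helper)."""
    summary = {"hyperv": False, "numvcpus": None, "monitor_mode": None, "adapter": None}
    expect_adapter = False  # True right after a "Selected adapter:" line
    for raw in lines:
        line = raw.rstrip()
        if expect_adapter:
            expect_adapter = False
            m = re.search(r"DX12Module: (.+)$", line)
            if m:
                summary["adapter"] = m.group(1)
                continue
        if "Hyper-V detected by CPUID" in line:
            summary["hyperv"] = True
        elif (m := re.search(r'DICT numvcpus = "(\d+)"', line)):
            summary["numvcpus"] = int(m.group(1))
        elif (m := re.search(r"Monitor Mode: (\w+)", line)):
            summary["monitor_mode"] = m.group(1)
        elif line.endswith("DX12Module: Selected adapter:"):
            expect_adapter = True
    return summary


# Sample input: the log lines quoted in this post.
sample = [
    "2024-03-28T09:10:59.636Z In(05) vmx IOPL_Init: Hyper-V detected by CPUID",
    '2024-03-28T09:10:59.929Z In(05) vmx DICT numvcpus = "24"',
    "2024-03-28T09:10:59.931Z In(05) vmx Monitor Mode: ULM",
    "2024-03-28T09:11:00.652Z Wa(03) mks mksSandboxLog: DX12Module: Selected adapter:",
    "2024-03-28T09:11:00.652Z Wa(03) mks mksSandboxLog: DX12Module: Intel(R) UHD Graphics",
]
print(summarize_log(sample))
```

To run it against a real log, pass an open file instead of the sample list, for example summarize_log(open("vmware.log", encoding="utf-8", errors="replace")).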

Shut down the VM and edit the vmx configuration file of the VM.

mks.forceDiscreteGPU = "TRUE"
Processor16.use = "FALSE"
Processor17.use = "FALSE"
Processor18.use = "FALSE"
Processor19.use = "FALSE"
Processor20.use = "FALSE"
Processor21.use = "FALSE"
Processor22.use = "FALSE"
Processor23.use = "FALSE"
Processor24.use = "FALSE"
Processor25.use = "FALSE"
Processor26.use = "FALSE"
Processor27.use = "FALSE"
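Because the processor index is zero-based, it is easy to be off by one when typing these lines by hand. A small Python sketch that generates them is shown below. The function name and its parameters are hypothetical, and it assumes the P-core threads are numbered first and the E-cores last, which is how this host (8 P-cores with 2 threads each, 12 E-cores) is laid out.

```python
def ecore_exclusion_lines(p_cores=8, threads_per_p_core=2, e_cores=12):
    """Return the vmx/config.ini lines that exclude the E-cores.

    Assumes logical processors are numbered from 0, with all P-core
    threads first and the E-cores after them.
    """
    first_e_core = p_cores * threads_per_p_core  # 8 * 2 = 16 on the i7-13850HX
    return [
        f'Processor{n}.use = "FALSE"'
        for n in range(first_e_core, first_e_core + e_cores)
    ]


# Print the lines, ready to paste into the .vmx or config.ini.
print("\n".join(ecore_exclusion_lines()))
```

With the defaults this prints the twelve lines above, from Processor16 to Processor27.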

You can confirm both visually:
- For the GPU, use the nvidia-smi command line to check that the mksSandbox.exe (or mksSandbox-debug.exe) process is running on the RTX 2000 (or use the Nvidia GPU Activity applet).
- For the E-cores, check in Task Manager that processors 16-27 are unchecked when Set Affinity is chosen for vmware-vmx.exe (vmware-vmx-debug.exe).

In lieu of editing every VM's vmx, the settings can be added to %PROGRAMDATA%\VMware\VMware Workstation\config.ini instead; use an elevated (Admin) Command Prompt to launch Notepad to edit config.ini.

There are alternative ways to achieve all of these edits; let me know if you want to know them.

With the E-cores excluded from use by VMs, I suggest reducing every VM to 16 vCPUs or fewer.

For Hyper-V, if you want to remove it, refer to this post.
https://communities.vmware.com/t5/VMware-Workstation-Pro/Disabling-Hyper-V-hypervisor-on-Windows-11-...
For Kernel DMA Protection, it has to be disabled from the host UEFI.
If the machine is a member of an AD Domain, some items such as VBS can be enforced through Domain Policy, and if that is the case the Domain Admins will have to be involved.

Strohmi
Contributor

Hello bluefirestorm,

Thanks for your answer, but after I insert your code in the .vmx or the config.ini to force the GPU and also disable processors 17-27, I get the fault "Transport (VMDB) error -14: Pipe connection has been broken." as soon as I start the VM, in both cases.

I use a company laptop and can't deactivate Hyper-V. To get the VM running, I must set the following in the CPU settings:

- disable - Virtualize Intel VT-x/EPT or AMD-V/RVI

- disable - Virtualize CPU performance counters

- I only can activate - Virtualize IOMMU

- I set Number of processors 1 and now number of cores to 16

DCasota
Expert

Strange, a few weeks ago I had a few black screen situations, too, but those are gone. I can't say that there was a specific root cause.

Here is the setup build information for comparison purposes.

Microsoft Windows 11 Pro, 23H2, build 22631.3374

WSL-Version: 2.1.5.0
Kernelversion: 5.15.146.1-2
WSLg-Version: 1.0.60
MSRDC-Version: 1.2.5105
Direct3D-Version: 1.611.1-81528511
DXCore-Version: 10.0.25131.1002-220531-1700.rs-onecore-base2-hyp
Windows-Version: 10.0.22631.3374

usbipd version: 4.1.0+52.Branch.master.Sha.b0b7589d2dc4481b1af481787d6d773f46d0758a

VMware Workstation 17 Pro, 17.5.1 build-23298084

Virtual Machines with virtual hardware v21

  • Windows Server 205, 8GB, 1 vCPU with Virtualize IOMMU checked, accelerate 3D graphics with 8GB graphics memory, USB controller 3.1, bridged network adapter, sound card, 60GB NVMe hard disk
  • Photon OS 3.0 (upgraded to 5.0), 32GB RAM, 16 vCPU (16/1) with no virtualization engine settings, no 3D graphics, bridged network adapter, 2x SCSI hard disks

 

The situation started after upgrading VMware Workstation to 17.5.1 and ended after updating WSL2/usbipd and applying Microsoft Updates. The latest Microsoft updates were KB5034848 and KB5035853. So far, the setup build listed above has been pretty robust for three weeks. On March 28th the latest Microsoft patches KB5036035 and KB5035942 were applied, too.

 

 

Strohmi
Contributor

Hello DCasota,

I checked the update history: KB5035853 is installed, but I can't find the others there. In my case the Windows updates are managed by my company.

In the log, the system is constantly running VMSAMPLE tests for all configured CPU cores.

vmx (VCPU 0) VMSAMPLE: cs=0x10, rip=0xfffff80139413ccf halted progress=362257753
+ vmx (VCPU 1) VMSAMPLE: cs=0x33, rip=0x7ff788e9515d progress=83148906

DCasota
Expert

Hi @Strohmi ,

Working by process of elimination, after having analyzed CPU and (USB) storage, and given that you have a system with two graphics adapters, I would start by analyzing the graphics adapter issue.

In Windows Settings > System > Display > Graphics I've configured "high performance" (i.e. the Nvidia GPU) for all VMware components. Can you double-check the settings of the five components on your system?


When parsing the log file with type .\vmware-Logdata.log | select-string -pattern "fail", there are several matches, e.g.:

  • mks SWBWindow: Window #0 validation failed: no valid host window or host surface.
  • vcpu-5 VMCI QueuePair: QueuePairPageStoreInit failed (res=-10).

Also, the same parsing method with the pattern "error" shows an SVGA issue: SVGA Driver Error.

 

bluefirestorm
Champion


after I insert your code in the .vmx or the config.ini to force the GPU and also disable processors 17-27, I get the fault "Transport (VMDB) error -14: Pipe connection has been broken." as soon as I start the VM, in both cases.

Try the mks.forceDiscreteGPU and Processor[n].use = "FALSE" edits separately at first, since they are supposed to address separate issues. mks.forceDiscreteGPU is to address the black screen and Processor[n].use is to address the slowness of the E-cores.

I cannot tell why it would cause a fault unless you attach a vmware.log with the edits in effect. By the way, why is the logging set to "Full" instead of "Default"?


- disable - Virtualize Intel VT-x/EPT or AMD-V/RVI

This cannot be enabled because of Hyper-V running on the host. This is for running a VM inside a VM or WSL2 or Docker inside a Windows 10/11 VM.


- disable - Virtualize CPU performance counters

This also cannot be enabled because of Hyper-V running on the host. This is only needed if you have specific software to measure the performance of software running inside a VM.


- I only can activate - Virtualize IOMMU


This is only needed if you want VBS inside a Windows 10/11 VM.


- I set Number of processors 1 and now number of cores to 16


Does the VM really need that many vCPUs? I suggest you start with 2 vCPUs and adjust upwards as needed. There is no difference in VM performance whether "Number of processors" (virtual sockets) is 1 or 2. The difference is in whether the guest OS can recognise it or not. In the case of Windows 10/11 Professional/Enterprise the maximum number of sockets is 2.

 

bluefirestorm
Champion

Alternatives to mks.forceDiscreteGPU = "TRUE"
(1) set in Nvidia Control Panel Global Settings "Preferred graphics processor" to use "High performance NVIDIA processor" instead of "Automatic" for Performance - could be a drain on battery as all applications will use Nvidia GPU
or
(2) add in Nvidia Control Panel Program Settings mksSandbox.exe (mksSandbox-debug.exe for Full debug setting) to use "High Performance NVIDIA processor". Need to add vmware-vmx.exe (vmware-vmx-debug.exe) if ISBRenderer is disabled.

The visual confirmation is the same. Use either nvidia-smi command-line or the Nvidia GPU activity applet.

Alternatives to processor[n].use = "FALSE" in vmx or config.ini
(a) set Power Plan to "High Performance" (this could also drain battery more quickly)
or
(b) disable power throttling on vmware-vmx.exe (and vmware-vmx-debug.exe for Full debug setting)
powercfg /powerthrottling disable /path "C:\Program Files (x86)\VMware\VMware Workstation\x64\vmware-vmx.exe"
powercfg /powerthrottling disable /path "C:\Program Files (x86)\VMware\VMware Workstation\x64\vmware-vmx-debug.exe"
or
(c) use Task Manager and uncheck the E-cores from vmware-vmx.exe (or vmware-vmx-debug.exe) after choosing "Set Affinity". The inconvenience here is obvious and thus the Processor[n].use = "FALSE" edit is the more convenient equivalent

For (a) or (b), visual confirmation would be seeing in Task Manager that vmware-vmx.exe (or vmware-vmx-debug.exe) is not using the E-cores, but you would have to keep monitoring it. There is no guarantee that (a) or (b) will keep working the same way if the Windows 11/Intel Thread Director behaviour changes in the future. VMware could also fix this E-core problem and make (a) or (b) unnecessary, although don't hold your breath for a VMware fix.

Strohmi
Contributor

Hello @bluefirestorm 

I set the logging to Full in the hope that it generates more details that help you.

When I enter only the mks.forceDiscreteGPU line, the VM starts up but only shows the dark screen; the log is attached.

When I enter Processor16.use = "FALSE", the system gives the fault "Transport (VMDB) error -14: Pipe connection has been broken."; that log file is also attached.

I also tested with 1 socket and 4 CPUs; the results are in the attached log, but there is no change: only a black screen and reduced performance on the host.

With 2 CPUs the guest starts up, but after login there is no reaction when I want to open a program; that log is also attached.

I also tried the steps in your second post, but the CPU load on the host is never over 20% while the VM is running with the black screen.

One thing I see now in Task Manager on the host: when I start the VM, the load increases on the last CPUs, all higher than 15, but it shows them as held, in a lighter blue color.

 

bluefirestorm
Champion

The crash is likely because Processor16 was not excluded. The index starts from 0 (zero) so the 12 E-cores are processors 16-27 (not 17-27); the Set Affinity in Task Manager also starts with CPU0 so the visual confirmation is also 16 - 27 are unchecked.

2024-03-30T16:57:27.792Z In(05) vmx VMX_PowerOn: VMX build 23298084, UI build 23298084
2024-03-30T16:57:27.792Z In(05) vmx Not using processor 17
2024-03-30T16:57:27.792Z In(05) vmx Not using processor 27
2024-03-30T16:57:28.886Z In(05) vmx FeatureCompat: Capabilities from GetHostCaps:
2024-03-30T16:57:28.886Z Cr(01) vmx PANIC: ASSERT bora\vmcore\vmx\main\featureCompat.c:724
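A quick way to catch this kind of off-by-one is to compare the "Not using processor N" lines in vmware.log against the expected E-core range. Below is a minimal Python sketch; the function names are made up for illustration, and it assumes that every excluded processor produces one such line in the log, as the excerpt above suggests.

```python
import re


def excluded_processors(lines):
    """Return the sorted processor indices reported as 'Not using processor N'."""
    found = set()
    for line in lines:
        m = re.search(r"Not using processor (\d+)", line)
        if m:
            found.add(int(m.group(1)))
    return sorted(found)


def missing_exclusions(lines, first=16, last=27):
    """Return E-core indices in first..last (inclusive) that were not excluded."""
    wanted = set(range(first, last + 1))
    return sorted(wanted - set(excluded_processors(lines)))


# Example: a log where processors 17-27 were excluded but 16 was forgotten.
example = [
    f"2024-03-30T16:57:27.792Z In(05) vmx Not using processor {n}"
    for n in range(17, 28)
]
print(missing_exclusions(example))
```

An empty list means all twelve E-cores (16-27, counting from 0) were excluded.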

I just noticed that the guest OS is set as "Windows 11" while the guest Detailed Data is set for Windows 10. Was this VM upgraded from Windows 10 to Windows 11? Yet there is no virtual TPM or any partial encryption. Anyway, it is better to set the right guest OS type. If you want to upgrade the VM to Windows 11, upgrade first and then set the guest OS type.

2024-03-30T17:23:37.488Z In(05) vmx DICT guestInfo.detailed.data = <not printed>
2024-03-30T17:23:37.488Z In(05) vmx DICT guestOS.detailed.data = "architecture='X86' bitness='64' buildNumber='19044' distroName='Windows' distroVersion='10.0' familyName='Windows' kernelVersion='19044.1706' prettyName='Windows 10 Pro, 64-bit (Build 19044.1706)'"
2024-03-30T17:23:37.488Z In(05) vmx DICT guestOS = "windows11-64"

The mks.forceDiscreteGPU worked as it selected RTX 2000.
2024-03-30T17:23:37.490Z In(05) vmx DICT mks.forceDiscreteGPU = "TRUE"
2024-03-30T17:23:38.286Z Wa(03) mks mksSandboxLog: DX12Module: Selected adapter:
2024-03-30T17:23:38.286Z Wa(03) mks mksSandboxLog: DX12Module: NVIDIA RTX 2000 Ada Generation Laptop GPU

As for the black screen; is the laptop using scaled display (e.g. 125% or 150%) or is the VM configured for scaled display? Is the display of the host machine higher than 60Hz?
You could try removing the scaling at the host and/or setting display refresh rate to 60Hz. I haven't had problems (yet) with Fusion 12.x (with scaling) on macOS and Workstation 16.x on Linux host with higher display refresh rates on the host.

But if the scaling is needed because everything becomes too tiny on a 15-16" display with UHD or higher resolution, you could try:

(1) try DX11Renderer instead of the DX12Renderer
mks.enableDX12Renderer = "FALSE"
mks.enableDX11Renderer = "TRUE"

or
(2) try GLRenderer instead of the DX11/DX12 rendering
mks.enableDX12Renderer = "FALSE"
mks.enableDX11Renderer = "FALSE"
mks.enableGLRenderer = "TRUE"

Note that for the GLRenderer, only the Nvidia GPU should be used. There are longstanding issues in VMware Workstation with Intel GPUs when using OpenGL rendering on Windows hosts.

Assuming the black screen is not resolved, you could try to use RDP (assuming it is enabled on the guest OS). Was this VM working before in the same laptop or different host machine?

As for the vmware.log level, I think it is pointless to enable "Full" debugging as neither you nor I have access to the source code. Even with the default logging level, one can only make a very high-level guess at what the flow/implementation is like, where a bug might be indicated in the log, and what its cause could be.

DCasota
Expert

@Strohmi @bluefirestorm just an idea. Is there a company Data Loss Prevention policy in place? Are there other indicators besides the vmware.log file, e.g. any Windows Event Viewer entries, user information messages, etc.? Is the user using a domain account?

For instance, 'Device control in Microsoft Defender for Endpoint' with an activated policy for W11 devices might detect those D:\ and I:\ drives. Can you "write" a new VM on those drives? As said, just an idea.

Strohmi
Contributor

Thanks for your help @bluefirestorm 

 

It was a W10 VM and I upgraded it to W11 around half a year ago. After the upgrade I tried to remove the TPM and also disable the encryption, and it worked before I upgraded the VM and the Tools.

I also changed the setting in Options -> General to Windows 11 x64. I don't know why it says Windows 10 there; the guest is W11 and I set it to W11 in the parameter.

In Task Manager all CPUs are selected.

Now it works with a maximum of 12 CPUs; if I try 16, it doesn't start up. I set the priority of VMware Workstation from normal to high, and disabled some CPUs and activated them again.

Now I will test this with my other VMs.

 

 
