Enthusiast

UEFI console crashes VMware Workstation 16 and Fusion 12

Created new VM with Workstation 16 using https://download.freebsd.org/ftp/releases/ISO-IMAGES/12.2/FreeBSD-12.2-BETA3-amd64-disc1.iso.xz

The VM used UEFI firmware, with 2 GB RAM, 2 processor cores and a 20 GB HDD.  Stock install with ZFS.

The installer booted just fine with UEFI and ran to completion.  After install, I get the following error at boot.  If I dismiss the error dialogue box too quickly, VMware Workstation freezes and has to be forcibly closed.  It looks like it triggers some UEFI console-related bug in VMware Workstation itself.  I have attached the full logs for this run plus the VM configuration.

Are any workarounds known for this?  I specifically wanted to use the UEFI console support that FreeBSD 12 provides.

Thanks,

Roger

[Screenshot attached: pastedImage_0.png]

20 Replies
Enthusiast

This also affects VMware Fusion Pro 12.0.0.  In this case, it won't even boot the installer image from the CD.  The failure is at exactly the same point in the boot sequence.

[Screenshot attached: pastedImage_0.png]

Leadership

Hi,

An interesting excerpt from your log is (I think):

2020-09-30T16:34:58.984+01:00| vcpu-0| I005: SCSI0: RESET BUS

2020-09-30T16:34:58.984+01:00| vcpu-0| I005: SCSI0: RESET BUS

2020-09-30T16:34:58.984+01:00| vcpu-0| I005: DISKUTIL: scsi0:0 : capacity=41943040 logical sector size=512

2020-09-30T16:34:59.047+01:00| vcpu-0| I005: Guest: About to do EFI boot: EFI VMware Virtual SCSI Hard Drive (0.0)

2020-09-30T16:35:09.505+01:00| vcpu-0| I005: UHCI: HCReset

2020-09-30T16:35:09.516+01:00| vcpu-1| I005: CPU reset: soft (mode Interp)

2020-09-30T16:35:09.516+01:00| vcpu-0| I005: Guest: Firmware has transitioned to runtime.

2020-09-30T16:35:10.128+01:00| vcpu-0| I005: Msg_Post: Error

2020-09-30T16:35:10.128+01:00| vcpu-0| I005: [msg.efi.exception] The firmware encountered an unexpected exception. The virtual machine cannot boot.

2020-09-30T16:35:10.128+01:00| vcpu-0| I005: ----------------------------------------

2020-09-30T16:35:10.128+01:00| vcpu-0| I006: Vigor_MessageQueue: event msg.efi.exception (seq 10396062) queued

2020-09-30T16:35:10.147+01:00| vcpu-0| I005: MKSGrab: MKS release: start, unlocked, nesting 0

I doubt it will help, but do you get the same error if you try to use a SATA disk instead of a SCSI one?

Let's ping dariusd​; he's the expert on troubleshooting UEFI issues.

--

Wil

| Author of Vimalin. The virtual machine Backup app for VMware Fusion, VMware Workstation and Player |
| More info at vimalin.com | Twitter @wilva
Enthusiast

I created a new VM with SATA rather than SCSI and attached it here.  No difference, except that I dismissed the error dialogue quickly and this locked up VMware Workstation entirely.  The last lines in the log are:

2020-10-01T15:57:45.901+01:00| vcpu-0| I005: Guest: About to do EFI boot: EFI VMware Virtual SATA Hard Drive (0.0)

2020-10-01T15:58:01.359+01:00| vcpu-0| I005: UHCI: HCReset

2020-10-01T15:58:01.378+01:00| vcpu-1| I005: CPU reset: soft (mode Interp)

2020-10-01T15:58:01.378+01:00| vcpu-0| I005: Guest: Firmware has transitioned to runtime.

2020-10-01T15:58:01.893+01:00| vcpu-0| I005: Msg_Post: Error

2020-10-01T15:58:01.893+01:00| vcpu-0| I005: [msg.efi.exception] The firmware encountered an unexpected exception. The virtual machine cannot boot.

2020-10-01T15:58:01.893+01:00| vcpu-0| I005: ----------------------------------------

2020-10-01T15:58:01.896+01:00| vcpu-0| I006: Vigor_MessageQueue: event msg.efi.exception (seq 11237791) queued

2020-10-01T15:58:01.918+01:00| vcpu-0| I005: MKSGrab: MKS release: start, unlocked, nesting 0

2020-10-01T15:58:06.707+01:00| vmx| I005: Stopping VCPU threads...

2020-10-01T15:58:06.718+01:00| svga| I005: SWBScreen: Screen 1 Destroyed: xywh(0, 0, 1024, 768) flags=0x2

2020-10-01T15:58:06.718+01:00| svga| I005: SVGA thread is exiting the main loop

2020-10-01T15:58:06.718+01:00| mks| I005: MKSGrab: MKS release: start, locked, nesting 1

Leadership

OK, that was expected.

Hopefully Darius has some time to look into this.

--
Wil

| Author of Vimalin. The virtual machine Backup app for VMware Fusion, VMware Workstation and Player |
| More info at vimalin.com | Twitter @wilva
Contributor

I have the exact same problem with pfSense 2.5.0, which is based on FreeBSD 12.2.

I'm running it under ESXi 7.0U1.

The error message is the same.

If it's useful, I followed an explanation here:

https://forums.freebsd.org/threads/cant-boot-on-uefi.68141/

To make it work after the installation on my ESXi, I had to boot from the ISO into a shell and run:

mount -t msdosfs /dev/da0p1 /mnt
efibootmgr -c -l /mnt/efi/boot/BOOTx64.EFI -L "pfSense"
efibootmgr -a 0004

reboot

https://redmine.pfsense.org/attachments/3194/Immagine.jpg

Our discussion is here: https://redmine.pfsense.org/issues/10943

I also tried FreeBSD 13, but it didn't even boot from the ISO.

I hope this isn't too much out of context here.

Regards,

Manuel

Leadership

Thanks for the post.  I have reproduced the failure here and I will try to figure it out.

At first glance, the messages in vmware.log suggest that this might be a guest OS defect... The "Firmware has transitioned to runtime" message means the FreeBSD bootloader is running, is happy, and claims it is entirely ready to take control of all of the virtual machine's hardware from the firmware. But then, half a second later, something goes wrong and our firmware's exception handler gets called, long after the guest OS should have taken over all of the exception-handling responsibilities.

There are still some odd ways in which this could be our fault, but I won't know either way until I investigate a bit more.

Thanks,

--

Darius

Contributor

dariusd, I'm the person who originally reported this upstream to FreeBSD. Probably patient zero too, since I found it back in September. I tracked it down to a specific pair of commits which, at least to my eyes, shouldn't have, but certainly could have, and definitely did introduce the problem.

r366422 - updates lua to report the kernel console (the EFI console message) https://svnweb.freebsd.org/base?view=revision&revision=366422

r366588 - video on PCI heuristic change to address a non-firing scenario https://svnweb.freebsd.org/base?view=revision&revision=366588

r366588 is the most likely candidate; however, bisection testing showed the same failure unless both were removed. Looking over https://reviews.freebsd.org/D26572 it looks like previously, PCI video was initialized very differently due to erroneous assumptions. Setting aside the bhyve portion, it seems likely to me that the crash is occurring as a direct result of probing the PCI video device, which did not occur previously. (FreeBSD just made a reasonable assumption as to device endpoints.)

Debugging it is definitely going to require tearing into the actual PCI traffic to see the probe and the response.

Leadership

Thanks,

Also written about in https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=250580

(which was linked from the earlier-mentioned Redmine issue)

--

Wil

| Author of Vimalin. The virtual machine Backup app for VMware Fusion, VMware Workstation and Player |
| More info at vimalin.com | Twitter @wilva
Contributor

Hello,

I'm very new to using FreeBSD on VMware Fusion, but I would like to report the following (I am using VMware Fusion on a macOS Catalina host):

- I have the same issue, i.e. a virtual machine with FreeBSD installed fails to start when started with the "play" button;

     - if a new machine is created, the install CD boots fine and the failure to start happens at the first reboot into the new system;

     - if an old machine (with e.g. FreeBSD 12.1 installed) has 12.2 installed via CD install, the same thing happens at the first reboot into the new system;

     - if an old machine (with e.g. FreeBSD 12.1 installed) is upgraded to 12.2 via freebsd-update, the same thing also happens at the first reboot into the new system;

- Now, whatever the install scenario, the FreeBSD 12.2 virtual machine starts OK if started in the "start on firmware" mode and then, within the EFI firmware, choosing a "normal" boot.
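For what it's worth, VMware also has a documented .vmx option to request the firmware setup screen on the next power-on, which gives the same "start on firmware" behaviour without clicking through the menu (the option name comes from VMware's configuration parameters, not from this thread):

```
bios.forceSetupOnce = "TRUE"
```

The firmware clears this flag after use, so it only applies to the next power-on.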

Analyzing the log files in the two cases (normal start and "via EFI" start) is far beyond my capabilities, but I can provide them.

Hope this can help

Breizh56

Contributor

I'm running FreeBSD 12.1 and FreeNAS 11.3 on my ESXi system.

After upgrading them to FreeBSD 12.2 and TrueNAS 12 both of them are broken and unable to boot.

But!

I found that they can boot properly by entering the UEFI shell of the VMware firmware before booting the system.

Just press the ESC key immediately after powering on these VMs and select "EFI Internal Shell", then press Enter to continue; both of them then boot without a crash.

That's confusing.

Contributor

I found that they can boot properly by entering the UEFI shell of VMware firmware before booting the system.

Just press the ESC button immediately after power on these VMs and select "EFI Internal Shell", then press enter key to continue, and both of them can boot without a crash.

Thanks! That's actually very helpful for hardware debugging.

dariusd​ is probably the one who is going to have to tell us which way is the right way to jump here, or if VMware is going to fix it on their end. But that very, very clearly points to the device probe and initialization as the problem.

Hitting escape to enter the EFI shell causes the EFI stack to initialize the video device and output, which it does correctly. When you continue on to FreeBSD, because the device is already probed and primed, the error does not occur. So the issue is that when FreeBSD probes the PCI video device, it causes a virtual hardware error, which the kernel reports back as an internal kernel error, likely because the device's response or non-response causes exactly that. So there is a difference in how the PCI video device is probed and initialized.

If we can get some direction on how we should or should not be probing the device (or if VMware is going to take ownership of this as a bug in UEFI or virtual hardware), I can probably whip up a patch.

Leadership

Hitting escape to enter the EFI shell causes the EFI stack to init the video device and output. Which it does correctly. When you continue on to FreeBSD, because the device is already probed and primed, the error does not occur.

Hitting escape will not make any difference to the way the firmware initializes the video device – it is unconditionally and fully initialized upon every boot, and unconditionally bound to the stack of console drivers (ConSplitter, GraphicsConsole, Terminal) before the VMware logo is displayed on screen.

Hitting escape will potentially alter the initialization of other non-boot PCI devices, but in my experience there is a much more likely (and more mundane) explanation for changing behaviors like this one: a memory access error.  Something – either in the FreeBSD bootloader or kernel or in our firmware, I can't yet tell – could be using uninitialized memory or accessing memory beyond a region's bounds, and the move to the EFI Boot Manager or the Shell just shuffles the pool allocations around enough to make the problem sometimes go away.

No idea whether that is the situation in this case, but it so far strikes me as being more likely than a video initialization problem based upon what I've seen so far.

So the issue is that when FreeBSD probes the PCI video device, it causes a virtual hardware error, which the kernel reports back as an internal kernel error.

I do not yet see any evidence of any virtual hardware error.

The message "firmware encountered an unexpected exception" just says that something triggered a CPU exception (page fault, general protection fault, divide by zero, invalid opcode, etc...) which the firmware was not expecting and was not able to handle.

The most interesting part is that the logs say that the guest OS has called ExitBootServices, after which point the guest OS is really responsible for setting up and managing the Interrupt Descriptor Table (IDT) in accordance with its own needs, but the error message we're seeing can only occur if the guest OS is still using the firmware's IDT, otherwise the firmware would not have any role in handling exceptions.  There will always be a brief window of time after ExitBootServices is called and before the guest OS has the opportunity to activate its own IDT, and the unexpected exception occurs within that window of time – or the FreeBSD bootloader/kernel is not installing its own exception handlers as early as it perhaps should.
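A pseudocode sketch of that window (my own illustration of the UEFI handoff sequence, not actual FreeBSD or VMware code):

```
// Firmware's IDT is active; the firmware handles all CPU exceptions.
GetMemoryMap(&map_size, map, &map_key, ...)
ExitBootServices(image_handle, map_key)   // boot services are now gone
// --- window: any CPU exception raised here still vectors through the
// --- firmware's IDT, producing "unexpected exception"
copy_kernel_to_final_location()
jump_to_kernel_entry()
kernel_early_init:
    lidt(kernel_idt_descriptor)           // OS now owns exception handling
```

Any fault between the ExitBootServices call and the kernel loading its own IDT lands in the firmware's handler, producing exactly this error message.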

--

Darius

Contributor

I got a bit more info by enabling a serial port (over network) on my VM:

EDIT1: Wrapped it in a spoiler tag to save space

EDIT2: My issue is actually on esxi, so there's no guarantee it's the same problem.

 

Spoiler
   ______               ____   _____ _____
  |  ____|             |  _ \ / ____|  __ \
  | |___ _ __ ___  ___ | |_) | (___ | |  | |
  |  ___| '__/ _ \/ _ \|  _ < \___ \| |  | |
  | |   | | |  __/  __/| |_) |____) | |__| |
  | |   | | |    |    ||     |      |      |
  |_|   |_|  \___|\___||____/|_____/|_____/
                                                 ```                        `
 +-----------Welcome to FreeBSD------------+    s` `.....---.......--.```   -/
 |                                         |    +o   .--`         /y:`      +.
 |  1. Boot Single user [Enter]            |     yo`:.            ‌‌">      `+-
 |  2. Boot Multi user                     |      y/               -/`   -o/
 |  3. Escape to loader prompt             |     .-                  ::/sy+:.
 |  4. Reboot                              |     /                     `--  /
 |  5. Cons: Dual (Video primary)          |    `:                          :`
 |                                         |    `:                          :`
 |  Options:                               |     /                          /
 |  6. Kernel: default/kernel (1 of 1)     |     .-                        -.
 |  7. Boot Options                        |      --                      -.
 |                                         |       `:`                  `:`
 |                                         |         .--             `--.
 +-----------------------------------------+            .---.....----.
   ______               ____   _____ _____
  |  ____|             |  _ \ / ____|  __ \
  | |___ _ __ ___  ___ | |_) | (___ | |  | |
  |  ___| '__/ _ \/ _ \|  _ < \___ \| |  | |
  | |   | | |  __/  __/| |_) |____) | |__| |
  | |   | | |    |    ||     |      |      |
  |_|   |_|  \___|\___||____/|_____/|_____/
                                                 ```                        `
 +-----------Welcome to FreeBSD------------+    s` `.....---.......--.```   -/
 |                                         |    +o   .--`         /y:`      +.
 |  1. Boot Single user [Enter]            |     yo`:.            ‌‌">      `+-
 |  2. Boot Multi user                     |      y/               -/`   -o/
 |  3. Escape to loader prompt             |     .-                  ::/sy+:.
 |  4. Reboot                              |     /                     `--  /
 |  5. Cons: Serial                        |    `:                          :`
 |                                         |    `:                          :`
 |  Options:                               |     /                          /
 |  6. Kernel: default/kernel (1 of 1)     |     .-                        -.
 |  7. Boot Options                        |      --                      -.
 |                                         |       `:`                  `:`
 |                                         |         .--             `--.
 +-----------------------------------------+            .---.....----.


Loading kernel...
/boot/kernel/kernel text=0x16bdcc4 data=0x140 data=0x75fe80 syms=[0x8+0x17e098+0x8+0x19bdd3]
Loading configured modules...
/boot/entropy size=0x1000
/boot/kernel/zfs.ko size 0x3bad38 at 0x247c000
loading required module 'opensolaris'
/boot/kernel/opensolaris.ko size 0xa448 at 0x2837000
can't find '/etc/hostid'
Start @ 0xffffffff80373000 ...
EFI framebuffer information:
addr, size     0xf0000000, 0x300000
dimensions     1024 x 768
stride         1024
masks          0x00ff0000, 0x0000ff00, 0x000000ff, 0xff000000
!!!! X64 Exception Type - 06(#UD - Invalid Opcode)  CPU Apic ID - 00000000 !!!!
RIP  - 0000000000000040, CS  - 0000000000000018, RFLAGS - 0000000000010006
RAX  - 000000000A550000, RCX - 0000000006600000, RDX - 000000000A550000
RBX  - 000000000414FFF8, RSP - 000000000414FFF8, RBP - 0000000000000000
RSI  - 0000000000000008, RDI - 0000000000000000
R8   - 000000000414C000, R9  - FFFFFFFF80373000, R10 - 000000000CECEA70
R11  - 0000000000000000, R12 - 000000000284E000, R13 - 0000000002843000
R14  - 000000000414C000, R15 - FFFFFFFF80373000
DS   - 0000000000000008, ES  - 0000000000000008, FS  - 0000000000000008
GS   - 0000000000000008, SS  - 0000000000000008
CR0  - 0000000080010033, CR2 - 0000000000000000, CR3 - 000000000FF78000
CR4  - 0000000000000668, CR8 - 0000000000000000
DR0  - 0000000000000000, DR1 - 0000000000000000, DR2 - 0000000000000000
DR3  - 0000000000000000, DR6 - 00000000FFFF0FF0, DR7 - 0000000000000400
GDTR - 00000000FFFFFEA0 000000000000002F, LDTR - 0000000000000000
IDTR - 000000000FEC43E0 0000000000000FFF,   TR - 0000000000000000
FXSAVE_STATE - 000000000414FC50
!!!! Can't find image information. !!!!


Contributor

You're welcome.
I've just found that, besides entering the EFI Shell as I mentioned above, purging the VM's NVRAM can also help.

Just open ESXi's datastore browser, navigate to the VM's folder, and delete the .nvram file. The OS will then boot successfully, but only once. After a reboot, whether a hot reboot or a cold boot, it will fail again and need another purge of its NVRAM (or the "EFI Shell" trick I mentioned above).

Also, this works in 12.2-RELEASE but not in 12.2-RC3.
12.2-RC3 is the kernel shipped with the TrueNAS 12 release. After deleting the .nvram file of a TrueNAS 12 VM, the VM reboots on the first boot attempt (rather than a firmware crash that shuts the VM down), and then crashes in the firmware as always.

The "EFI Shell" trick works for both 12.2-RELEASE and 12.2-RC3, however.

Contributor

I'm running macOS Big Sur 11.0.1 with VMware Fusion 12.1.0.

After upgrading to FreeBSD 12.2-RELEASE and rebooting with the new kernel it didn't boot.

Clearing out the NVRAM solved the issue.

Continued installation and it works OK now.

*edit* This solution might not be permanent in all situations. On another machine it only fixed the boot once.

Contributor

I tried the answers above. All of them worked; however, if I did not manually select the boot menu on each boot, it could not boot.
After some trial and error, I found that for automatic booting you can set the "boot delay" setting to 1000 ms. With the boot delay set to 1000 ms, booting completes normally even if nothing is done during boot.
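If that "boot delay" refers to VMware's power-on delay (my assumption), it can also be set directly in the VM's .vmx file; bios.bootDelay is a documented VMware option whose value is in milliseconds:

```
bios.bootDelay = "1000"
```

This delays the firmware's handoff at the splash screen by the given number of milliseconds on every power-on.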

Contributor

@hwano1010 by "boot delay" do you mean the thing that does the normal seconds countdown on the FreeBSD boot screen? I have at least 3s there and the error still triggers.

Since legacy BIOS boot works for new installs, is there a way to convert an existing install from EFI to BIOS mode? If I simply switch the mode in the VM's settings, ESXi reports "operating system not found".

Contributor

Also, for anyone new coming here, to save you some googling: a fix for the root cause is in progress upstream: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=251866

Contributor

I spent some time looking into this and tried patching /usr/src/stand with the modification mentioned in bug 251866:

diff --git a/stand/efi/loader/copy.c b/stand/efi/loader/copy.c
--- a/stand/efi/loader/copy.c	(revision ea57b2d7a80a256ca7c16b3df9906d9744ae44b6)
+++ b/stand/efi/loader/copy.c	(revision 4d6047edb675e52b8fad57135ab3ded8e66d0dac)
@@ -174,9 +174,7 @@
 #endif /* __i386__ || __amd64__ */
 
 #ifndef EFI_STAGING_SIZE
-#if defined(__amd64__)
-#define	EFI_STAGING_SIZE	100
-#elif defined(__arm__)
+#if defined(__arm__)
 #define	EFI_STAGING_SIZE	32
 #else
 #define	EFI_STAGING_SIZE	64

After "make && make install", my VM starts booting normally again without the "EFI Shell" trick or clearing the NVRAM.

From what I understand, the FreeBSD team raised EFI_STAGING_SIZE to 100 to make the loader compatible with bare-metal hardware carrying NVIDIA graphics cards, and this broke boot on VMware's firmware. They have reverted the value back to 64 but didn't backport the change to 12.2, which is the current release of FreeBSD, and the status of Bug 251866 is still "In Progress".

0 Kudos