VMware Communities
continuum
Immortal

WS 6.5.1 crashes when you start a VirtualBox VM with VT support

WS 6.5.1 can run alongside VirtualBox nicely as long as you do NOT use a VirtualBox VM with VT support.

When you start a VirtualBox VM with VT support, all running Workstation VMs crash at once.

Why does WS crash? And can't you lock the VT extensions so that VirtualBox can't start a VM with VT support?


admin
Immortal

Can't you lock the VT extensions so that VirtualBox can't start a VM with VT support?

No; once the extensions have been locked on or locked off by the BIOS, that setting cannot be changed without pulling all power from the CPU.
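
For anyone who wants to see that lock for themselves, here is a minimal sketch (assuming Linux with the msr module loaded and root privileges; this is not any VMware tool) that reads IA32_FEATURE_CONTROL, where bit 0 is the BIOS lock bit and bit 2 enables VMX outside SMX:

/* Sketch: inspect the BIOS lock described above.
 * Assumes Linux with the "msr" module loaded (modprobe msr) and root.
 * IA32_FEATURE_CONTROL is MSR 0x3A; bit 0 is the lock bit and bit 2
 * enables VMX outside SMX. Once bit 0 is set, the other bits are frozen
 * until the next power cycle. */
#include <fcntl.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    uint64_t msr;
    int fd = open("/dev/cpu/0/msr", O_RDONLY);
    if (fd < 0) {
        perror("open /dev/cpu/0/msr (is the msr module loaded?)");
        return 1;
    }
    if (pread(fd, &msr, sizeof(msr), 0x3A) != sizeof(msr)) {
        perror("pread IA32_FEATURE_CONTROL");
        return 1;
    }
    close(fd);
    printf("IA32_FEATURE_CONTROL = 0x%016" PRIx64 "\n", msr);
    printf("  lock bit (bit 0):        %s\n", (msr & 1) ? "set" : "clear");
    printf("  VMX outside SMX (bit 2): %s\n", (msr & 4) ? "enabled" : "disabled");
    return 0;
}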

Why does WS crash?

That's a complicated question. WS may be more fragile than it has to be, but this is a difficult situation. VirtualBox, KVM, and other hypervisors that use VT typically leave the CPU in VMX root operation all of the time. VMX root operation is a crippled operating mode that does not allow the full gamut of normal CPU operations. In particular, you cannot enter real-address mode when in VMX operation.
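
To make the real-address-mode restriction concrete, here is a minimal sketch (same assumptions as above: Linux, root, the msr module) that reads IA32_VMX_CR0_FIXED0, which reports the CR0 bits the hardware forces to 1 while in VMX operation. PE and PG are among them, and real-address mode requires PE=0:

/* Sketch: why real-address mode is off limits in VMX operation.
 * IA32_VMX_CR0_FIXED0 (MSR 0x486) reports the CR0 bits that must be 1
 * while the CPU is in VMX operation (VMX root operation in particular);
 * PE (bit 0) and PG (bit 31) are among them, so clearing PE to drop into
 * real-address mode is not allowed until after VMXOFF. */
#include <fcntl.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    uint64_t fixed0;
    int fd = open("/dev/cpu/0/msr", O_RDONLY);
    if (fd < 0 || pread(fd, &fixed0, sizeof(fixed0), 0x486) != sizeof(fixed0)) {
        perror("reading IA32_VMX_CR0_FIXED0 (CPU must support VMX)");
        return 1;
    }
    close(fd);
    printf("IA32_VMX_CR0_FIXED0 = 0x%016" PRIx64 "\n", fixed0);
    printf("  CR0.PE forced to 1 in VMX operation: %s\n",
           (fixed0 & (1ULL << 0)) ? "yes" : "no");
    printf("  CR0.PG forced to 1 in VMX operation: %s\n",
           (fixed0 & (1ULL << 31)) ? "yes" : "no");
    return 0;
}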

Why does this matter? Well, when switching between 64-bit mode and legacy 32-bit mode, it is necessary to put the CPU briefly into real-address mode during the switch. If you are running a 64-bit guest on a 32-bit host, this switch is necessary to invoke the 64-bit VMM. Also, for historical reasons, VMware products deal with 64-bit guests using two "peer" VMMs: one in 32-bit legacy mode and one in 64-bit mode. Thus, whether you are running a 64-bit guest on a 32-bit host or on a 64-bit host, you need to switch briefly into real-address mode to invoke the peer that does not match the host in bitness. And, of course, running a 32-bit guest on a 64-bit host, you need to switch briefly into real-address mode to invoke the legacy 32-bit mode VMM.

None of these switches are possible when the system has been put into VMX operation by a foreign hypervisor. Moreover, you cannot even suspend the VM unless you can invoke the VMM. Thus, the only options available are to crash the VM or to busy-wait until the foreign hypervisor goes away.

You may ask why Workstation doesn't simply leave VMX operation, do what it has to do, and then return to VMX operation when it relinquishes the CPU. Unfortunately, Intel's VT design does not make this entirely possible. While it may be possible in some cases, there is the potential of crashing the foreign hypervisor if the foreign hypervisor has active VMs other than its current VM. There is no way in the VT implementation to enumerate all active VMs so that they can be properly shut down when leaving VMX operation and then restarted when resuming VMX operation. Intel apparently did not envision the scenario where there would be multiple hypervisors active on the same CPU.

oskarh
Contributor

Thanks for this in-depth answer.

I have the same issue (but with Player 2.5.3). I recognize that this is a complex problem, but I don't think it's very nice that if you start VMware Player with VirtualBox already running (with VT extensions enabled, of course), the Player still crashes. It must be possible to check for VMX root operation during startup and then shut down gracefully before a suspension of the VM is even necessary.

Regards,

Oskar

admin
Immortal

Thanks for your comments. We are working on improving the user experience.

Still, it would be best for everyone if VirtualBox restored the CPU to normal operation when descheduled.

oskarh
Contributor

If that would allow running VMware products alongside VirtualBox with VT-x enabled then I totally agree. Is it an expensive operation to restore the CPU to normal operation?

Is there a ticket for this issue on the public VirtualBox issue tracker?

admin
Immortal

If that would allow running VMware products alongside VirtualBox with VT-x enabled then I totally agree.

It would.

Is it an expensive operation to restore the CPU to normal operation?

It's probably in the hundreds of cycles, at most.

Is there a ticket for this issue on the public VirtualBox issue tracker?

Sorry; I don't know.

HPReg
VMware Employee

continuum and oskarh, thanks for reporting the issue. We believe we have fixed it in VMware Workstation 7, which we released yesterday. Do you mind giving it a try?

oskarh
Contributor

I just tried it and now it works! (using Player 3.0.0, but I guess there's no difference from Workstation)

Thanks for fixing this and for letting us know about the update.

sque
Contributor

Hi HPReg,

As jmattson described in depth, there is no way for two foreign hypervisors to work on the same host with VT. That has been written on many sites, and even on the KVM and VirtualBox wikis. Being a KVM user and having read what was said here, I gave Workstation 7 a try to see if something had changed.

Although the first time a VMware VM starts up all KVM VMs crash, on the second try both KVM and VMware VMs run simultaneously. After that success I went to the KVM IRC channel to debug the crash problem, but they didn't believe me that I had two hypervisors using VT running on the same host.

I searched the release notes of Workstation 7 for more information but didn't find anything there. So can you please give some more information on WHAT changed in version 7? Is there a clean solution for cooperating with other VT-enabled hypervisors, or is it a hack that will never work as intended? It would be nice if you could provide more information, or point to a better reference, so that I can resolve this final issue of KVM crashing on the first run of VMware.

NOTE: To ensure that both were using VT, I set the "virtualization engine" to VT in the VMware VM's settings.

CPU: Core 2 Quad Q6600

Host OS: Ubuntu 9.10 Desktop 32bit

KVM VMs: Ubuntu 8.04.3 Server 32bit, Ubuntu 9.10 Server (4 instances)

VMware VM: Windows XP SP3 32bit (1 instance)

Thanks

admin
Immortal

It's a hack--Intel has assured us that it cannot possibly work.

HPReg
VMware Employee

As jmattson described in depth, there is no way for two foreign hypervisors to work on the same host with VT. That has been written on many sites, and even on the KVM and VirtualBox wikis. Being a KVM user and having read what was said here, I gave Workstation 7 a try to see if something had changed.

Yes, that is what jmattson (our hardware-assisted virtualization guru) thought at the time he wrote that.

But then something changed: in Workstation 7/Player 3/Fusion 3, jmattson found a hack which allows 2 foreign hypervisors to work on the same host with VT. It works really well in practice as you have been able to verify, but Intel warned us that sometimes it might not work in theory. I suspect this is the reason why jmattson does not want to talk about it.

Still, not working 1% of the time is better than not working 100% of the time, so jmattson implemented the hack in the code we ship.

Do you have a URL for the VirtualBox wiki where this is being discussed? I remember having seen it, but I cannot find it anymore. I have personally verified that Fusion 3 and VirtualBox 3.0.4 could use VT simultaneously on both the 32-bit and the 64-bit kernels of Mac OS 10.6, so I would like to update the wiki.

admin
Immortal

Unfortunately, I can't say much about our discussions with Intel on this subject. However, one of our assumptions was that we could not necessarily enlist the assistance of other hypervisor vendors to make this work. If the kvm developers are willing to assist us in making this happen, I think we can probably make it happen.

Everything would be fine if all hypervisor vendors could agree to leave VMX operation when yielding the CPU. VMware already does this if the CPU was not in VMX operation when we were scheduled. There may be some resistance to this idea for performance reasons. However, our current hack would be more robust if all hypervisor vendors could agree to clear the "launched" state of every active VMCS on the physical core before yielding the core to us. There may still be some resistance to this idea for performance reasons, but perhaps that's the starting point of a negotiation.
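
To sketch what that cooperation could look like: the code below is purely illustrative (the per-core bookkeeping structure and the on_yield_core hook are made up, not KVM's or VirtualBox's actual internals), but it shows the VMCLEAR-before-yielding idea in ring-0 hypervisor terms.

/* Sketch of the proposed cooperation: before a hypervisor yields a
 * physical core, VMCLEAR every VMCS it has made active on that core so
 * the hardware flushes its cached state and clears the "launched" flag.
 * This only makes sense in ring-0 hypervisor code; the list and the
 * hook name are hypothetical. */
#include <stdint.h>

#define MAX_ACTIVE_VMCS 64

struct core_vmcs_state {
    uint64_t active_vmcs_pa[MAX_ACTIVE_VMCS]; /* VMCS physical addresses */
    int      count;
};

static inline void vmclear(uint64_t vmcs_pa)
{
    /* VMCLEAR takes a 64-bit memory operand holding the VMCS physical
     * address; it writes any cached VMCS data back to memory and marks
     * the VMCS "clear", so the next entry must use VMLAUNCH. */
    __asm__ __volatile__("vmclear %0" : : "m"(vmcs_pa) : "cc", "memory");
}

/* Hypothetical hook: called right before this hypervisor gives up the core. */
void on_yield_core(struct core_vmcs_state *core)
{
    for (int i = 0; i < core->count; i++)
        vmclear(core->active_vmcs_pa[i]);
    core->count = 0; /* nothing is active on this core any more */
}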

sque
Contributor

First of all, thanks for the great job you do keeping the public informed. :)

HPReg, I don't have URLs; I did this research 6 months ago when I was trying to make KVM and Workstation 6.5 work together. However, I am almost sure that both wikis had this written down somewhere.

So it is a hack, and a nasty one as Intel judges, but it (almost) works! Although I definitely HATE hacks, this one overcomes a hardware limitation and gives you a feature you would never otherwise have with current CPUs. Congrats!

Can you please explain any drawbacks it may have? Is it too expensive in context switching? Any other info that would be useful when setting up a multi-hypervisor environment would also help.

In my case I have KVM/libvirt installed, and it autostarts 4 Linux server VMs. After I first boot up my computer (and KVM autostarts), the first time I start a VMware virtual machine all KVM VMs crash instantly with the log saying

kvm: unhandled exit 6

kvm_run returned -22

After that crash, if I restart the KVM VMs, starting/stopping VMware VMs does not affect them, except that one of them may suffer a HUGE performance decrease, with the GNOME "System Monitor" applet showing a high "IOWait" graph. However, this may be a task-scheduling issue on 9.10, as one of you verified on another thread.

Do you have any idea why this happens? Does VMware initialize VT in a more elegant way than KVM? I want to try to solve it with the KVM guys, but I don't have enough info.

Thanks very much :)

sque
Contributor

You posted that while I was writing my previous post. You answered much of what I asked before even reading my questions! Do you have anything else to add now that I have described my case in detail?

admin
Immortal

Thank you for providing those additional details. Unhandled exit 6 is likely to mean VM-instruction error number 6, which means "VMRESUME with a corrupted VMCS (indicates corruption of the current VMCS)." This is exactly what we would expect when our hack fails. This particular failure could be avoided if the kvm hypervisor would clear the "launched" state of every active VMCS on the physical core before yielding the core to us. There may be other solutions as well, but there is nothing that VMware can do unilaterally to address this problem. Any feasible solution requires cooperation among the many hypervisor vendors.

admin
Immortal

kvm: unhandled exit 6

kvm_run returned -22

kvm may be able to recover from this error by doing a VMCLEAR of the current VMCS, followed by a VMPTRLD of that VMCS, followed by a VMLAUNCH. This should force the hardware to discard any cached VMCS information and re-synch from the in-memory VMCS. The VMLAUNCH cannot fail due to "corrupted VMCS."

If everything else has gone smoothly, the in-memory VMCS will actually have correct and up-to-date information when this kind of failure is observed due to an interaction with VMware products, so this sequence should work around the problem.
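
A rough sketch of that recovery path (with illustrative structure and field names, not kvm's actual ones) might look like the following: on the failure, VMCLEAR the current VMCS to discard the stale cached copy, VMPTRLD it again, and mark it not-launched so the next guest entry uses VMLAUNCH instead of VMRESUME.

/* Sketch of the recovery sequence described above, in the style of
 * ring-0 hypervisor code. Names are illustrative only. */
#include <stdint.h>

struct vmcs_slot {
    uint64_t pa;       /* physical address of the in-memory VMCS */
    int      launched; /* 0 => next guest entry must be VMLAUNCH */
};

static inline int vmclear(uint64_t pa)
{
    uint8_t err;
    /* CF=1 or ZF=1 after VMCLEAR indicates VMfail. */
    __asm__ __volatile__("vmclear %1; setna %0"
                         : "=q"(err) : "m"(pa) : "cc", "memory");
    return err;
}

static inline int vmptrld(uint64_t pa)
{
    uint8_t err;
    __asm__ __volatile__("vmptrld %1; setna %0"
                         : "=q"(err) : "m"(pa) : "cc", "memory");
    return err;
}

/* Attempt to recover instead of treating "unhandled exit 6" as fatal. */
int recover_from_corrupted_vmcs(struct vmcs_slot *vmcs)
{
    if (vmclear(vmcs->pa))  /* discard any cached (corrupted) VMCS state */
        return -1;
    if (vmptrld(vmcs->pa))  /* reload from the in-memory copy */
        return -1;
    vmcs->launched = 0;     /* force VMLAUNCH on the next guest entry */
    return 0;
}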

Since this new code sequence would only be in response to a condition which currently causes a fatal exit, there is no performance penalty involved for kvm. However, it does rely on unarchitected behavior and stacks the house of cards even higher.

The one drawback is that this attempt at recovery could result in data corruption or other problems if the in-memory VMCS truly is corrupted.
