Can't you lock the VT extensions so that VirtualBox can't start a VM with VT support?
No; once the extensions have been locked on or locked off by the BIOS, that setting cannot be changed without pulling all power from the CPU.
Why does WS crash?
That's a complicated question. WS may be more fragile than it has to be, but this is a difficult situation. VirtualBox, KVM, and other hypervisors that use VT typically leave the CPU in VMX root operation all of the time. VMX root operation is a crippled operating mode that does not allow the full gamut of normal CPU operations. In particular, you cannot enter real-address mode when in VMX operation.
Why does this matter? Well, when switching between 64-bit mode and legacy 32-bit mode, it is necessary to put the CPU briefly into real-address mode during the switch. If you are running a 64-bit guest on a 32-bit host, this switch is necessary to invoke the 64-bit VMM. Also, for historical reasons, VMware products deal with 64-bit guests using two "peer" VMMs: one in 32-bit legacy mode and one in 64-bit mode. Thus, whether you are running a 64-bit guest on a 32-bit host or on a 64-bit host, you need to switch briefly into real-address mode to invoke the peer that does not match the host in bitness. And, of course, when running a 32-bit guest on a 64-bit host, you need to switch briefly into real-address mode to invoke the legacy 32-bit mode VMM.
None of these switches are possible when the system has been put into VMX operation by a foreign hypervisor. Moreover, you cannot even suspend the VM unless you can invoke the VMM. Thus, the only options available are to crash the VM or to busy-wait until the foreign hypervisor goes away.
You may ask why Workstation doesn't simply leave VMX operation, do what it has to do, and then return to VMX operation when it relinquishes the CPU. Unfortunately, Intel's VT design does not make this entirely possible. While it may be possible in some cases, there is the potential of crashing the foreign hypervisor if the foreign hypervisor has active VMs other than its current VM. There is no way in the VT implementation to enumerate all active VMs so that they can be properly shut down when leaving VMX operation and then restarted when resuming VMX operation. Intel apparently did not envision the scenario where there would be multiple hypervisors active on the same CPU.
Thanks for this in-depth answer.
I have the same issue (but with Player 2.5.3). I recognize that this is a complex problem, but I don't think it's very nice that if you start VMware Player with VirtualBox already running (with VT extensions enabled, of course), the Player still crashes. It must be possible to check for VMX root operation during startup and then shut down gracefully before a suspension of the VM is even necessary.
If that would allow running VMware products alongside VirtualBox with VT-x enabled then I totally agree.
Is it an expensive operation to restore the CPU to normal operation?
It's probably in the hundreds of cycles, at most.
Is there a ticket for this issue on the public VirtualBox issue tracker?
Sorry; I don't know.
As jmattson described in depth, there is no way for two foreign hypervisors to work on the same host with VT. Those words have been written on many sites and even on the wikis of KVM and VirtualBox. Being a KVM user and having read what you said, I gave Workstation 7 a try to see if something had changed.
Although the first time a VMware VM starts up all KVM VMs crash, on the second try both KVM and VMware VMs run simultaneously. After that success I went to the KVM IRC channel to debug that crash, but they didn't believe me when I said I had two hypervisors with VT enabled running on the same host.
I searched the release notes of Workstation 7 for more information but didn't find anything there. So can you please give some more information about WHAT changed in version 7? Is there a clean solution for cooperating with other VT-enabled hypervisors? Or is it a hack that will never work as intended? It would be nice if you provided more information or pointed me at a better reference so that I can resolve this final issue with KVM crashing on the first run of VMware.
NOTE: To ensure that both were using VT, I set the "virtualization engine" to VT in the VMware VM's settings.
CPU: Core 2 Quad Q6600
Host OS: Ubuntu 9.10 Desktop 32bit
KVM VMs: Ubuntu 8.04.3 Server 32bit, Ubuntu 9.10 Server (4 instances)
VMware VM: Windows XP SP3 32bit (1 instance)
As jmattson described in depth, there is no way for two foreign hypervisors to work on the same host with VT. Those words have been written on many sites and even on the wikis of KVM and VirtualBox. Being a KVM user and having read what you said, I gave Workstation 7 a try to see if something had changed.
Yes, that is what jmattson (our hardware-assisted virtualization guru) thought at the time he wrote that.
But then something changed: in Workstation 7/Player 3/Fusion 3, jmattson found a hack which allows 2 foreign hypervisors to work on the same host with VT. It works really well in practice as you have been able to verify, but Intel warned us that sometimes it might not work in theory. I suspect this is the reason why jmattson does not want to talk about it.
Still, not working 1% of the time is better than not working 100% of the time, so jmattson implemented the hack in the code we ship.
Do you have a URL for the VirtualBox wiki where this is being discussed? I remember having seen it, but I cannot find it anymore. I have personally verified that Fusion 3 and VirtualBox 3.0.4 could use VT simultaneously on both the 32-bit and the 64-bit kernels of Mac OS 10.6, so I would like to update the wiki.
Unfortunately, I can't say much about our discussions with Intel on this subject. However, one of our assumptions was that we could not necessarily enlist the assistance of other hypervisor vendors to make this work. If the kvm developers are willing to assist us in making this happen, I think we can probably make it happen.
Everything would be fine if all hypervisor vendors could agree to leave VMX operation when yielding the CPU. VMware already does this if the CPU was not in VMX operation when we were scheduled. There may be some resistance to this idea for performance reasons. However, our current hack would be more robust if all hypervisor vendors could agree to clear the "launched" state of every active VMCS on the physical core before yielding the core to us. There may still be some resistance to this idea for performance reasons, but perhaps that's the starting point of a negotiation.
First of all thanks for the great job you do by informing the public.
HRPeg, I don't have URLs; I did this research 6 months ago when I was trying to make KVM and Workstation 6.5 work together. However, I am almost sure that both wikis had this written down somewhere.
So it is a hack, and a nasty one as Intel judges it, but it (almost) works! Although I definitely HATE hacks, this one overcomes a hardware limitation, giving you a feature that you would never otherwise have with current CPUs. Congrats!
Can you please explain the drawbacks it may have? Is it too expensive in context switching? Or any other info that may be useful when setting up a multi-hypervisor environment.
In my case I have KVM/libvirt installed, which autostarts 4 Linux server VMs. When I first boot up my computer (and KVM autostarts), the first time I start up a VMware virtual machine, all KVM VMs crash instantly with the log saying:
kvm: unhandled exit 6
kvm_run returned -22
After that crash, if I restart KVM, starting/stopping VMware VMs does not affect them, except that one of them may suffer a HUGE performance decrease, with the GNOME "System Monitor" applet showing a high graph on "IOWait". However, this may be a task-scheduling issue on 9.10, as one of you verified in another thread: Workstation 7 on Ubuntu 9.10 host -- VMs run dog slow.
Do you have any idea why this happens? Does VMware initialize VT in a more elegant way than KVM? I want to try to solve it with the KVM guys, but I don't have enough info.
Thanks very much
Thank you for providing those additional details. Unhandled exit 6 is likely to mean VM-instruction error number 6, which means "VMRESUME with a corrupted VMCS (indicates corruption of the current VMCS)." This is exactly what we would expect when our hack fails. This particular failure could be avoided if the kvm hypervisor would clear the "launched" state of every active VMCS on the physical core before yielding the core to us. There may be other solutions as well, but there is nothing that VMware can do unilaterally to address this problem. Any feasible solution requires cooperation among the many hypervisor vendors.