VMware Cloud Community
LesykM
Contributor

ESXi inside KVM

Hello!

I wonder whether it is possible to run ESXi 5.0/5.1 inside KVM (Proxmox)?

I need to run a few instances for testing purposes, and I cannot set up any other virtualization software.

I only need ESXi itself, even without virtual machines inside it :)

Thanks!

admin
Immortal

Is there anything in /var/log/messages on the host regarding unimplemented MSRs?

jcp0wermac
Enthusiast

Nope, unfortunately I didn't see anything in the logs.

admin
Immortal

Try this:

# echo 1 > /sys/modules/kvm/parameters/ignore_msrs

admin
Immortal

Or possibly this:

# modprobe kvm ignore_msrs=1

jcp0wermac
Enthusiast

Using "module" without the "s" worked:

echo 1 > /sys/module/kvm/parameters/ignore_msrs

And now I get a different PSOD; see attached.
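To confirm that the setting actually took effect, the parameter file can be read back. This is only a minimal sketch: the helper name `ignore_msrs_enabled` is made up for illustration, and it assumes the usual sysfs convention that boolean module parameters read back as "Y" or "N".

```python
from pathlib import Path

def ignore_msrs_enabled(param_path="/sys/module/kvm/parameters/ignore_msrs"):
    """Return True/False according to the parameter file, or None if it is missing.

    The helper name is hypothetical; the default path is the one that
    worked in this thread ("module", not "modules").
    """
    path = Path(param_path)
    if not path.exists():
        # kvm module not loaded, or not a Linux host
        return None
    # Assumption: boolean module parameters read back as "Y"/"N";
    # "1" is accepted as well to be lenient.
    return path.read_text().strip() in ("Y", "1")

print(ignore_msrs_enabled())
```

A result of None would suggest that the kvm module is not loaded on the host at all.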

admin
Immortal

I suspect that the PSOD is due to the fact that the CPU frequency has been measured at 1.44 petahertz.  It looks like ESX is using the ACPI PM timer to get the CPU frequency estimate.  Can you try invoking qemu-kvm with the '-no-acpi' option?

jcp0wermac
Enthusiast

With '-no-acpi', it didn't even make it to "Initializing chipset...".

admin
Immortal

Going back one step, I see the following in the serial log:  "Running in a VM, so getting results from backdoor."  ESX is not getting the TSC frequency based on the ACPI PM timer.  It's getting it from kvm/qemu via a hypercall. This is the issue I referred to earlier in this thread.  QEMU pretends to implement the VMware backdoor, but it doesn't actually implement enough of it for ESX to work in a VM.  See the instructions in update 11 for recompiling qemu so that it won't pretend to implement the VMware backdoor.

zamf
Contributor

Hello,

Thank you for the answers in this thread; they were helpful in running ESXi 5.5 in QEMU with --enable-kvm.

However, I am mainly interested in running ESXi 5.5 in QEMU in DBT (dynamic binary translation) mode, i.e., without --enable-kvm. The main problem is that ESXi 5.5 requires at least 2 cores, so I cannot get past the part of the install where ESXi checks the number of cores. Our QEMU-based tool currently works with a single core, so it would be great if we could force ESXi to work with a single core. Is there any way to do that? Perhaps some undocumented boot option?

I also tried with 2 cores in DBT mode, and it seems that the way ESXi checks that the two cores have the same frequency does not work in DBT mode. ESXi appears to use the TSC to measure clock frequency and expects synchronized values, which is not the case in DBT mode. Do you know whether this is just a sanity check, or whether ESXi actually relies on the two cores having the same frequency? This would help me see how to fix it in QEMU.

Here are the relevant snippets from the serial log (the full log is attached):

0:00:00:01.190 cpu0:1)Initializing timing ...

0:00:00:01.191 cpu0:1)HPET: 206: 64-bit, 100000000 Hz HPET at 0xfed00

0:00:00:01.191 cpu0:1)HPET: 208: HPET capabilities 0x9896808086a201, configuration 0x0

0:00:00:01.193 cpu0:1)HPET: 98: 1000 calls to HPET_Read32() took 2077076 TSC cycles

0:00:00:01.194 cpu0:1)HPET: 99: 1000 calls to HPET_Read() took 3588860 TSC cycles

0:00:00:01.235 cpu0:1)HPET: 118: Counter start 291627, end 4307230, diffUS 40156

0:00:00:03.236 cpu0:1)Timer: 1488: cpu 0: measured cpu speed (using TSC): 2900098312 Hz, bus speed: 999923991 Hz

0:00:00:03.240 cpu0:1)Initializing scheduler ...

...

0:00:00:23.037 cpu0:32768)CpuSched: 583: user latency of 32782 CmdCompl-0 0 changed by 32768 bootstrap -1

0:00:00:23.040 cpu0:32768)CpuSched: 583: user latency of 32783 CmdCompl-1 0 changed by 32768 bootstrap -1

0:00:00:23.055 cpu0:32768)ScsiEvents: 501: Event Subsystem: APD_Event_Subsystem, Created!

0:00:00:23.056 cpu0:32768)ScsiEvents: 301: EventSubsystem: APD_Event_Subsystem, Event Mask: 0, Parameter: 0x0, Registered!

0:00:00:23.056 cpu0:32768)ScsiEvents: 501: Event Subsystem: ScsiDHE_Subsystem, Created!

0:00:00:23.057 cpu0:32768)ScsiEvents: 301: EventSubsystem: ScsiDHE_Subsystem, Event Mask: 0, Parameter: 0x0, Registered!

0:00:00:23.419 cpu1:32769)SMP: 425: cpu 1: measured cpu speed (using TSC): 2900178232 Hz, bus speed: 938671392 Hz

0:00:00:26.029 cpu0:32768)ALERT: Timer: 4094: APIC timer speed measurement inconsistency: late cpu 1 938671392 vs. early cpu 0 999923991)

0:00:00:32.510 cpu0:32768)Timer: 4139: reference timer is TSC at 2900098312 Hz

0:00:00:32.512 cpu0:32768)World: 8773: PRDA 0x418040000000 ss 0x0 ds 0x4018 es 0x4018 fs 0x4018 gs 0x4018

0:00:00:32.513 cpu0:32768)World: 8775: TR 0x4020 GDT 0x412380021000 (0x402f) IDT 0x418008ef3000 (0xfff)

0:00:00:32.514 cpu0:32768)World: 8776: CR0 0x8001003d CR3 0x80016000 CR4 0x12c

0:00:00:32.516 cpu0:32768)Backtrace for current CPU #0, worldID=32768, ebp=0x417fc8e08c50

0:00:00:32.518 cpu0:32768)0x417fc8e08c50:[0x418008e8ccd9]<no symbols>+0x8e8ccd9 stack: 0x8, 0x417fc8e08cc0, 0x417fc8e08c80, 0

0:00:00:32.519 cpu0:32768)0x417fc8e08cb0:[0x418008e8cf1d]<no symbols>+0x8e8cf1d stack: 0x417fc8e08cf0, 0x41800bad00c0, 0x4123

0:00:00:32.519 cpu0:32768)0x417fc8e08cf0:[0x418008e67508]<no symbols>+0x8e67508 stack: 0x418008e0bd3e, 0x410841e04060, 0x4100

0:00:00:32.520 cpu0:32768)0x417fc8e08df0:[0x418008e00649]<no symbols>+0x8e00649 stack: 0x10000000001, 0x1400000014, 0x1000000

0:00:00:32.521 cpu0:32768)0x417fc8e08fe0:[0x418008e01169]<no symbols>+0x8e01169 stack: 0x418009053148, 0x0, 0x0, 0x0, 0x0

0:00:00:32.523 cpu0:32768)VMware ESXi 5.5.0 [Releasebuild-1331820 x86_64]

VMKernel initialization error: Multiprocessor initialization failed: Timer initialization failed.

0:00:00:32.525 cpu0:32768)cr0=0x8001003d cr2=0x0 cr3=0x1000e0000 cr4=0x12c

0:00:00:32.526 cpu0:32768)pcpu:0 world:32768 name:"bootstrap" (S)

0:00:00:32.526 cpu0:32768)pcpu:1 world:32769 name:"idle1" (IS)

0:00:00:32.526 cpu0:32768)@BlueScreen: VMKernel initialization error: Multiprocessor initialization failed: Timer initialization failed.

0:00:00:32.527 cpu0:32768)Code start: 0x418008e00000 VMK uptime: 0:00:00:32.527

0:00:00:32.528 cpu0:32768)0x417fc8e08c50:[0x418008e8ccd9]<no symbols>+0x8e8ccd9 stack: 0x8

0:00:00:32.529 cpu0:32768)0x417fc8e08cb0:[0x418008e8cf1d]<no symbols>+0x8e8cf1d stack: 0x417fc8e08cf0

0:00:00:32.529 cpu0:32768)0x417fc8e08cf0:[0x418008e67508]<no symbols>+0x8e67508 stack: 0x418008e0bd3e

0:00:00:32.530 cpu0:32768)0x417fc8e08df0:[0x418008e00649]<no symbols>+0x8e00649 stack: 0x10000000001

0:00:00:32.531 cpu0:32768)0x417fc8e08fe0:[0x418008e01169]<no symbols>+0x8e01169 stack: 0x418009053148

0:00:00:32.538 cpu0:32768)base fs=0x0 gs=0x418040000000 Kgs=0x0

0:00:00:32.509 cpu0:32768)Timer: 4094: APIC timer speed measurement inconsistency: late cpu 1 938671392 vs. early cpu 0 999923991)

0:00:00:32.509 cpu0:32768)Timer: 4094: APIC timer speed measurement inconsistency: late cpu 1 938671392 vs. early cpu 0 999923991)

0:00:00:33.764 cpu0:32768)vmkernel             0x0 .data 0x0 .bss 0x0

0:00:00:34.206 cpu0:32768)No place on disk to dump data.

0:00:00:34.207 cpu0:32768)No file configured to dump data.

Debugger waiting(world 32768) Debugger waiting(world 32768) $S05#b8Halting PCPU 1.

Thank you!

Cristi

zamf
Contributor

Clarification: I looked closer at the serial output and the problem is the measured bus speed, not the frequency:

999923991 vs. 938671392 => a 61252599 difference.

When running with KVM, the values are:

0:00:00:02.410 cpu0:1)Timer: 1488: cpu 0: measured cpu speed (using TSC): 2900075151 Hz, bus speed: 1000000942 Hz

0:00:00:04.461 cpu1:32769)SMP: 425: cpu 1: measured cpu speed (using TSC): 2900083440 Hz, bus speed: 999997830 Hz

so the difference is just 3112, which does not raise any flags in ESXi.
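The arithmetic above can be reproduced straight from the serial log lines. The following is a rough sketch; the function names are invented for illustration, and it says nothing about which threshold ESXi itself applies, since that is not known from the logs.

```python
import re

# Matches the "measured cpu speed" lines quoted in this thread.
BUS_SPEED = re.compile(
    r"cpu (\d+): measured cpu speed \(using TSC\): (\d+) Hz, bus speed: (\d+) Hz"
)

def bus_speeds(log_text):
    """Map each CPU number to the bus speed (in Hz) reported in the log."""
    return {int(m.group(1)): int(m.group(3)) for m in BUS_SPEED.finditer(log_text)}

def bus_speed_spread(log_text):
    """Difference between the largest and smallest reported bus speed."""
    speeds = bus_speeds(log_text).values()
    return max(speeds) - min(speeds)

# Lines copied from the DBT run and the KVM run quoted in the thread.
DBT_LOG = """\
0:00:00:03.236 cpu0:1)Timer: 1488: cpu 0: measured cpu speed (using TSC): 2900098312 Hz, bus speed: 999923991 Hz
0:00:00:23.419 cpu1:32769)SMP: 425: cpu 1: measured cpu speed (using TSC): 2900178232 Hz, bus speed: 938671392 Hz
"""
KVM_LOG = """\
0:00:00:02.410 cpu0:1)Timer: 1488: cpu 0: measured cpu speed (using TSC): 2900075151 Hz, bus speed: 1000000942 Hz
0:00:00:04.461 cpu1:32769)SMP: 425: cpu 1: measured cpu speed (using TSC): 2900083440 Hz, bus speed: 999997830 Hz
"""

print(bus_speed_spread(DBT_LOG))  # 61252599
print(bus_speed_spread(KVM_LOG))  # 3112
```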

Would it be possible to know how ESXi measures the bus speed?

Thank you.

zamf
Contributor

Update:

It turns out there is an option to ignore the bus speed difference: one can pass busSpeedMayVary=TRUE in the boot options. I installed ESXi with the QEMU options --enable-kvm -smp 2, then booted the installed image without KVM using the options "-cpu SandyBridge -smp 2", passing busSpeedMayVary=TRUE to ESXi at boot time.

This solved the problem.
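The two invocations can be sketched as argument lists. This is a hedged illustration only: the binary name qemu-system-x86_64, the -hda disk option, and the image name esxi.img are assumptions not taken from the post, and busSpeedMayVary=TRUE is an ESXi boot option, so it does not appear on the QEMU command line.

```python
import shlex

def qemu_args(image, use_kvm, smp=2, cpu=None):
    """Build a QEMU argument list for the install (KVM) or DBT boot.

    Assumptions: the binary name and the use of -hda for the disk are
    illustrative; only --enable-kvm, -cpu and -smp come from the post.
    """
    args = ["qemu-system-x86_64"]
    if use_kvm:
        args.append("--enable-kvm")
    if cpu:
        args += ["-cpu", cpu]
    args += ["-smp", str(smp), "-hda", image]
    return args

# Step 1: install under KVM with two vCPUs ("esxi.img" is a hypothetical image name).
print(shlex.join(qemu_args("esxi.img", use_kvm=True)))

# Step 2: boot the installed image without KVM; busSpeedMayVary=TRUE is then
# given to ESXi itself at boot time, not to QEMU.
print(shlex.join(qemu_args("esxi.img", use_kvm=False, cpu="SandyBridge")))
```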

I am still searching for a way to boot ESXi 5.5 inside a QEMU VM with only 1 CPU. Here is the message I get when I try to run with a single CPU:

Screenshot 2013-11-14 14.46.10.png

Thank you!

admin
Immortal

This message is from the installer.  What if you install under KVM with two vCPUs and then try to boot the already-installed image under QEMU with a single vCPU?

zamf
Contributor

Thanks a lot, this worked. Only the installer checks the number of CPUs.

zamf
Contributor

ESXi boots just fine in both KVM and dynamic binary translation mode with versions of QEMU up to 1.0.

However, we have a modified version of QEMU, and with this version, we get this ESXi crash.

ESXi_crash.png

The serial log (attached) has various warnings about failed module requests, such as "Failed to request module '/usr/lib/vmware/vmkmod/iodm': Failure".

Do you have any idea if this is the likely cause of the crash? Do you see any other suspicious behavior in the log?

Thank you,

Cristi

admin
Immortal

Would you mind uploading the log file as an attachment rather than embedded in your message?  (Use the advanced editor.)

admin
Immortal

I suspect it means that networking did not come up.  What kind of virtual NIC are you using?

zamf
Contributor

It is Intel e1000, which works with the vanilla QEMU.

Which message in the serial log triggered your suspicion? The serial log for a successful run with vanilla QEMU does not have messages like "Failed to request module '/usr/lib/vmware/vmkmod/iodm'", so I suspected a problem with the disk.

admin
Immortal

I'm looking at "Net: 1999: invalid Net_Create: class etherswitch not supported."  Since the backtrace shows a page fault in NetVsi_TcpipMultiInstanceList, I'm guessing that the error is networking related.

zamf
Contributor

I did a diff of the serial log for a successful run and a failed run. Besides the fact that the failed run does not detect the HPET, this is the first divergence:

Correct run:

cpu0:33234)Loading module iodm ...

cpu0:33234)Elf: 1861: module iodm has license VMware

cpu0:33234)Mod: 4780: Initialization of iodm succeeded with module ID 4.

cpu0:33234)iodm loaded successfully.

Failed run:

cpu0:33232)User: 2886: wantCoreDump : vmkeventd -enabled : 0

cpu0:33154)WARNING: Mod: 6592: Failed to request module '/usr/lib/vmware/vmkmod/iodm': Failure

cpu0:33154)Config: 346: "UserMemASRandomSeed" = 1965899852, Old Value: 0, (Status: 0x0)

This error occurs just after the vmkapi_mgmt module loads successfully in both runs. Strangely, the iodm module eventually gets loaded in the failed run too, after all sorts of other failures to request other modules.

cpu0:33283)Loading module vmkplexer ...

cpu0:33283)Elf: 1861: module vmkplexer has license GPLv2

cpu0:33283)vmkplexer-heap: initial size : 262144, max size: 20971520

cpu0:33283)vmkplexer-heap: heap creation succeeded. id = 0x41090eebe000

cpu0:33283)vmkplexer registration succeeded!

cpu0:33283)Mod: 4780: Initialization of vmkplexer succeeded with module ID 4101.

cpu0:33283)vmkplexer loaded successfully.

cpu0:33283)Loading module iodm ...

cpu0:33283)Elf: 1861: module iodm has license VMware

cpu0:33283)Mod: 4780: Initialization of iodm succeeded with module ID 6.

cpu0:33283)iodm loaded successfully.

However, many modules are not loaded, including networking related ones. The error you pointed out probably comes from the fact that this module could not be loaded:

Failed to request module '/usr/lib/vmware/vmkmod/etherswitch'

I think the root cause is related to the failed module requests from /usr/lib
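For anyone repeating this comparison, the first divergence between two serial logs can be located with a small script. This is a sketch under assumptions: the function names are made up, the first line of each sample is a placeholder rather than real ESXi output, and the prefix pattern only covers the timestamp and "cpuN:world)" formats seen in the excerpts above.

```python
import re
from itertools import zip_longest

# Optional uptime stamp followed by the "cpuN:WORLD)" prefix, as in the excerpts.
PREFIX = re.compile(r"^(?:\d+:\d{2}:\d{2}:\d{2}\.\d{3} )?cpu\d+:\d+\)")

def normalize(line):
    """Drop the timestamp and cpu/world prefix so only the message is compared."""
    return PREFIX.sub("", line.strip())

def first_divergence(good_lines, bad_lines):
    """Return (index, good_msg, bad_msg) for the first differing message, or None."""
    good = [normalize(l) for l in good_lines if l.strip()]
    bad = [normalize(l) for l in bad_lines if l.strip()]
    for i, (g, b) in enumerate(zip_longest(good, bad)):
        if g != b:
            return i, g, b
    return None

good_run = [
    "cpu0:33100)placeholder: shared boot message",  # placeholder, not real output
    "cpu0:33234)Loading module iodm ...",
    "cpu0:33234)Elf: 1861: module iodm has license VMware",
]
bad_run = [
    "cpu0:33101)placeholder: shared boot message",  # placeholder, not real output
    "cpu0:33232)User: 2886: wantCoreDump : vmkeventd -enabled : 0",
    "cpu0:33154)WARNING: Mod: 6592: Failed to request module '/usr/lib/vmware/vmkmod/iodm': Failure",
]

print(first_divergence(good_run, bad_run))
```

Stripping the prefix matters because world IDs and timestamps differ between runs even when the messages are identical.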

My question is: where are these modules located? Are they on a RAM disk or on the disk where I installed ESXi?

Can I enable additional logging related to these failed module requests?

Thank you,

Cristi

admin
Immortal

It would appear that vmkeventd is dying, and the messages about modules failing to load are the result of not being able to communicate with vmkeventd.  I don't really see any way of enabling additional logging, aside from booting a beta build of ESXi.
