Hi,
I'm on Mandriva Linux 2007, running VMware Workstation 5 (latest release). My VMware installation is running WinXP Pro Service Pack 2, and when I use Explorer (or Firefox, which I just tried), VMware hangs my whole machine for up to 5 minutes! I can still move my mouse (the pointer may or may not show), and my Linux panels (which are normally hidden) will show if I pass over them, but I can't click or switch to any other application.
When VMware finally releases the machine, everything works normally. It can even happen again in the same Internet session.
I mention Explorer and Firefox because it can happen at browser startup and during the Internet session. I have even seen it when shutting the session down.
This is annoying to the point of making VMware completely useless for Internet browsing.
Christian Tardif
\[quote=KevinG]Hi dugan,
It would be great if you could try the "hpet=disable" kernel option.
Also provide the dmesg, /proc/interrupts information without the hpet kernel option when it freezes.
Thanks
-Kevin[/quote]
Yep. After I posted that message, I performed the tests suggested.
First Round, I passed the hpet=disable option to the kernel that previously demonstrated the vmware "freeze" issue as reported in this thread and others.
With "hpet=disable" passed, I was not able to duplicate the problem in the old kernel that previously demonstrated this "freeze" issue.
(No gathering of dmesg or /proc/interrupts during time of problem, because I wasn't able to duplicate this with this configuration.)
I removed "hpet=disable" and rebooted with the problematic kernel/vmware combination.
I ran a dmesg before starting, and then again after the freeze, and here are the differences:
\[17179752.816000] /dev/vmnet: open called by PID 5725 (vmware-vmx)
\[17179752.816000] /dev/vmnet: port on hub 8 successfully opened
\[17179752.884000] /dev/vmmon\[5732]: host clock rate change request 0 -> 19
\[17179756.820000] /dev/vmmon\[5732]: host clock rate change request 19 -> 83
\[17179773.240000] /dev/vmnet: open called by PID 5732 (vmware-vmx)
\[17179773.240000] /dev/vmnet: port on hub 8 successfully opened
\[17179843.908000] /dev/vmmon\[5732]: host clock rate change request 83 -> 1043
\[17179880.380000] /dev/vmmon\[5732]: host clock rate change request 1043 -> 83
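One way to produce a before/after comparison like the one above is to save two captures and print only the lines that are new in the second. This is a minimal sketch, not necessarily how the output above was produced; the function name and the /tmp file names are placeholders.

```shell
# Print the kernel log lines that appear in the "after" capture but not in
# the "before" capture. Both arguments are plain-text files saved from dmesg.
dmesg_new_lines() {
    before=$1
    after=$2
    # diff marks lines present only in the second file with "> ".
    # diff exits non-zero when the files differ, so ignore its status.
    { diff "$before" "$after" || true; } | sed -n 's/^> //p'
}

# Hypothetical usage (file names are placeholders):
#   dmesg > /tmp/dmesg.before
#   ... reproduce the freeze ...
#   dmesg > /tmp/dmesg.after
#   dmesg_new_lines /tmp/dmesg.before /tmp/dmesg.after
```

If the kernel ring buffer wraps between the two captures, old lines drop off the top and the comparison becomes less clean, so it helps to take the first capture shortly before starting the VM.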
Now for the interrupt dump, started before the freeze and sampled every 10 seconds, followed by the results near the end, when the "Freeze" ends:
(Before "Freeze")
\# while true ; do cat /proc/interrupts ; date ; sleep 10 ; done | tee /tmp/int.txt
Sat Mar 31 15:14:21 PDT 2007
CPU0 CPU1
0: 73576 0 IO-APIC-edge timer
1: 407 0 IO-APIC-edge i8042
8: 16896 0 IO-APIC-edge rtc
9: 2 0 IO-APIC-level acpi
12: 114 0 IO-APIC-edge i8042
14: 23460 0 IO-APIC-edge libata
15: 1618 0 IO-APIC-edge libata
50: 14095 0 IO-APIC-level uhci_hcd:usb2, HDA Intel
58: 0 0 IO-APIC-level uhci_hcd:usb3
66: 0 0 IO-APIC-level uhci_hcd:usb4
74: 1836 0 PCI-MSI eth0
169: 15947 0 IO-APIC-level nvidia
177: 13871 0 IO-APIC-level ipw3945
185: 7 0 IO-APIC-level yenta
233: 51368 0 IO-APIC-level uhci_hcd:usb1, ehci_hcd:usb5
NMI: 0 0
LOC: 73381 73376
ERR: 0
MIS: 0
Sat Mar 31 15:14:31 PDT 2007
CPU0 CPU1
0: 76079 0 IO-APIC-edge timer
1: 407 0 IO-APIC-edge i8042
8: 16896 0 IO-APIC-edge rtc
9: 2 0 IO-APIC-level acpi
12: 114 0 IO-APIC-edge i8042
14: 24722 0 IO-APIC-edge libata
15: 1712 0 IO-APIC-edge libata
50: 14957 0 IO-APIC-level uhci_hcd:usb2, HDA Intel
58: 0 0 IO-APIC-level uhci_hcd:usb3
66: 0 0 IO-APIC-level uhci_hcd:usb4
74: 3316 0 PCI-MSI eth0
169: 16555 0 IO-APIC-level nvidia
177: 14339 0 IO-APIC-level ipw3945
185: 7 0 IO-APIC-level yenta
233: 53376 0 IO-APIC-level uhci_hcd:usb1, ehci_hcd:usb5
NMI: 0 0
LOC: 75884 75879
ERR: 0
MIS: 0
Sat Mar 31 15:14:41 PDT 2007
CPU0 CPU1
0: 78581 0 IO-APIC-edge timer
1: 407 0 IO-APIC-edge i8042
8: 16896 0 IO-APIC-edge rtc
9: 2 0 IO-APIC-level acpi
12: 114 0 IO-APIC-edge i8042
14: 25124 0 IO-APIC-edge libata
15: 1769 0 IO-APIC-edge libata
50: 16874 0 IO-APIC-level uhci_hcd:usb2, HDA Intel
58: 0 0 IO-APIC-level uhci_hcd:usb3
66: 0 0 IO-APIC-level uhci_hcd:usb4
74: 3931 0 PCI-MSI eth0
169: 17177 0 IO-APIC-level nvidia
177: 14759 0 IO-APIC-level ipw3945
185: 7 0 IO-APIC-level yenta
233: 55384 0 IO-APIC-level uhci_hcd:usb1, ehci_hcd:usb5
NMI: 0 0
LOC: 78386 78381
ERR: 0
MIS: 0
Sat Mar 31 15:14:51 PDT 2007
(This, below, IIRC, is the first sample after the freeze started. The one above would be before.)
CPU0 CPU1
0: 81084 0 IO-APIC-edge timer
1: 407 0 IO-APIC-edge i8042
8: 16896 0 IO-APIC-edge rtc
9: 2 0 IO-APIC-level acpi
12: 114 0 IO-APIC-edge i8042
14: 25189 0 IO-APIC-edge libata
15: 1789 0 IO-APIC-edge libata
50: 17018 0 IO-APIC-level uhci_hcd:usb2, HDA Intel
58: 0 0 IO-APIC-level uhci_hcd:usb3
66: 0 0 IO-APIC-level uhci_hcd:usb4
74: 3945 0 PCI-MSI eth0
169: 17785 0 IO-APIC-level nvidia
177: 15112 0 IO-APIC-level ipw3945
185: 7 0 IO-APIC-level yenta
233: 57388 0 IO-APIC-level uhci_hcd:usb1, ehci_hcd:usb5
NMI: 0 0
LOC: 80889 80884
ERR: 0
MIS: 0
Sat Mar 31 15:15:01 PDT 2007
(skip ahead until around the time when the "Freeze" ends:)
CPU0 CPU1
0: 134522 0 IO-APIC-edge timer
1: 407 0 IO-APIC-edge i8042
8: 16896 0 IO-APIC-edge rtc
9: 2 0 IO-APIC-level acpi
12: 114 0 IO-APIC-edge i8042
14: 25546 0 IO-APIC-edge libata
15: 2213 0 IO-APIC-edge libata
50: 20975 0 IO-APIC-level uhci_hcd:usb2, HDA Intel
58: 0 0 IO-APIC-level uhci_hcd:usb3
66: 0 0 IO-APIC-level uhci_hcd:usb4
74: 5669 0 PCI-MSI eth0
169: 30757 0 IO-APIC-level nvidia
177: 22786 0 IO-APIC-level ipw3945
185: 7 0 IO-APIC-level yenta
233: 100164 0 IO-APIC-level uhci_hcd:usb1, ehci_hcd:usb5
NMI: 0 0
LOC: 134327 134322
ERR: 0
MIS: 0
Sat Mar 31 15:18:35 PDT 2007
CPU0 CPU1
0: 137025 0 IO-APIC-edge timer
1: 407 0 IO-APIC-edge i8042
8: 16896 0 IO-APIC-edge rtc
9: 2 0 IO-APIC-level acpi
12: 114 0 IO-APIC-edge i8042
14: 25551 0 IO-APIC-edge libata
15: 2233 0 IO-APIC-edge libata
50: 21122 0 IO-APIC-level uhci_hcd:usb2, HDA Intel
58: 0 0 IO-APIC-level uhci_hcd:usb3
66: 0 0 IO-APIC-level uhci_hcd:usb4
74: 5676 0 PCI-MSI eth0
169: 31365 0 IO-APIC-level nvidia
177: 23160 0 IO-APIC-level ipw3945
185: 7 0 IO-APIC-level yenta
233: 102158 0 IO-APIC-level uhci_hcd:usb1, ehci_hcd:usb5
NMI: 0 0
LOC: 136830 136825
ERR: 0
MIS: 0
Sat Mar 31 15:18:45 PDT 2007
CPU0 CPU1
0: 139528 0 IO-APIC-edge timer
1: 407 0 IO-APIC-edge i8042
8: 16896 0 IO-APIC-edge rtc
9: 2 0 IO-APIC-level acpi
12: 114 0 IO-APIC-edge i8042
14: 25574 0 IO-APIC-edge libata
15: 2253 0 IO-APIC-edge libata
50: 22368 0 IO-APIC-level uhci_hcd:usb2, HDA Intel
58: 0 0 IO-APIC-level uhci_hcd:usb3
66: 0 0 IO-APIC-level uhci_hcd:usb4
74: 5684 0 PCI-MSI eth0
169: 31973 0 IO-APIC-level nvidia
177: 23559 0 IO-APIC-level ipw3945
185: 7 0 IO-APIC-level yenta
233: 104152 0 IO-APIC-level uhci_hcd:usb1, ehci_hcd:usb5
NMI: 0 0
LOC: 139333 139328
ERR: 0
MIS: 0
Sat Mar 31 15:18:55 PDT 2007
(The freeze stopped somewhere in here. Perhaps the previous, or the next.)
CPU0 CPU1
0: 142031 0 IO-APIC-edge timer
1: 407 0 IO-APIC-edge i8042
8: 16896 0 IO-APIC-edge rtc
9: 2 0 IO-APIC-level acpi
12: 114 0 IO-APIC-edge i8042
14: 25579 0 IO-APIC-edge libata
15: 2273 0 IO-APIC-edge libata
50: 22510 0 IO-APIC-level uhci_hcd:usb2, HDA Intel
58: 0 0 IO-APIC-level uhci_hcd:usb3
66: 0 0 IO-APIC-level uhci_hcd:usb4
74: 5690 0 PCI-MSI eth0
169: 32580 0 IO-APIC-level nvidia
177: 23936 0 IO-APIC-level ipw3945
185: 7 0 IO-APIC-level yenta
233: 106150 0 IO-APIC-level uhci_hcd:usb1, ehci_hcd:usb5
NMI: 0 0
LOC: 141836 141831
ERR: 0
MIS: 0
Sat Mar 31 15:19:05 PDT 2007
CPU0 CPU1
0: 144534 0 IO-APIC-edge timer
1: 407 0 IO-APIC-edge i8042
8: 16896 0 IO-APIC-edge rtc
9: 2 0 IO-APIC-level acpi
12: 114 0 IO-APIC-edge i8042
14: 25584 0 IO-APIC-edge libata
15: 2293 0 IO-APIC-edge libata
50: 22656 0 IO-APIC-level uhci_hcd:usb2, HDA Intel
58: 0 0 IO-APIC-level uhci_hcd:usb3
66: 0 0 IO-APIC-level uhci_hcd:usb4
74: 5696 0 PCI-MSI eth0
169: 33188 0 IO-APIC-level nvidia
177: 24328 0 IO-APIC-level ipw3945
185: 7 0 IO-APIC-level yenta
233: 108146 0 IO-APIC-level uhci_hcd:usb1, ehci_hcd:usb5
NMI: 0 0
LOC: 144339 144334
ERR: 0
MIS: 0
Sat Mar 31 15:19:15 PDT 2007
CPU0 CPU1
0: 147037 0 IO-APIC-edge timer
1: 407 0 IO-APIC-edge i8042
8: 16898 0 IO-APIC-edge rtc
9: 2 0 IO-APIC-level acpi
12: 114 0 IO-APIC-edge i8042
14: 25725 0 IO-APIC-edge libata
15: 2355 0 IO-APIC-edge libata
50: 23777 0 IO-APIC-level uhci_hcd:usb2, HDA Intel
58: 0 0 IO-APIC-level uhci_hcd:usb3
66: 0 0 IO-APIC-level uhci_hcd:usb4
74: 5712 0 PCI-MSI eth0
169: 33817 0 IO-APIC-level nvidia
177: 24702 0 IO-APIC-level ipw3945
185: 7 0 IO-APIC-level yenta
233: 110153 0 IO-APIC-level uhci_hcd:usb1, ehci_hcd:usb5
NMI: 0 0
LOC: 146842 146837
ERR: 0
MIS: 0
Sat Mar 31 15:19:25 PDT 2007
HTH.
Please let me know if you would like more tests run, and if you would prefer to have vmware run with a debugging level, and what level you would like it run with.
For now, I am going to keep running with (working) 2.6.20.4 unless you have other tests you think would help you identify the potential hardware/kernel/vmware interoperability "freeze" issue.
Hi dugan,
Thanks for taking the time to perform these tests.
I will let Petr know that you have posted additional information.
Thanks everyone for your cooperation while this issue is under investigation.
with my SpeedStep switched OFF in the bios,
which according to my Dell bios page locks it in
lowest performance mode (and it certainly felt like
it).
Then, personally, I would take this issue up with Dell for violating Intel's design. Refer to http://www.intel.com/cd/channel/reseller/asmo-na/eng/products/desktop/processor/processors/core2duo/... and other similar pages on Intel's site describing their SpeedStep and Enhanced SpeedStep technology. Pay close attention to the screen shots - when SpeedStep is disabled (i.e. turned OFF), the processors do NOT adjust their frequency, and are locked at the MAXIMUM power/frequency. If Dell is doing the opposite, then I believe it is time to file a class action suit against Dell for violating Intel's advertised design of their technology, and for lying to their customers.
P.S. Looks like one more reason I choose to avoid any of the big name brand manufacturers.
Oh, climb down off your soap box. When Dell pays for the processor it becomes
theirs and they can do with it as they please.
They had a good reason, namely preserving battery.
Let's keep this thread on track.
The problem at hand is Xserver input freezes, which are not limited to Dells.
Yep. Take a look at IRQ 8 in your samples - it did not increment while you were dumping data from /proc/interrupts, and only started working again after 5 minutes (after the 32-bit counter wrapped around). There is no fix other than (1) booting with hpet=disable (both "disable" and "disabled" should work, since the code tests only the first 7 characters), (2) building your kernel without HPET support, or (3) disabling HPET in the BIOS.
There is nothing else we can do. On kernels after 2.6.21 we can try to use the kernel's NOHZ infrastructure to provide precise timing for virtual machines, but with a traditional HZ-based kernel you need a working /dev/rtc, and rtc emulation over HPET is not a working one. I promise to dig up sample code...
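The 5-minute figure is consistent with simple arithmetic on a 32-bit counter. The sketch below assumes the counter ticks at the 14318180 Hz rate that the hpet0 line reports in the dmesg posted further down this thread; that pairing is an assumption, since the posts do not say which counter wraps, and wrap_seconds is just an illustrative name.

```shell
# Seconds until a free-running 32-bit counter wraps at a given tick rate (Hz).
# Integer division, so the result is rounded down.
wrap_seconds() {
    hz=$1
    echo $(( 4294967296 / hz ))   # 4294967296 = 2^32
}

wrap_seconds 14318180   # prints 299, i.e. just under 5 minutes
```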
I do not think that Xserver input freezes. You just have grabbed focus to VM, and you cannot release it until timer expires - which happens in 5 minutes. If you have remote connection, you could verify it...
Also can you please post full 'dmesg' when booted kernel with hpet=disable option? All your symptoms look the same as the others', and it helped the others, so I wonder what's different with your box. And then, can you (before starting your VM) start
[code](while true; do
date
cat /proc/interrupts | grep '[08C]:'
sleep 10
done) > /tmp/interrupt-stats[/code]
and then, after you observe a lockup, provide the samples logged when the lockup started, a few from the middle of the lockup, and finally the samples from the time the lockup ended?
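Once a log like /tmp/interrupt-stats exists, a stalled interrupt is easier to spot as per-sample differences than as raw counters. A rough sketch, assuming the usual /proc/interrupts layout where the first column is the IRQ label and the second is the CPU0 count; irq_deltas is a made-up helper name.

```shell
# Read a saved series of /proc/interrupts samples on stdin and print, for one
# IRQ label (e.g. 8, 0 or LOC), the change in the CPU0 count between
# consecutive samples.
irq_deltas() {
    awk -v irq="$1:" '
        $1 == irq {
            if (seen) print $2 - prev   # difference from the previous sample
            prev = $2
            seen = 1
        }'
}

# Hypothetical usage:
#   irq_deltas 8 < /tmp/interrupt-stats
```

With 10-second samples, a run of zeros for IRQ 8 while IRQ 0 keeps counting would match the stalled rtc described in this thread.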
I do not think that Xserver input freezes. You just have grabbed focus to VM, and
you cannot release it until timer expires - which happens in 5 minutes.
When I said X input freezes, I was trying to describe symptoms, not necessarily
the cause. I do know that X output still occurs as the clock is updating.
If you have remote connection, you could verify it...
What would you suggest as the best method of verifying this? As I have mentioned
an ssh session into the machine from another machine is possible during the freeze
so the entire machine is not frozen. Cursor also moves.
>Also can you please post full 'dmesg' when booted kernel with hpet=disable option?
Ok, Petr here is a boot-thru-lockup dmesg printout, I'll run the
other test mentioned above today.
Bootdata ok (command line is root=/dev/sda2 vga=0x314 resume=/dev/sda1 splash=silent hpet=disable)
Linux version 2.6.18.2-34-default (geeko@buildhost) (gcc version 4.1.2 20061115 (prerelease) (SUSE Linux)) #1 SMP Mon Nov 27 11:46:27 UTC 2006
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009f000 (usable)
BIOS-e820: 000000000009f000 - 00000000000a0000 (reserved)
BIOS-e820: 0000000000100000 - 000000007fed3400 (usable)
BIOS-e820: 000000007fed3400 - 0000000080000000 (reserved)
BIOS-e820: 00000000f0000000 - 00000000f4007000 (reserved)
BIOS-e820: 00000000f4008000 - 00000000f400c000 (reserved)
BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
BIOS-e820: 00000000fed20000 - 00000000feda0000 (reserved)
BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved)
BIOS-e820: 00000000ffb00000 - 0000000100000000 (reserved)
DMI 2.4 present.
ACPI: RSDP (v000 DELL ) @ 0x00000000000fc1b0
ACPI: RSDT (v001 DELL M07 0x27d70205 ASL 0x00000061) @ 0x000000007fed39cd
ACPI: FADT (v001 DELL M07 0x27d70205 ASL 0x00000061) @ 0x000000007fed4800
ACPI: HPET (v001 DELL M07 0x00000001 ASL 0x00000061) @ 0x000000007fed4f00
ACPI: MADT (v001 DELL M07 0x27d70205 ASL 0x00000047) @ 0x000000007fed5000
ACPI: MCFG (v016 DELL M07 0x27d70205 ASL 0x00000061) @ 0x000000007fed4fc0
ACPI: SLIC (v001 DELL M07 0x27d70205 ASL 0x00000061) @ 0x000000007fed509c
ACPI: BOOT (v001 DELL M07 0x27d70205 ASL 0x00000061) @ 0x000000007fed4bc0
ACPI: SSDT (v001 PmRef CpuPm 0x00003000 INTL 0x20050624) @ 0x000000007fed3a0d
ACPI: DSDT (v001 INT430 SYSFexxx 0x00001001 INTL 0x20050624) @ 0x0000000000000000
No NUMA configuration found
Faking a node at 0000000000000000-000000007fed3000
Bootmem setup node 0 0000000000000000-000000007fed3000
No mptable found.
On node 0 totalpages: 515679
DMA zone: 2895 pages, LIFO batch:0
DMA32 zone: 512784 pages, LIFO batch:31
ACPI: PM-Timer IO Port: 0x1008
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id\[0x00] lapic_id\[0x00] enabled)
Processor #0 6:15 APIC version 20
ACPI: LAPIC (acpi_id\[0x01] lapic_id\[0x01] enabled)
Processor #1 6:15 APIC version 20
ACPI: LAPIC_NMI (acpi_id\[0x00] high edge lint\[0x1])
ACPI: LAPIC_NMI (acpi_id\[0x01] high edge lint\[0x1])
ACPI: IOAPIC (id\[0x02] address\[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Setting APIC routing to physical flat
ACPI: HPET id: 0x8086a201 base: 0xfed00000
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at 88000000 (gap: 80000000:70000000)
SMP: Allowing 2 CPUs, 0 hotplug CPUs
Built 1 zonelists. Total pages: 515679
Kernel command line: root=/dev/sda2 vga=0x314 resume=/dev/sda1 splash=silent hpet=disable
bootsplash: silent mode.
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 32768 bytes)
time.c: Using 14.318180 MHz WALL HPET GTOD HPET/TSC timer.
time.c: Detected 2161.249 MHz processor.
Console: colour dummy device 80x25
Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes)
Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes)
Checking aperture...
Memory: 2056072k/2095948k available (1915k kernel code, 39488k reserved, 1278k data, 188k init)
Calibrating delay using timer specific routine.. 4328.26 BogoMIPS (lpj=8656530)
Security Framework v1.0.0 initialized
Mount-cache hash table entries: 256
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 4096K
using mwait in idle threads.
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 0
CPU0: Thermal monitoring enabled (TM2)
SMP alternatives: switching to UP code
checking if image is initramfs... it is
Freeing initrd memory: 3377k freed
ACPI: Core revision 20060707
Using local APIC timer interrupts.
result 10390610
Detected 10.390 MHz APIC timer.
SMP alternatives: switching to SMP code
Booting processor 1/2 APIC 0x1
Initializing CPU#1
Calibrating delay using timer specific routine.. 4322.57 BogoMIPS (lpj=8645150)
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 4096K
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 1
CPU1: Thermal monitoring enabled (TM2)
Intel(R) Core(TM)2 CPU T7400 @ 2.16GHz stepping 06
CPU 1: Syncing TSC to CPU 0.
CPU 1: synchronized TSC with CPU 0 (last diff -7 cycles, maxerr 1105 cycles)
Brought up 2 CPUs
testing NMI watchdog ... OK.
migration_cost=25
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: Using MMCONFIG at f0000000
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge \[PCI0] (0000:00)
PCI: Probing PCI hardware (bus 00)
ACPI: Assume root bridge \[\_SB_.PCI0] bus is 0
PCI: Ignoring BAR0-3 of IDE controller 0000:00:1f.2
Boot video device is 0000:01:00.0
PCI: Transparent bridge - 0000:00:1e.0
ACPI: PCI Interrupt Routing Table \[\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Link \[LNKA] (IRQs 9 10 11) *4
ACPI: PCI Interrupt Link \[LNKB] (IRQs *5 7)
ACPI: PCI Interrupt Link \[LNKC] (IRQs *9 10 11)
ACPI: PCI Interrupt Link \[LNKD] (IRQs 5 7 9 10 11) *3
ACPI: PCI Interrupt Link \[LNKE] (IRQs 3 4 5 6 7 9 *10 11 12 14 15)
ACPI: PCI Interrupt Link \[LNKF] (IRQs 3 4 5 6 7 9 10 *11 12 14 15)
ACPI: PCI Interrupt Link \[LNKG] (IRQs 3 4 5 6 7 *9 10 11 12 14 15)
ACPI: PCI Interrupt Link \[LNKH] (IRQs 3 4 5 6 *7 9 10 11 12 14 15)
ACPI: PCI Interrupt Routing Table \[\_SB_.PCI0.AGP_._PRT]
ACPI: PCI Interrupt Routing Table \[\_SB_.PCI0.PCIE._PRT]
ACPI: PCI Interrupt Routing Table \[\_SB_.PCI0.RP01._PRT]
ACPI: PCI Interrupt Routing Table \[\_SB_.PCI0.RP02._PRT]
ACPI: PCI Interrupt Routing Table \[\_SB_.PCI0.RP04._PRT]
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report
hpet0: at MMIO 0xfed00000 (virtual 0xffffffffff5fe000), IRQs 2, 8, 0
hpet0: 3 64-bit timers, 14318180 Hz
PCI-GART: No AMD northbridge found.
PCI: Bridge: 0000:00:01.0
IO window: e000-efff
MEM window: efd00000-efefffff
PREFETCH window: d0000000-dfffffff
PCI: Bridge: 0000:00:1c.0
IO window: disabled.
MEM window: disabled.
PREFETCH window: disabled.
PCI: Bridge: 0000:00:1c.1
IO window: disabled.
MEM window: efc00000-efcfffff
PREFETCH window: disabled.
PCI: Bridge: 0000:00:1c.3
IO window: d000-dfff
MEM window: efa00000-efbfffff
PREFETCH window: e0000000-e01fffff
PCI: Bridge: 0000:00:1e.0
IO window: disabled.
MEM window: ef900000-ef9fffff
PREFETCH window: disabled.
GSI 16 sharing vector 0xA9 and IRQ 16
ACPI: PCI Interrupt 0000:00:01.0[A] -> GSI 16 (level, low) -> IRQ 169
PCI: Setting latency timer of device 0000:00:01.0 to 64
ACPI: PCI Interrupt 0000:00:1c.0[A] -> GSI 16 (level, low) -> IRQ 169
PCI: Setting latency timer of device 0000:00:1c.0 to 64
GSI 17 sharing vector 0xB1 and IRQ 17
ACPI: PCI Interrupt 0000:00:1c.1[B] -> GSI 17 (level, low) -> IRQ 177
PCI: Setting latency timer of device 0000:00:1c.1 to 64
GSI 18 sharing vector 0xB9 and IRQ 18
ACPI: PCI Interrupt 0000:00:1c.3[D] -> GSI 19 (level, low) -> IRQ 185
PCI: Setting latency timer of device 0000:00:1c.3 to 64
PCI: Setting latency timer of device 0000:00:1e.0 to 64
NET: Registered protocol family 2
IP route cache hash table entries: 65536 (order: 7, 524288 bytes)
TCP established hash table entries: 262144 (order: 10, 4194304 bytes)
TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
TCP: Hash tables configured (established 262144 bind 65536)
TCP reno registered
Simple Boot Flag at 0x79 set to 0x80
audit: initializing netlink socket (disabled)
audit(1175353753.820:1): initialized
Total HugeTLB memory allocated, 0
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
Initializing Cryptographic API
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered (default)
PCI: Setting latency timer of device 0000:00:01.0 to 64
assign_interrupt_mode Found MSI capability
Allocate Port Service\[0000:00:01.0:pcie00]
Allocate Port Service\[0000:00:01.0:pcie03]
PCI: Setting latency timer of device 0000:00:1c.0 to 64
assign_interrupt_mode Found MSI capability
Allocate Port Service\[0000:00:1c.0:pcie00]
Allocate Port Service\[0000:00:1c.0:pcie02]
Allocate Port Service\[0000:00:1c.0:pcie03]
PCI: Setting latency timer of device 0000:00:1c.1 to 64
assign_interrupt_mode Found MSI capability
Allocate Port Service\[0000:00:1c.1:pcie00]
Allocate Port Service\[0000:00:1c.1:pcie02]
Allocate Port Service\[0000:00:1c.1:pcie03]
PCI: Setting latency timer of device 0000:00:1c.3 to 64
assign_interrupt_mode Found MSI capability
Allocate Port Service\[0000:00:1c.3:pcie00]
Allocate Port Service\[0000:00:1c.3:pcie02]
Allocate Port Service\[0000:00:1c.3:pcie03]
vesafb: framebuffer at 0xd0000000, mapped to 0xffffc20010100000, using 3750k, total 16384k
vesafb: mode is 800x600x16, linelength=1600, pages=16
vesafb: scrolling: redraw
vesafb: Truecolor: size=0:5:6:5, shift=0:11:5:0
bootsplash 3.1.6-2004/03/31: looking for picture... 1043
Interrupts samples:
Pre-lockup----
Sun Apr 1 12:32:19 AKDT 2007
0: 409561 0 IO-APIC-edge timer
8: 67648 0 IO-APIC-edge rtc
50: 23439 0 IO-APIC-level ehci_hcd:usb1, uhci_hcd:usb2
58: 0 0 IO-APIC-level uhci_hcd:usb4
LOC: 409456 409381
Sun Apr 1 12:32:29 AKDT 2007
0: 412103 0 IO-APIC-edge timer
8: 70194 0 IO-APIC-edge rtc
50: 24091 0 IO-APIC-level ehci_hcd:usb1, uhci_hcd:usb2
58: 0 0 IO-APIC-level uhci_hcd:usb4
LOC: 411998 411923
Sun Apr 1 12:32:39 AKDT 2007
0: 414609 0 IO-APIC-edge timer
8: 70194 0 IO-APIC-edge rtc
50: 24784 0 IO-APIC-level ehci_hcd:usb1, uhci_hcd:usb2
58: 0 0 IO-APIC-level uhci_hcd:usb4
LOC: 414504 414429
Sun Apr 1 12:32:49 AKDT 2007
0: 417116 0 IO-APIC-edge timer
8: 70194 0 IO-APIC-edge rtc
50: 25318 0 IO-APIC-level ehci_hcd:usb1, uhci_hcd:usb2
58: 0 0 IO-APIC-level uhci_hcd:usb4
LOC: 417011 416936
Lockup occurred at 12:33 (exact seconds not recorded)
Sun Apr 1 12:32:59 AKDT 2007
0: 419618 0 IO-APIC-edge timer
8: 70194 0 IO-APIC-edge rtc
50: 25629 0 IO-APIC-level ehci_hcd:usb1, uhci_hcd:usb2
58: 0 0 IO-APIC-level uhci_hcd:usb4
LOC: 419513 419438
Sun Apr 1 12:33:09 AKDT 2007
0: 422120 0 IO-APIC-edge timer
8: 70194 0 IO-APIC-edge rtc
50: 25629 0 IO-APIC-level ehci_hcd:usb1, uhci_hcd:usb2
58: 0 0 IO-APIC-level uhci_hcd:usb4
LOC: 422015 421940
Sun Apr 1 12:33:19 AKDT 2007
0: 424622 0 IO-APIC-edge timer
8: 70194 0 IO-APIC-edge rtc
50: 25629 0 IO-APIC-level ehci_hcd:usb1, uhci_hcd:usb2
58: 0 0 IO-APIC-level uhci_hcd:usb4
LOC: 424517 424442
Sun Apr 1 12:33:29 AKDT 2007
0: 427124 0 IO-APIC-edge timer
8: 70194 0 IO-APIC-edge rtc
50: 25634 0 IO-APIC-level ehci_hcd:usb1, uhci_hcd:usb2
58: 0 0 IO-APIC-level uhci_hcd:usb4
LOC: 427019 426944
Sun Apr 1 12:33:39 AKDT 2007
0: 429626 0 IO-APIC-edge timer
8: 70194 0 IO-APIC-edge rtc
50: 25634 0 IO-APIC-level ehci_hcd:usb1, uhci_hcd:usb2
58: 0 0 IO-APIC-level uhci_hcd:usb4
LOC: 429521 429446
Sun Apr 1 12:33:49 AKDT 2007
0: 432128 0 IO-APIC-edge timer
8: 70194 0 IO-APIC-edge rtc
50: 25689 0 IO-APIC-level ehci_hcd:usb1, uhci_hcd:usb2
58: 0 0 IO-APIC-level uhci_hcd:usb4
LOC: 432023 431948
Sun Apr 1 12:33:59 AKDT 2007
0: 434630 0 IO-APIC-edge timer
8: 70194 0 IO-APIC-edge rtc
50: 25689 0 IO-APIC-level ehci_hcd:usb1, uhci_hcd:usb2
58: 0 0 IO-APIC-level uhci_hcd:usb4
LOC: 434525 434450
Sun Apr 1 12:34:09 AKDT 2007
0: 437132 0 IO-APIC-edge timer
8: 70194 0 IO-APIC-edge rtc
50: 25689 0 IO-APIC-level ehci_hcd:usb1, uhci_hcd:usb2
58: 0 0 IO-APIC-level uhci_hcd:usb4
LOC: 437027 436952
\------- Post lockup...
Sun Apr 1 12:37:09 AKDT 2007
0: 482168 0 IO-APIC-edge timer
8: 70194 0 IO-APIC-edge rtc
50: 26838 0 IO-APIC-level ehci_hcd:usb1, uhci_hcd:usb2
58: 0 0 IO-APIC-level uhci_hcd:usb4
LOC: 482063 481988
Sun Apr 1 12:37:19 AKDT 2007
0: 484670 0 IO-APIC-edge timer
8: 70194 0 IO-APIC-edge rtc
50: 26838 0 IO-APIC-level ehci_hcd:usb1, uhci_hcd:usb2
58: 0 0 IO-APIC-level uhci_hcd:usb4
LOC: 484565 484490
Sun Apr 1 12:37:29 AKDT 2007
0: 487172 0 IO-APIC-edge timer
8: 70196 0 IO-APIC-edge rtc
50: 26889 0 IO-APIC-level ehci_hcd:usb1, uhci_hcd:usb2
58: 0 0 IO-APIC-level uhci_hcd:usb4
LOC: 487067 486992
Sun Apr 1 12:37:39 AKDT 2007
0: 489674 0 IO-APIC-edge timer
8: 70196 0 IO-APIC-edge rtc
50: 26889 0 IO-APIC-level ehci_hcd:usb1, uhci_hcd:usb2
58: 0 0 IO-APIC-level uhci_hcd:usb4
LOC: 489569 489494
>Yep. Take a look at IRQ 8 in your samples - it did not increment while you were dumping data from /proc/interrupts
It didn't increment BEFORE the lockup either. Is that normal?
\- neither did mine, with hpet=disable
maybe it's just my lack of understanding hpet, but wouldn't disabling it lead to problems with multimedia streams, interrupt request, etc?
Disabling hpet forces the kernel to use a different source for timers. It forces the use of the PIT (Programmable Interval Timer) for timing.
See
http://www.kernel.org/pub/linux/kernel/people/gregkh/lkn/lkn_pdf/ch09.pdf
There seems to be another parameter that allows you to choose your timer source. Namely "clocksource"
gotcha - thanks for the clarification.
so, i just need to add hpet=disable right after the kernel boot image in the bootloader? ie: 'kernel-2.xx.xx.xx hpet=disable'
Well, I would just type it in manually on the boot prompt for now when you
intend to use Vmware. At least till Petr can check it out a bit longer.
Where you make the permanent change depends on which boot loader you use
and whether it proves to be the definitive solution.
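After rebooting, it may be worth confirming that the option actually reached the kernel. A small sketch that checks /proc/cmdline; has_boot_param is a hypothetical helper, and a parameter being present on the command line does not by itself prove that the kernel honoured it.

```shell
# Return success if the given parameter appears as a whole word on the kernel
# command line. The second argument defaults to /proc/cmdline and exists only
# so the check can be tried against a saved copy.
has_boot_param() {
    param=$1
    cmdline_file=${2:-/proc/cmdline}
    # One parameter per line, then look for an exact, literal match.
    tr ' ' '\n' < "$cmdline_file" | grep -xF -- "$param" > /dev/null
}

# Hypothetical usage:
#   if has_boot_param hpet=disable; then echo "option was passed"; fi
```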
For me, it didn't make any difference. I still get the 5 minute freeze even with hpet=disable.
which distro are you running? i'm running opensuse 10.2 w/kernel 2.6.18.8-0.1
Same Distro, kernel 2.6.18.2-34.
Core 2 Duo.
Same lockup problems here. Dual P4 (not dual core), kernel 2.6.19-1.2895.fc6, fedora OS, vmware workstation 5.5.3, windows 2000 guest. hpet=disable did not correct problem. Dmesg output looks similar:
/dev/vmmon\[29489]: host clock rate change request 83 -> 1043
/dev/vmmon\[29489]: host clock rate change request 1043 -> 83
/dev/vmmon\[29489]: host clock rate change request 83 -> 1043
/dev/vmmon\[29489]: host clock rate change request 1043 -> 83
/dev/vmmon\[29489]: host clock rate change request 83 -> 1043
/dev/vmmon\[29489]: host clock rate change request 1043 -> 83
/dev/vmmon\[29489]: host clock rate change request 83 -> 1043
/dev/vmmon\[29489]: host clock rate change request 1043 -> 83
/dev/vmmon\[29489]: host clock rate change request 83 -> 1043
/dev/vmmon\[29489]: host clock rate change request 1043 -> 83
/dev/vmmon\[29489]: host clock rate change request 83 -> 1043
/dev/vmmon\[29489]: host clock rate change request 1043 -> 83
/dev/vmmon\[29489]: host clock rate change request 83 -> 1043
/dev/vmmon\[29489]: host clock rate change request 1043 -> 83
/dev/vmmon\[29489]: host clock rate change request 83 -> 1043
/dev/vmmon\[29489]: host clock rate change request 1043 -> 83
Interrupts as follows:
LOC: 1662577 1662576
Tue Apr 3 15:59:18 EDT 2007
0: 1672931 0 IO-APIC-edge timer
8: 1 0 IO-APIC-edge rtc
18: 218164 0 IO-APIC-fasteoi eth0
20: 0 0 IO-APIC-fasteoi libata
LOC: 1672581 1672580
== froze
Tue Apr 3 15:59:28 EDT 2007
0: 1682934 0 IO-APIC-edge timer
8: 1 0 IO-APIC-edge rtc
18: 219521 0 IO-APIC-fasteoi eth0
20: 0 0 IO-APIC-fasteoi libata
LOC: 1682570 1682569
== unfrozen
Tue Apr 3 15:59:38 EDT 2007
0: 1692944 0 IO-APIC-edge timer
8: 1 0 IO-APIC-edge rtc
18: 220402 0 IO-APIC-fasteoi eth0
20: 0 0 IO-APIC-fasteoi libata
LOC: 1692580 1692579
Tue Apr 3 15:59:48 EDT 2007
0: 1702951 0 IO-APIC-edge timer
8: 1 0 IO-APIC-edge rtc
18: 222174 0 IO-APIC-fasteoi eth0
20: 0 0 IO-APIC-fasteoi libata
LOC: 1702588 1702587
Tue Apr 3 15:59:58 EDT 2007
0: 1712978 0 IO-APIC-edge timer
8: 1 0 IO-APIC-edge rtc
18: 224012 0 IO-APIC-fasteoi eth0
20: 0 0 IO-APIC-fasteoi libata
LOC: 1712609 1712608
Tue Apr 3 16:00:08 EDT 2007
0: 1723095 0 IO-APIC-edge timer
8: 1 0 IO-APIC-edge rtc
18: 226302 0 IO-APIC-fasteoi eth0
20: 0 0 IO-APIC-fasteoi libata
LOC: 1722723 1722722
== froze
Tue Apr 3 16:00:18 EDT 2007
0: 1733094 0 IO-APIC-edge timer
8: 1 0 IO-APIC-edge rtc
18: 227618 0 IO-APIC-fasteoi eth0
20: 0 0 IO-APIC-fasteoi libata
LOC: 1732715 1732714
Tue Apr 3 16:00:28 EDT 2007
0: 1743103 0 IO-APIC-edge timer
8: 1 0 IO-APIC-edge rtc
18: 228105 0 IO-APIC-fasteoi eth0
20: 0 0 IO-APIC-fasteoi libata
LOC: 1742724 1742723
== unfrozen
Tue Apr 3 16:00:38 EDT 2007
0: 1753107 0 IO-APIC-edge timer
8: 1 0 IO-APIC-edge rtc
18: 229654 0 IO-APIC-fasteoi eth0
20: 0 0 IO-APIC-fasteoi libata
LOC: 1752728 1752727
Top shows that iowait pegs at 50% when the guest is frozen.
Cpu(s): 0.0%us, 0.2%sy, 0.0%ni, 50.0%id, 49.8%wa, 0.0%hi, 0.0%si, 0.0%st
These particular freezes were pretty short, but the freeze is usually a few minutes.
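To log the iowait figure over time instead of reading it off the screen, the "wa" field can be pulled out of a top summary line. This is only a sketch that assumes the line is formatted like the one quoted above; other top versions lay the summary out differently, and iowait_from_top is an invented name.

```shell
# Extract the iowait percentage (the "wa" field) from a top(1) summary line
# read on stdin, formatted like "Cpu(s): ..., 49.8%wa, ...".
iowait_from_top() {
    sed -n 's/.*[ ,]\([0-9.]*\)%wa.*/\1/p'
}

# Hypothetical usage (batch mode, one iteration):
#   top -b -n 1 | grep '^Cpu' | iowait_from_top
```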
this has me really nervous. i'm slated to build out a virtual environment for a customer and had planned on using vmware. i'm thinkin i should start looking for another solution as my experience with vmware on my laptop has been great when it's running - not so great when it freezes every hour.
Dekker:
These freezes are much too short to be the same problem.
Look at the times for the Interrupt mappings. They are all within seconds
of each other.
The lockups under discussion in this thread are reliably 5 minutes long.
You also said:
>Top shows that iowait pegs at 50% when the guest is frozen.
This sounds like a disk problem to me, perhaps reading/writing massive amounts of data on the virtual disk.
Well if the planned virtual environment is on a core 2 duo platform I'd wait till this is solved.
I have Vmware Server running on a Dual Core Pentium D, and it's utterly reliable.
Workstation is a testing and development platform, not really intended for production.
Still it is a difficult situation with me, as I rely on Workstation for my Vista, XP, and Win2k testbeds, and these versions are freezing.
i'm having the problem with server and workstation. but these are servers i'm building out, they are multi-cpu but not core2duo as my notebook is.
oh - and for the record, hpet=disable didn't solve the problem for me either.