VMware Communities
tardich
Contributor

VMWare on Linux hangs for 5 minutes

Hi,

I'm on Mandriva Linux 2007, running VMware Workstation 5 (latest release). My VMware installation is running WinXP Pro Service Pack 2, and when I use Explorer (or Firefox, I just tried), VMware hangs my whole machine for up to 5 minutes! I can still move my mouse (the pointer may or may not be visible), and my Linux panels (which are normally hidden) will show if I pass over them, but I can't click or switch to any other application.

When VMware finally releases the machine, everything works normally. It can even happen again within the same Internet session.

I mention Explorer and Firefox because it can happen when the browser starts up and it can happen during the Internet session. I have even seen it at the end of the Internet session.

This is annoying to the point of making VMware completely useless for Internet browsing.

Christian Tardif

dugan
Contributor

KevinG wrote:

>Hi dugan,
>
>It would be great if you could try the "hpet=disable" option.
>
>Also provide the dmesg and /proc/interrupts information without the hpet kernel option when it freezes.
>
>Thanks
>
>-Kevin

Yep. After I posted that message, I performed the tests suggested.

First round: I passed the hpet=disable option to the kernel that had previously shown the VMware "freeze" issue reported in this thread and others.

With "hpet=disable" passed, I was not able to reproduce the problem on that kernel.

(No dmesg or /proc/interrupts output was gathered for this case, because I could not reproduce the freeze with this configuration.)

Second round: I removed "hpet=disable" and rebooted with the problematic kernel/VMware combination.

I ran a dmesg before starting, and then again after the freeze, and here are the differences:

\[17179752.816000] /dev/vmnet: open called by PID 5725 (vmware-vmx)

\[17179752.816000] /dev/vmnet: port on hub 8 successfully opened

\[17179752.884000] /dev/vmmon\[5732]: host clock rate change request 0 -> 19

\[17179756.820000] /dev/vmmon\[5732]: host clock rate change request 19 -> 83

\[17179773.240000] /dev/vmnet: open called by PID 5732 (vmware-vmx)

\[17179773.240000] /dev/vmnet: port on hub 8 successfully opened

\[17179843.908000] /dev/vmmon\[5732]: host clock rate change request 83 -> 1043

\[17179880.380000] /dev/vmmon\[5732]: host clock rate change request 1043 -> 83
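A before/after comparison like the one above can be captured by saving two dmesg snapshots and printing only the lines that are new in the second one. The following is just a sketch of one way to do that; the helper name `new_log_lines` and the file paths in the comments are illustrative assumptions, not the commands actually used for this post.

```shell
#!/bin/sh
# Print the lines that appear in the "after" snapshot but not in the "before" one.
# new_log_lines is a hypothetical helper name; both arguments are plain text files.
new_log_lines() {
    # diff marks lines found only in the second file with "> "; its non-zero
    # "files differ" status is expected here, hence the "|| true".
    { diff "$1" "$2" || true; } | sed -n 's/^> //p'
}

# Typical use (paths are examples only):
#   dmesg > /tmp/dmesg.before
#   ... start the VM and wait for the freeze to end ...
#   dmesg > /tmp/dmesg.after
#   new_log_lines /tmp/dmesg.before /tmp/dmesg.after
```

If the kernel ring buffer wraps between the two snapshots, old lines drop off the top of the second file, but the newly logged lines should still be listed.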

Now for the run dumping interrupts, starting before the freeze and sampled every 10 seconds, followed by the results near the end, when the "Freeze" ends:

(Before "Freeze")

    # while true ; do cat /proc/interrupts ; date ; sleep 10 ; done | tee /tmp/int.txt

Sat Mar 31 15:14:21 PDT 2007

CPU0 CPU1

0: 73576 0 IO-APIC-edge timer

1: 407 0 IO-APIC-edge i8042

8: 16896 0 IO-APIC-edge rtc

9: 2 0 IO-APIC-level acpi

12: 114 0 IO-APIC-edge i8042

14: 23460 0 IO-APIC-edge libata

15: 1618 0 IO-APIC-edge libata

50: 14095 0 IO-APIC-level uhci_hcd:usb2, HDA Intel

58: 0 0 IO-APIC-level uhci_hcd:usb3

66: 0 0 IO-APIC-level uhci_hcd:usb4

74: 1836 0 PCI-MSI eth0

169: 15947 0 IO-APIC-level nvidia

177: 13871 0 IO-APIC-level ipw3945

185: 7 0 IO-APIC-level yenta

233: 51368 0 IO-APIC-level uhci_hcd:usb1, ehci_hcd:usb5

NMI: 0 0

LOC: 73381 73376

ERR: 0

MIS: 0

Sat Mar 31 15:14:31 PDT 2007

CPU0 CPU1

0: 76079 0 IO-APIC-edge timer

1: 407 0 IO-APIC-edge i8042

8: 16896 0 IO-APIC-edge rtc

9: 2 0 IO-APIC-level acpi

12: 114 0 IO-APIC-edge i8042

14: 24722 0 IO-APIC-edge libata

15: 1712 0 IO-APIC-edge libata

50: 14957 0 IO-APIC-level uhci_hcd:usb2, HDA Intel

58: 0 0 IO-APIC-level uhci_hcd:usb3

66: 0 0 IO-APIC-level uhci_hcd:usb4

74: 3316 0 PCI-MSI eth0

169: 16555 0 IO-APIC-level nvidia

177: 14339 0 IO-APIC-level ipw3945

185: 7 0 IO-APIC-level yenta

233: 53376 0 IO-APIC-level uhci_hcd:usb1, ehci_hcd:usb5

NMI: 0 0

LOC: 75884 75879

ERR: 0

MIS: 0

Sat Mar 31 15:14:41 PDT 2007

CPU0 CPU1

0: 78581 0 IO-APIC-edge timer

1: 407 0 IO-APIC-edge i8042

8: 16896 0 IO-APIC-edge rtc

9: 2 0 IO-APIC-level acpi

12: 114 0 IO-APIC-edge i8042

14: 25124 0 IO-APIC-edge libata

15: 1769 0 IO-APIC-edge libata

50: 16874 0 IO-APIC-level uhci_hcd:usb2, HDA Intel

58: 0 0 IO-APIC-level uhci_hcd:usb3

66: 0 0 IO-APIC-level uhci_hcd:usb4

74: 3931 0 PCI-MSI eth0

169: 17177 0 IO-APIC-level nvidia

177: 14759 0 IO-APIC-level ipw3945

185: 7 0 IO-APIC-level yenta

233: 55384 0 IO-APIC-level uhci_hcd:usb1, ehci_hcd:usb5

NMI: 0 0

LOC: 78386 78381

ERR: 0

MIS: 0

Sat Mar 31 15:14:51 PDT 2007

(This, below, IIRC, is the first sample after the freeze started. The one above would be before.)

CPU0 CPU1

0: 81084 0 IO-APIC-edge timer

1: 407 0 IO-APIC-edge i8042

8: 16896 0 IO-APIC-edge rtc

9: 2 0 IO-APIC-level acpi

12: 114 0 IO-APIC-edge i8042

14: 25189 0 IO-APIC-edge libata

15: 1789 0 IO-APIC-edge libata

50: 17018 0 IO-APIC-level uhci_hcd:usb2, HDA Intel

58: 0 0 IO-APIC-level uhci_hcd:usb3

66: 0 0 IO-APIC-level uhci_hcd:usb4

74: 3945 0 PCI-MSI eth0

169: 17785 0 IO-APIC-level nvidia

177: 15112 0 IO-APIC-level ipw3945

185: 7 0 IO-APIC-level yenta

233: 57388 0 IO-APIC-level uhci_hcd:usb1, ehci_hcd:usb5

NMI: 0 0

LOC: 80889 80884

ERR: 0

MIS: 0

Sat Mar 31 15:15:01 PDT 2007

(Skipping ahead to around the time when the "Freeze" ends:)

CPU0 CPU1

0: 134522 0 IO-APIC-edge timer

1: 407 0 IO-APIC-edge i8042

8: 16896 0 IO-APIC-edge rtc

9: 2 0 IO-APIC-level acpi

12: 114 0 IO-APIC-edge i8042

14: 25546 0 IO-APIC-edge libata

15: 2213 0 IO-APIC-edge libata

50: 20975 0 IO-APIC-level uhci_hcd:usb2, HDA Intel

58: 0 0 IO-APIC-level uhci_hcd:usb3

66: 0 0 IO-APIC-level uhci_hcd:usb4

74: 5669 0 PCI-MSI eth0

169: 30757 0 IO-APIC-level nvidia

177: 22786 0 IO-APIC-level ipw3945

185: 7 0 IO-APIC-level yenta

233: 100164 0 IO-APIC-level uhci_hcd:usb1, ehci_hcd:usb5

NMI: 0 0

LOC: 134327 134322

ERR: 0

MIS: 0

Sat Mar 31 15:18:35 PDT 2007

CPU0 CPU1

0: 137025 0 IO-APIC-edge timer

1: 407 0 IO-APIC-edge i8042

8: 16896 0 IO-APIC-edge rtc

9: 2 0 IO-APIC-level acpi

12: 114 0 IO-APIC-edge i8042

14: 25551 0 IO-APIC-edge libata

15: 2233 0 IO-APIC-edge libata

50: 21122 0 IO-APIC-level uhci_hcd:usb2, HDA Intel

58: 0 0 IO-APIC-level uhci_hcd:usb3

66: 0 0 IO-APIC-level uhci_hcd:usb4

74: 5676 0 PCI-MSI eth0

169: 31365 0 IO-APIC-level nvidia

177: 23160 0 IO-APIC-level ipw3945

185: 7 0 IO-APIC-level yenta

233: 102158 0 IO-APIC-level uhci_hcd:usb1, ehci_hcd:usb5

NMI: 0 0

LOC: 136830 136825

ERR: 0

MIS: 0

Sat Mar 31 15:18:45 PDT 2007

CPU0 CPU1

0: 139528 0 IO-APIC-edge timer

1: 407 0 IO-APIC-edge i8042

8: 16896 0 IO-APIC-edge rtc

9: 2 0 IO-APIC-level acpi

12: 114 0 IO-APIC-edge i8042

14: 25574 0 IO-APIC-edge libata

15: 2253 0 IO-APIC-edge libata

50: 22368 0 IO-APIC-level uhci_hcd:usb2, HDA Intel

58: 0 0 IO-APIC-level uhci_hcd:usb3

66: 0 0 IO-APIC-level uhci_hcd:usb4

74: 5684 0 PCI-MSI eth0

169: 31973 0 IO-APIC-level nvidia

177: 23559 0 IO-APIC-level ipw3945

185: 7 0 IO-APIC-level yenta

233: 104152 0 IO-APIC-level uhci_hcd:usb1, ehci_hcd:usb5

NMI: 0 0

LOC: 139333 139328

ERR: 0

MIS: 0

Sat Mar 31 15:18:55 PDT 2007

(The freeze stopped somewhere in here. Perhaps the previous, or the next.)

CPU0 CPU1

0: 142031 0 IO-APIC-edge timer

1: 407 0 IO-APIC-edge i8042

8: 16896 0 IO-APIC-edge rtc

9: 2 0 IO-APIC-level acpi

12: 114 0 IO-APIC-edge i8042

14: 25579 0 IO-APIC-edge libata

15: 2273 0 IO-APIC-edge libata

50: 22510 0 IO-APIC-level uhci_hcd:usb2, HDA Intel

58: 0 0 IO-APIC-level uhci_hcd:usb3

66: 0 0 IO-APIC-level uhci_hcd:usb4

74: 5690 0 PCI-MSI eth0

169: 32580 0 IO-APIC-level nvidia

177: 23936 0 IO-APIC-level ipw3945

185: 7 0 IO-APIC-level yenta

233: 106150 0 IO-APIC-level uhci_hcd:usb1, ehci_hcd:usb5

NMI: 0 0

LOC: 141836 141831

ERR: 0

MIS: 0

Sat Mar 31 15:19:05 PDT 2007

CPU0 CPU1

0: 144534 0 IO-APIC-edge timer

1: 407 0 IO-APIC-edge i8042

8: 16896 0 IO-APIC-edge rtc

9: 2 0 IO-APIC-level acpi

12: 114 0 IO-APIC-edge i8042

14: 25584 0 IO-APIC-edge libata

15: 2293 0 IO-APIC-edge libata

50: 22656 0 IO-APIC-level uhci_hcd:usb2, HDA Intel

58: 0 0 IO-APIC-level uhci_hcd:usb3

66: 0 0 IO-APIC-level uhci_hcd:usb4

74: 5696 0 PCI-MSI eth0

169: 33188 0 IO-APIC-level nvidia

177: 24328 0 IO-APIC-level ipw3945

185: 7 0 IO-APIC-level yenta

233: 108146 0 IO-APIC-level uhci_hcd:usb1, ehci_hcd:usb5

NMI: 0 0

LOC: 144339 144334

ERR: 0

MIS: 0

Sat Mar 31 15:19:15 PDT 2007

CPU0 CPU1

0: 147037 0 IO-APIC-edge timer

1: 407 0 IO-APIC-edge i8042

8: 16898 0 IO-APIC-edge rtc

9: 2 0 IO-APIC-level acpi

12: 114 0 IO-APIC-edge i8042

14: 25725 0 IO-APIC-edge libata

15: 2355 0 IO-APIC-edge libata

50: 23777 0 IO-APIC-level uhci_hcd:usb2, HDA Intel

58: 0 0 IO-APIC-level uhci_hcd:usb3

66: 0 0 IO-APIC-level uhci_hcd:usb4

74: 5712 0 PCI-MSI eth0

169: 33817 0 IO-APIC-level nvidia

177: 24702 0 IO-APIC-level ipw3945

185: 7 0 IO-APIC-level yenta

233: 110153 0 IO-APIC-level uhci_hcd:usb1, ehci_hcd:usb5

NMI: 0 0

LOC: 146842 146837

ERR: 0

MIS: 0

Sat Mar 31 15:19:25 PDT 2007

HTH.

Please let me know if you would like more tests run, and if you would prefer to have vmware run with a debugging level, and what level you would like it run with.

For now, I am going to keep running the (working) 2.6.20.4 kernel unless you have other tests you think would help you identify the potential hardware/kernel/VMware interoperability "freeze" issue.

KevinG
Immortal

Hi dugan,

Thanks for taking the time to perform these tests.

I will let Petr know that you have posted additional information.

Thanks everyone for your cooperation while this issue is under investigation.

RDPetruska
Leadership

>with my SpeedStep switched OFF in the bios, which according to my Dell bios page locks it in lowest performance mode (and it certainly felt like it).

Then, personally, I would take this issue up with Dell for violating Intel's design. Refer to http://www.intel.com/cd/channel/reseller/asmo-na/eng/products/desktop/processor/processors/core2duo/... and other similar pages on Intel's site describing their SpeedStep and Enhanced SpeedStep technology. Pay close attention to the screen shots: when SpeedStep is disabled (i.e. turned OFF), the processors do NOT adjust their frequency, and are locked at the MAXIMUM power/frequency. If Dell is doing the opposite, then I believe it is time to file a class action suit against Dell for violating Intel's advertised design of their technology, and for lying to their customers.

P.S. Looks like one more reason I choose to avoid any of the big name brand manufacturers.

jsa
Enthusiast

Oh, climb down off your soap box. When Dell pays for the processor it becomes theirs and they can do with it as they please.

They had a good reason, namely preserving battery.

Let's keep this thread on track.

The problem at hand is Xserver input freezes, which are not limited to Dells.

petr
VMware Employee

Yep. Take a look at IRQ 8 in your samples: it did not increment while you were dumping data from /proc/interrupts, and only started working again after 5 minutes (after the 32-bit counter wrapped around). There is no fix other than (1) booting with hpet=disable (both disable and disabled should work, since the code tests only the first 7 characters), (2) building your kernel without HPET support, or (3) disabling HPET in the BIOS.

There is nothing else we can do. On kernels after 2.6.21 we can try to use the kernel's NOHZ infrastructure to provide precise timing for virtual machines, but if you have a traditional HZ-based kernel you need a working /dev/rtc, and RTC emulation over HPET is not a working one. I promise to dig up sample code...
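The five-minute figure is consistent with a 32-bit counter wrapping around at the HPET tick rate. As a rough check only, the sketch below assumes the 14318180 Hz rate that a dmesg posted later in this thread reports for hpet0; other machines may report a different rate, and `wrap_seconds` is an illustrative helper name.

```shell
#!/bin/sh
# Time for a 32-bit counter to wrap at a given tick rate, in whole seconds.
# wrap_seconds is an illustrative helper; the default rate is an assumption
# taken from the "hpet0: ... 14318180 Hz" line in a dmesg in this thread.
wrap_seconds() {
    hz=${1:-14318180}
    echo $(( 4294967296 / hz ))   # 2^32 ticks divided by ticks per second
}

wrap_seconds   # prints 299, i.e. just under 5 minutes
```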

petr
VMware Employee

I do not think that Xserver input freezes. You have just grabbed focus to the VM, and you cannot release it until the timer expires, which happens in 5 minutes. If you have a remote connection, you could verify it...

Also, can you please post the full 'dmesg' from a kernel booted with the hpet=disable option? All your symptoms look the same as the others', and it helped the others, so I wonder what's different with your box. Then, can you (before starting your VM) start

    (while true; do
        date
        cat /proc/interrupts | grep '[08C]:'
        sleep 10
    done) > /tmp/interrupt-stats

and then, after you observe a lockup, provide the samples logged when the lockup started, a few from the middle of the lockup, and finally samples from the time the lockup ended?
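Once such a log exists, the per-interval growth of the timer and rtc counters can be summarised so that a stalled IRQ 8 stands out. This is only a sketch: `irq_deltas` is a hypothetical helper name, it looks only at the first (CPU0) column, and it assumes each date line is written before the counters it belongs to, as in the loop above.

```shell
#!/bin/sh
# Summarise a log written by a "date; grep ... /proc/interrupts; sleep" loop.
# For every sample after the first, print the timestamp and how far the CPU0
# counters for IRQ 0 (timer) and IRQ 8 (rtc) advanced since the previous sample.
# irq_deltas is a hypothetical helper name; it assumes each date line comes
# before the counters it belongs to.
irq_deltas() {
    awk '
        NF && $1 !~ /:$/ { stamp = $0; next }   # no "label:" prefix, so a date line
        $1 == "0:"       { timer = $2; next }   # IRQ 0, CPU0 column
        $1 == "8:" {                            # IRQ 8, CPU0 column
            if (seen) {
                print stamp ": timer +" (timer - prev_timer) ", rtc +" ($2 - prev_rtc)
            }
            prev_timer = timer; prev_rtc = $2; seen = 1
        }
    ' "$@"
}

# Typical use:
#   irq_deltas /tmp/interrupt-stats
```

A run of samples showing "rtc +0" while the timer count keeps growing would correspond to the IRQ 8 stall described earlier in the thread.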

jsa
Enthusiast

>I do not think that Xserver input freezes. You have just grabbed focus to the VM, and you cannot release it until the timer expires, which happens in 5 minutes.

When I said X input freezes, I was trying to describe symptoms, not necessarily the cause. I do know that X output still occurs, as the clock is updating.

>If you have a remote connection, you could verify it...

What would you suggest as the best method of verifying this? As I have mentioned, an ssh session into the machine from another machine is possible during the freeze, so the entire machine is not frozen. The cursor also moves.

>Also, can you please post the full 'dmesg' from a kernel booted with the hpet=disable option?

OK, Petr, here is a boot-through-lockup dmesg printout. I'll run the other test mentioned above today.

Bootdata ok (command line is root=/dev/sda2 vga=0x314 resume=/dev/sda1 splash=silent hpet=disable)

Linux version 2.6.18.2-34-default (geeko@buildhost) (gcc version 4.1.2 20061115 (prerelease) (SUSE Linux)) #1 SMP Mon Nov 27 11:46:27 UTC 2006

BIOS-provided physical RAM map:

BIOS-e820: 0000000000000000 - 000000000009f000 (usable)

BIOS-e820: 000000000009f000 - 00000000000a0000 (reserved)

BIOS-e820: 0000000000100000 - 000000007fed3400 (usable)

BIOS-e820: 000000007fed3400 - 0000000080000000 (reserved)

BIOS-e820: 00000000f0000000 - 00000000f4007000 (reserved)

BIOS-e820: 00000000f4008000 - 00000000f400c000 (reserved)

BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)

BIOS-e820: 00000000fed20000 - 00000000feda0000 (reserved)

BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved)

BIOS-e820: 00000000ffb00000 - 0000000100000000 (reserved)

DMI 2.4 present.

ACPI: RSDP (v000 DELL ) @ 0x00000000000fc1b0

ACPI: RSDT (v001 DELL M07 0x27d70205 ASL 0x00000061) @ 0x000000007fed39cd

ACPI: FADT (v001 DELL M07 0x27d70205 ASL 0x00000061) @ 0x000000007fed4800

ACPI: HPET (v001 DELL M07 0x00000001 ASL 0x00000061) @ 0x000000007fed4f00

ACPI: MADT (v001 DELL M07 0x27d70205 ASL 0x00000047) @ 0x000000007fed5000

ACPI: MCFG (v016 DELL M07 0x27d70205 ASL 0x00000061) @ 0x000000007fed4fc0

ACPI: SLIC (v001 DELL M07 0x27d70205 ASL 0x00000061) @ 0x000000007fed509c

ACPI: BOOT (v001 DELL M07 0x27d70205 ASL 0x00000061) @ 0x000000007fed4bc0

ACPI: SSDT (v001 PmRef CpuPm 0x00003000 INTL 0x20050624) @ 0x000000007fed3a0d

ACPI: DSDT (v001 INT430 SYSFexxx 0x00001001 INTL 0x20050624) @ 0x0000000000000000

No NUMA configuration found

Faking a node at 0000000000000000-000000007fed3000

Bootmem setup node 0 0000000000000000-000000007fed3000

No mptable found.

On node 0 totalpages: 515679

DMA zone: 2895 pages, LIFO batch:0

DMA32 zone: 512784 pages, LIFO batch:31

ACPI: PM-Timer IO Port: 0x1008

ACPI: Local APIC address 0xfee00000

ACPI: LAPIC (acpi_id\[0x00] lapic_id\[0x00] enabled)

Processor #0 6:15 APIC version 20

ACPI: LAPIC (acpi_id\[0x01] lapic_id\[0x01] enabled)

Processor #1 6:15 APIC version 20

ACPI: LAPIC_NMI (acpi_id\[0x00] high edge lint\[0x1])

ACPI: LAPIC_NMI (acpi_id\[0x01] high edge lint\[0x1])

ACPI: IOAPIC (id\[0x02] address\[0xfec00000] gsi_base[0])

IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23

ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)

ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)

ACPI: IRQ0 used by override.

ACPI: IRQ2 used by override.

ACPI: IRQ9 used by override.

Setting APIC routing to physical flat

ACPI: HPET id: 0x8086a201 base: 0xfed00000

Using ACPI (MADT) for SMP configuration information

Allocating PCI resources starting at 88000000 (gap: 80000000:70000000)

SMP: Allowing 2 CPUs, 0 hotplug CPUs

Built 1 zonelists. Total pages: 515679

Kernel command line: root=/dev/sda2 vga=0x314 resume=/dev/sda1 splash=silent hpet=disable

bootsplash: silent mode.

Initializing CPU#0

PID hash table entries: 4096 (order: 12, 32768 bytes)

time.c: Using 14.318180 MHz WALL HPET GTOD HPET/TSC timer.

time.c: Detected 2161.249 MHz processor.

Console: colour dummy device 80x25

Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes)

Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes)

Checking aperture...

Memory: 2056072k/2095948k available (1915k kernel code, 39488k reserved, 1278k data, 188k init)

Calibrating delay using timer specific routine.. 4328.26 BogoMIPS (lpj=8656530)

Security Framework v1.0.0 initialized

Mount-cache hash table entries: 256

CPU: L1 I cache: 32K, L1 D cache: 32K

CPU: L2 cache: 4096K

using mwait in idle threads.

CPU: Physical Processor ID: 0

CPU: Processor Core ID: 0

CPU0: Thermal monitoring enabled (TM2)

SMP alternatives: switching to UP code

checking if image is initramfs... it is

Freeing initrd memory: 3377k freed

ACPI: Core revision 20060707

Using local APIC timer interrupts.

result 10390610

Detected 10.390 MHz APIC timer.

SMP alternatives: switching to SMP code

Booting processor 1/2 APIC 0x1

Initializing CPU#1

Calibrating delay using timer specific routine.. 4322.57 BogoMIPS (lpj=8645150)

CPU: L1 I cache: 32K, L1 D cache: 32K

CPU: L2 cache: 4096K

CPU: Physical Processor ID: 0

CPU: Processor Core ID: 1

CPU1: Thermal monitoring enabled (TM2)

Intel(R) Core(TM)2 CPU T7400 @ 2.16GHz stepping 06

CPU 1: Syncing TSC to CPU 0.

CPU 1: synchronized TSC with CPU 0 (last diff -7 cycles, maxerr 1105 cycles)

Brought up 2 CPUs

testing NMI watchdog ... OK.

migration_cost=25

NET: Registered protocol family 16

ACPI: bus type pci registered

PCI: Using MMCONFIG at f0000000

ACPI: Interpreter enabled

ACPI: Using IOAPIC for interrupt routing

ACPI: PCI Root Bridge \[PCI0] (0000:00)

PCI: Probing PCI hardware (bus 00)

ACPI: Assume root bridge \[\_SB_.PCI0] bus is 0

PCI: Ignoring BAR0-3 of IDE controller 0000:00:1f.2

Boot video device is 0000:01:00.0

PCI: Transparent bridge - 0000:00:1e.0

ACPI: PCI Interrupt Routing Table \[\_SB_.PCI0._PRT]

ACPI: PCI Interrupt Link \[LNKA] (IRQs 9 10 11) *4

ACPI: PCI Interrupt Link \[LNKB] (IRQs *5 7)

ACPI: PCI Interrupt Link \[LNKC] (IRQs *9 10 11)

ACPI: PCI Interrupt Link \[LNKD] (IRQs 5 7 9 10 11) *3

ACPI: PCI Interrupt Link \[LNKE] (IRQs 3 4 5 6 7 9 *10 11 12 14 15)

ACPI: PCI Interrupt Link \[LNKF] (IRQs 3 4 5 6 7 9 10 *11 12 14 15)

ACPI: PCI Interrupt Link \[LNKG] (IRQs 3 4 5 6 7 *9 10 11 12 14 15)

ACPI: PCI Interrupt Link \[LNKH] (IRQs 3 4 5 6 *7 9 10 11 12 14 15)

ACPI: PCI Interrupt Routing Table \[\_SB_.PCI0.AGP_._PRT]

ACPI: PCI Interrupt Routing Table \[\_SB_.PCI0.PCIE._PRT]

ACPI: PCI Interrupt Routing Table \[\_SB_.PCI0.RP01._PRT]

ACPI: PCI Interrupt Routing Table \[\_SB_.PCI0.RP02._PRT]

ACPI: PCI Interrupt Routing Table \[\_SB_.PCI0.RP04._PRT]

PCI: Using ACPI for IRQ routing

PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report

hpet0: at MMIO 0xfed00000 (virtual 0xffffffffff5fe000), IRQs 2, 8, 0

hpet0: 3 64-bit timers, 14318180 Hz

PCI-GART: No AMD northbridge found.

PCI: Bridge: 0000:00:01.0

IO window: e000-efff

MEM window: efd00000-efefffff

PREFETCH window: d0000000-dfffffff

PCI: Bridge: 0000:00:1c.0

IO window: disabled.

MEM window: disabled.

PREFETCH window: disabled.

PCI: Bridge: 0000:00:1c.1

IO window: disabled.

MEM window: efc00000-efcfffff

PREFETCH window: disabled.

PCI: Bridge: 0000:00:1c.3

IO window: d000-dfff

MEM window: efa00000-efbfffff

PREFETCH window: e0000000-e01fffff

PCI: Bridge: 0000:00:1e.0

IO window: disabled.

MEM window: ef900000-ef9fffff

PREFETCH window: disabled.

GSI 16 sharing vector 0xA9 and IRQ 16

ACPI: PCI Interrupt 0000:00:01.0[A] -> GSI 16 (level, low) -> IRQ 169

PCI: Setting latency timer of device 0000:00:01.0 to 64

ACPI: PCI Interrupt 0000:00:1c.0[A] -> GSI 16 (level, low) -> IRQ 169

PCI: Setting latency timer of device 0000:00:1c.0 to 64

GSI 17 sharing vector 0xB1 and IRQ 17

ACPI: PCI Interrupt 0000:00:1c.1[B] -> GSI 17 (level, low) -> IRQ 177

PCI: Setting latency timer of device 0000:00:1c.1 to 64

GSI 18 sharing vector 0xB9 and IRQ 18

ACPI: PCI Interrupt 0000:00:1c.3[D] -> GSI 19 (level, low) -> IRQ 185

PCI: Setting latency timer of device 0000:00:1c.3 to 64

PCI: Setting latency timer of device 0000:00:1e.0 to 64

NET: Registered protocol family 2

IP route cache hash table entries: 65536 (order: 7, 524288 bytes)

TCP established hash table entries: 262144 (order: 10, 4194304 bytes)

TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)

TCP: Hash tables configured (established 262144 bind 65536)

TCP reno registered

Simple Boot Flag at 0x79 set to 0x80

audit: initializing netlink socket (disabled)

audit(1175353753.820:1): initialized

Total HugeTLB memory allocated, 0

VFS: Disk quotas dquot_6.5.1

Dquot-cache hash table entries: 512 (order 0, 4096 bytes)

Initializing Cryptographic API

io scheduler noop registered

io scheduler anticipatory registered

io scheduler deadline registered

io scheduler cfq registered (default)

PCI: Setting latency timer of device 0000:00:01.0 to 64

assign_interrupt_mode Found MSI capability

Allocate Port Service\[0000:00:01.0:pcie00]

Allocate Port Service\[0000:00:01.0:pcie03]

PCI: Setting latency timer of device 0000:00:1c.0 to 64

assign_interrupt_mode Found MSI capability

Allocate Port Service\[0000:00:1c.0:pcie00]

Allocate Port Service\[0000:00:1c.0:pcie02]

Allocate Port Service\[0000:00:1c.0:pcie03]

PCI: Setting latency timer of device 0000:00:1c.1 to 64

assign_interrupt_mode Found MSI capability

Allocate Port Service\[0000:00:1c.1:pcie00]

Allocate Port Service\[0000:00:1c.1:pcie02]

Allocate Port Service\[0000:00:1c.1:pcie03]

PCI: Setting latency timer of device 0000:00:1c.3 to 64

assign_interrupt_mode Found MSI capability

Allocate Port Service\[0000:00:1c.3:pcie00]

Allocate Port Service\[0000:00:1c.3:pcie02]

Allocate Port Service\[0000:00:1c.3:pcie03]

vesafb: framebuffer at 0xd0000000, mapped to 0xffffc20010100000, using 3750k, total 16384k

vesafb: mode is 800x600x16, linelength=1600, pages=16

vesafb: scrolling: redraw

vesafb: Truecolor: size=0:5:6:5, shift=0:11:5:0

bootsplash 3.1.6-2004/03/31: looking for picture... 1043

jsa
Enthusiast

Interrupts samples:

Pre-lockup----


Sun Apr 1 12:32:19 AKDT 2007

0: 409561 0 IO-APIC-edge timer

8: 67648 0 IO-APIC-edge rtc

50: 23439 0 IO-APIC-level ehci_hcd:usb1, uhci_hcd:usb2

58: 0 0 IO-APIC-level uhci_hcd:usb4

LOC: 409456 409381

Sun Apr 1 12:32:29 AKDT 2007

0: 412103 0 IO-APIC-edge timer

8: 70194 0 IO-APIC-edge rtc

50: 24091 0 IO-APIC-level ehci_hcd:usb1, uhci_hcd:usb2

58: 0 0 IO-APIC-level uhci_hcd:usb4

LOC: 411998 411923

Sun Apr 1 12:32:39 AKDT 2007

0: 414609 0 IO-APIC-edge timer

8: 70194 0 IO-APIC-edge rtc

50: 24784 0 IO-APIC-level ehci_hcd:usb1, uhci_hcd:usb2

58: 0 0 IO-APIC-level uhci_hcd:usb4

LOC: 414504 414429

Sun Apr 1 12:32:49 AKDT 2007

0: 417116 0 IO-APIC-edge timer

8: 70194 0 IO-APIC-edge rtc

50: 25318 0 IO-APIC-level ehci_hcd:usb1, uhci_hcd:usb2

58: 0 0 IO-APIC-level uhci_hcd:usb4

LOC: 417011 416936

Lockup occurred at 12:33 (exact seconds not recorded)

Sun Apr 1 12:32:59 AKDT 2007

0: 419618 0 IO-APIC-edge timer

8: 70194 0 IO-APIC-edge rtc

50: 25629 0 IO-APIC-level ehci_hcd:usb1, uhci_hcd:usb2

58: 0 0 IO-APIC-level uhci_hcd:usb4

LOC: 419513 419438

Sun Apr 1 12:33:09 AKDT 2007

0: 422120 0 IO-APIC-edge timer

8: 70194 0 IO-APIC-edge rtc

50: 25629 0 IO-APIC-level ehci_hcd:usb1, uhci_hcd:usb2

58: 0 0 IO-APIC-level uhci_hcd:usb4

LOC: 422015 421940

Sun Apr 1 12:33:19 AKDT 2007

0: 424622 0 IO-APIC-edge timer

8: 70194 0 IO-APIC-edge rtc

50: 25629 0 IO-APIC-level ehci_hcd:usb1, uhci_hcd:usb2

58: 0 0 IO-APIC-level uhci_hcd:usb4

LOC: 424517 424442

Sun Apr 1 12:33:29 AKDT 2007

0: 427124 0 IO-APIC-edge timer

8: 70194 0 IO-APIC-edge rtc

50: 25634 0 IO-APIC-level ehci_hcd:usb1, uhci_hcd:usb2

58: 0 0 IO-APIC-level uhci_hcd:usb4

LOC: 427019 426944

Sun Apr 1 12:33:39 AKDT 2007

0: 429626 0 IO-APIC-edge timer

8: 70194 0 IO-APIC-edge rtc

50: 25634 0 IO-APIC-level ehci_hcd:usb1, uhci_hcd:usb2

58: 0 0 IO-APIC-level uhci_hcd:usb4

LOC: 429521 429446

Sun Apr 1 12:33:49 AKDT 2007

0: 432128 0 IO-APIC-edge timer

8: 70194 0 IO-APIC-edge rtc

50: 25689 0 IO-APIC-level ehci_hcd:usb1, uhci_hcd:usb2

58: 0 0 IO-APIC-level uhci_hcd:usb4

LOC: 432023 431948

Sun Apr 1 12:33:59 AKDT 2007

0: 434630 0 IO-APIC-edge timer

8: 70194 0 IO-APIC-edge rtc

50: 25689 0 IO-APIC-level ehci_hcd:usb1, uhci_hcd:usb2

58: 0 0 IO-APIC-level uhci_hcd:usb4

LOC: 434525 434450

Sun Apr 1 12:34:09 AKDT 2007

0: 437132 0 IO-APIC-edge timer

8: 70194 0 IO-APIC-edge rtc

50: 25689 0 IO-APIC-level ehci_hcd:usb1, uhci_hcd:usb2

58: 0 0 IO-APIC-level uhci_hcd:usb4

LOC: 437027 436952

\------- Post lockup...

Sun Apr 1 12:37:09 AKDT 2007

0: 482168 0 IO-APIC-edge timer

8: 70194 0 IO-APIC-edge rtc

50: 26838 0 IO-APIC-level ehci_hcd:usb1, uhci_hcd:usb2

58: 0 0 IO-APIC-level uhci_hcd:usb4

LOC: 482063 481988

Sun Apr 1 12:37:19 AKDT 2007

0: 484670 0 IO-APIC-edge timer

8: 70194 0 IO-APIC-edge rtc

50: 26838 0 IO-APIC-level ehci_hcd:usb1, uhci_hcd:usb2

58: 0 0 IO-APIC-level uhci_hcd:usb4

LOC: 484565 484490

Sun Apr 1 12:37:29 AKDT 2007

0: 487172 0 IO-APIC-edge timer

8: 70196 0 IO-APIC-edge rtc

50: 26889 0 IO-APIC-level ehci_hcd:usb1, uhci_hcd:usb2

58: 0 0 IO-APIC-level uhci_hcd:usb4

LOC: 487067 486992

Sun Apr 1 12:37:39 AKDT 2007

0: 489674 0 IO-APIC-edge timer

8: 70196 0 IO-APIC-edge rtc

50: 26889 0 IO-APIC-level ehci_hcd:usb1, uhci_hcd:usb2

58: 0 0 IO-APIC-level uhci_hcd:usb4

LOC: 489569 489494

jsa
Enthusiast

>Yep. Take a look at IRQ 8 in your samples - it did not increment while you were dumping data from /proc/interrupts

It didn't increment BEFORE the lockup either. Is that normal?

\- neither did mine, with hpet=disable

jlindgren33
Contributor

Maybe it's just my lack of understanding of HPET, but wouldn't disabling it lead to problems with multimedia streams, interrupt requests, etc.?

jsa
Enthusiast

Disabling HPET forces the kernel to use a different source for timers: it falls back to the PIT (Programmable Interval Timer) for timing.

See

http://www.kernel.org/pub/linux/kernel/people/gregkh/lkn/lkn_pdf/ch09.pdf

There also seems to be another boot parameter that lets you choose your timer source, namely "clocksource".
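On kernels that expose the generic clocksource framework through sysfs, the source currently in use and the ones on offer can be read back after boot. The sketch below is hedged accordingly: `show_clocksource` is an illustrative helper name, and whether the sysfs directory exists on any of the kernels discussed in this thread is an assumption, so the function falls back to a plain message when it is missing.

```shell
#!/bin/sh
# Report the clock source the running kernel is using and the ones it offers.
# show_clocksource is an illustrative helper; the sysfs directory below exists
# only on kernels with the generic clocksource framework, so its presence on
# a given host is an assumption.
show_clocksource() {
    dir=${1:-/sys/devices/system/clocksource/clocksource0}
    if [ -r "$dir/current_clocksource" ]; then
        echo "current:   $(cat "$dir/current_clocksource")"
        echo "available: $(cat "$dir/available_clocksource")"
    else
        echo "no clocksource information under $dir"
    fi
}

show_clocksource
```

A name from the "available" list is what the `clocksource=` boot parameter would take, for example `clocksource=acpi_pm` on a machine that lists it.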

jlindgren33
Contributor

Gotcha - thanks for the clarification.

So, I just need to add hpet=disable right after the kernel boot image in the bootloader? I.e.: 'kernel-2.xx.xx.xx hpet=disable'

jsa
Enthusiast

Well, I would just type it in manually at the boot prompt for now when you intend to use VMware, at least until Petr can check it out a bit longer.

Where you make the permanent change depends on which boot loader you use and whether it proves to be the definitive solution.

For me, it didn't make any difference. I still get the 5 minute freeze even with hpet=disable.
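Whichever way the option is supplied, it may be worth confirming after boot that the kernel actually received it. A minimal sketch, assuming /proc/cmdline is available; `has_boot_option` is a hypothetical helper name, and it only shows that the option was passed, not that the kernel acted on it.

```shell
#!/bin/sh
# Succeed if the given option appears as a whole word on the kernel command line.
# The second argument (file to read) defaults to /proc/cmdline and exists mainly
# so the function can be tried against a saved copy.
has_boot_option() {
    file=${2:-/proc/cmdline}
    # Split the command line into one option per line and look for an exact match.
    tr ' ' '\n' < "$file" | grep -qxF -- "$1"
}

# Example:
#   has_boot_option hpet=disable && echo "hpet=disable was passed on this boot"
```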

jlindgren33
Contributor

Which distro are you running? I'm running openSUSE 10.2 w/ kernel 2.6.18.8-0.1.

jsa
Enthusiast

Same Distro, kernel 2.6.18.2-34.

Core 2 Duo.

dekker_dreyer
Contributor

Same lockup problems here. Dual P4 (not dual core), kernel 2.6.19-1.2895.fc6, Fedora, VMware Workstation 5.5.3, Windows 2000 guest. hpet=disable did not correct the problem. The dmesg output looks similar:

/dev/vmmon\[29489]: host clock rate change request 83 -> 1043

/dev/vmmon\[29489]: host clock rate change request 1043 -> 83

/dev/vmmon\[29489]: host clock rate change request 83 -> 1043

/dev/vmmon\[29489]: host clock rate change request 1043 -> 83

/dev/vmmon\[29489]: host clock rate change request 83 -> 1043

/dev/vmmon\[29489]: host clock rate change request 1043 -> 83

/dev/vmmon\[29489]: host clock rate change request 83 -> 1043

/dev/vmmon\[29489]: host clock rate change request 1043 -> 83

/dev/vmmon\[29489]: host clock rate change request 83 -> 1043

/dev/vmmon\[29489]: host clock rate change request 1043 -> 83

/dev/vmmon\[29489]: host clock rate change request 83 -> 1043

/dev/vmmon\[29489]: host clock rate change request 1043 -> 83

/dev/vmmon\[29489]: host clock rate change request 83 -> 1043

/dev/vmmon\[29489]: host clock rate change request 1043 -> 83

/dev/vmmon\[29489]: host clock rate change request 83 -> 1043

/dev/vmmon\[29489]: host clock rate change request 1043 -> 83

Interrupts as follows:

LOC: 1662577 1662576

Tue Apr 3 15:59:18 EDT 2007

0: 1672931 0 IO-APIC-edge timer

8: 1 0 IO-APIC-edge rtc

18: 218164 0 IO-APIC-fasteoi eth0

20: 0 0 IO-APIC-fasteoi libata

LOC: 1672581 1672580

== froze

Tue Apr 3 15:59:28 EDT 2007

0: 1682934 0 IO-APIC-edge timer

8: 1 0 IO-APIC-edge rtc

18: 219521 0 IO-APIC-fasteoi eth0

20: 0 0 IO-APIC-fasteoi libata

LOC: 1682570 1682569

== unfrozen

Tue Apr 3 15:59:38 EDT 2007

0: 1692944 0 IO-APIC-edge timer

8: 1 0 IO-APIC-edge rtc

18: 220402 0 IO-APIC-fasteoi eth0

20: 0 0 IO-APIC-fasteoi libata

LOC: 1692580 1692579

Tue Apr 3 15:59:48 EDT 2007

0: 1702951 0 IO-APIC-edge timer

8: 1 0 IO-APIC-edge rtc

18: 222174 0 IO-APIC-fasteoi eth0

20: 0 0 IO-APIC-fasteoi libata

LOC: 1702588 1702587

Tue Apr 3 15:59:58 EDT 2007

0: 1712978 0 IO-APIC-edge timer

8: 1 0 IO-APIC-edge rtc

18: 224012 0 IO-APIC-fasteoi eth0

20: 0 0 IO-APIC-fasteoi libata

LOC: 1712609 1712608

Tue Apr 3 16:00:08 EDT 2007

0: 1723095 0 IO-APIC-edge timer

8: 1 0 IO-APIC-edge rtc

18: 226302 0 IO-APIC-fasteoi eth0

20: 0 0 IO-APIC-fasteoi libata

LOC: 1722723 1722722

== froze

Tue Apr 3 16:00:18 EDT 2007

0: 1733094 0 IO-APIC-edge timer

8: 1 0 IO-APIC-edge rtc

18: 227618 0 IO-APIC-fasteoi eth0

20: 0 0 IO-APIC-fasteoi libata

LOC: 1732715 1732714

Tue Apr 3 16:00:28 EDT 2007

0: 1743103 0 IO-APIC-edge timer

8: 1 0 IO-APIC-edge rtc

18: 228105 0 IO-APIC-fasteoi eth0

20: 0 0 IO-APIC-fasteoi libata

LOC: 1742724 1742723

== unfrozen

Tue Apr 3 16:00:38 EDT 2007

0: 1753107 0 IO-APIC-edge timer

8: 1 0 IO-APIC-edge rtc

18: 229654 0 IO-APIC-fasteoi eth0

20: 0 0 IO-APIC-fasteoi libata

LOC: 1752728 1752727

Top shows that iowait pegs at 50% when the guest is frozen.

Cpu(s): 0.0%us, 0.2%sy, 0.0%ni, 50.0%id, 49.8%wa, 0.0%hi, 0.0%si, 0.0%st

These particular freezes were pretty short, but the freeze is usually a few minutes.
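The iowait share that top reports can also be derived from two readings of the aggregate "cpu" line in /proc/stat. This is a rough sketch under stated assumptions: it expects the usual field order (user, nice, system, idle, iowait, irq, softirq, steal), and `iowait_percent` is a hypothetical helper name.

```shell
#!/bin/sh
# Percentage of CPU time spent in iowait between two samples of the
# aggregate "cpu" line from /proc/stat.
# iowait_percent is an illustrative helper; it assumes the field order
# cpu user nice system idle iowait irq softirq steal.
iowait_percent() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total = 0
            # fields 2..9 are user..steal; later fields (guest) are skipped
            for (i = 2; i <= NF && i <= 9; i++) total += $i
            t[NR] = total
            w[NR] = $6                      # iowait is the 5th counter
        }
        END {
            if (t[2] > t[1]) printf "%.1f\n", 100 * (w[2] - w[1]) / (t[2] - t[1])
        }'
}

# Typical use on a live system (the 10-second interval is arbitrary):
#   a=$(head -n 1 /proc/stat); sleep 10; b=$(head -n 1 /proc/stat)
#   iowait_percent "$a" "$b"
```

Sampling once before and once during a freeze would show whether the iowait figure really jumps at that moment or is high all the time.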

jlindgren33
Contributor

This has me really nervous. I'm slated to build out a virtual environment for a customer and had planned on using VMware. I'm thinking I should start looking for another solution, as my experience with VMware on my laptop has been great when it's running, and not so great when it freezes every hour.

jsa
Enthusiast

Dekker:

These freezes are much too short to be the same problem.

Look at the times on the interrupt dumps. They are all within seconds of each other.

The lockups under discussion in this thread are reliably 5 minutes long.

You also said:

>Top shows that iowait pegs at 50% when the guest is frozen.

This sounds like a disk problem to me, perhaps reading/writing massive amounts of data on the virtual disk.

jsa
Enthusiast

Well, if the planned virtual environment is on a Core 2 Duo platform, I'd wait till this is solved.

I have VMware Server running on a dual-core Pentium D, and it's utterly reliable.

Workstation is a testing and development platform, not really intended for production.

Still, it is a difficult situation for me, as I rely on Workstation for my Vista, XP, and Win2k testbeds, and these versions are freezing.

jlindgren33
Contributor

I'm having the problem with Server and Workstation. But these are servers I'm building out; they are multi-CPU but not Core 2 Duo as my notebook is.

Oh, and for the record, hpet=disable didn't solve the problem for me either.
