VMware Communities
danslay
Contributor

WS 5.5.3 freeze on linux host FC6/x86_64 2.6.19 and 2.6.20 kernels

I've found that VMware Workstation 5.5.3 and Player both freeze up and lock the host when running under the standard Fedora Core 6 kernels 2.6.19 and 2.6.20 on an x86_64 host machine. The guest OS in all cases (I have 3 setups) is a fully updated Windows XP with VMware Tools installed.

The freeze usually occurs without warning a few hours after the guest OS (WinXP) has been up and running, and does not appear strongly linked to guest OS activity; it happens whether the guest is sitting idle or being used extensively.

I suspect, but am not sure, that a few seconds before the freeze, CPU activity maxes out, and then if you notice this and move the mouse out of the vmware window, you might escape the freeze. Usually the frozen system shows the cpu monitor frozen at a maximum level for one of the two CPUs. (VMware is only set to use a single CPU).

The freeze locks up the host OS as well as the guest, requiring a poweroff-reboot. (keyboard does not respond).

The system logs and vmware log show no obvious signs of distress before the freeze.

On one of the systems, I used the vmware-any-any-update108, but the same problem occurs.

This problem does NOT occur with the 2.6.18 kernel. Every month or so I've tried updating to the newest kernel to see if the issue has been solved, but so far, it always returns. Then back I go to 2.6.18.

Question: is this a known problem with 2.6.19 and 2.6.20 kernels, or is this something with a known fix?

Another thread on this topic suggested using the vmware-any-any-update108 and running vmware-install.pl. For the life of me, I cannot find any such vmware-install.pl file on my system, or I'd happily run it and see how that goes.

Thanks for any advice!

43 Replies
KevinG
Immortal

The vmware-install.pl script is what you would run to install VMware Workstation if you installed from the tar package. I assume you installed from the RPM, so you would have run the vmware-config.pl script.

When the system freezes, does the mouse pointer turn to an hour glass and you can move the mouse around but have no keyboard input?

danslay
Contributor

Yes, I install from the RPM and use vmware-config.pl. I've had none of the issues that other folks appear to have had in getting a working kernel module to compile and run. Everything in the installation and configuration works fine.

But no, the mouse pointer does not turn into an hourglass; it simply freezes in place as the usual arrow, and both the keyboard and the mouse lock up. The network connection also stops, so you cannot log in externally, or even ping the machine.

For the first minute or less of the freeze, a graphical system monitor that I keep running on the applications bar continues to update, but then it freezes too (at a maxed-out CPU level for one of the CPUs, but not the other), and the on-screen clock stops updating at that point. The disks do not seem to be thrashing during, or before, the freeze; there does not seem to be any unusual disk activity.

I've let it sit overnight on one occasion, but that did not wake it.

jlindgren33
Contributor

this seems to be a growing issue. there have been a lot of posts about freezes with recent kernels. i'm having the exact same problem. my VM is mostly unusable. i've had to actually install windows on one of my machines so i can work. (yuck!)

EmmanuelOtton
Contributor

I experience this exact problem too, with an x86_64 2.6.20-1.2933.fc6 kernel: the OS freezes after a few hours if a VMware virtual machine has been started.

aburch
Contributor

I am also having this same issue on one of my machines.

The machine is running: 2.6.20-1.2307.fc5

The guest OS: Windows XP SP2

Sometimes the log file shows the internal monitor error message, other times it won't show any errors. The host and guest will both hard freeze. Mouse cursor will stop responding, no keyboard, everything stops. Only way to shut it down is to hold the power button in for a few seconds.

It seems fairly random, but it happens often enough that we cannot use VMware on this machine anymore.

I have tried uninstalling VMware and reinstalling from the RPM and also from the tar file. I am using any-any-update108, and was using the 107 update before with the same problem.

Here is the relevant portion of the log:

Mar 28 11:58:42: mks| MKS lost grab

Mar 28 11:58:42: mks| MKS lost grab

Mar 28 11:58:46: mks| MKS lost grab

Mar 28 11:58:49: vcpu-1| Guest: toolbox: Got a logoff event.

Mar 28 11:58:49: vcpu-0| GuestRpc: Channel 2 reinitialized.

Mar 28 11:58:57: vcpu-0| Guest: toolbox: Got a logoff event.

Mar 28 11:59:13: vcpu-0| MONITOR PANIC: vcpu-0:DoubleFault @ 0x4020:0x55c34 (0x0,0x2c4c)

Mar 28 11:59:13: vcpu-0| Core dump with build build-34685

Mar 28 11:59:13: vcpu-0| Writing monitor corefile "/home/f1jmn00/vmware/Windows XP Professional 3/vmware-core0.gz"

Mar 28 11:59:13: vcpu-0| Beginning monitor coredump

Mar 28 11:59:13: vcpu-0| End monitor coredump

Mar 28 11:59:13: vcpu-0| Writing anonymous pages at pos: 401000

Mar 28 11:59:17: vcpu-0| Writing monitor corefile "/home/f1jmn00/vmware/Windows XP Professional 3/vmware-core1.gz"

Mar 28 11:59:17: vcpu-0| Beginning monitor coredump

Mar 28 11:59:17: vcpu-0| End monitor coredump

Mar 28 11:59:17: vcpu-0| Writing anonymous pages at pos: 401000

Mar 28 11:59:21: vcpu-0| Msg_Post: Error

Mar 28 11:59:21: vcpu-0| [msg.log.monpanic] *** VMware Workstation internal monitor error ***

Mar 28 11:59:21: vcpu-0| vcpu-0:DoubleFault @ 0x4020:0x55c34 (0x0,0x2c4c)

Mar 28 11:59:21: vcpu-0| Please report this problem by selecting menu item Help > VMware on the Web > Request Support, or by going to the Web page "http://www.vmware.com/info?id=8&logFile=%2fhome%2ff1jmn00%2fvmware%2fWindows%20XP%20Professional%203%2fvmware%2elog&coreLocation=%2fhome%2ff1jmn00%2fvmware%2fWindows%20XP%20Professional%203%2fvmware%2dcore%5b0%2d1%5d%2egz". Please provide us with the log file (/home/f1jmn00/vmware/Windows XP Professional 3/vmware.log) and the core file (/home/f1jmn00/vmware/Windows XP Professional 3/vmware-core[0-1].gz).

Mar 28 11:59:21: vcpu-0| [msg.log.monpanic.linuxdebug] If the problem is repeatable, please set 'Logging level' to 'Debug' in the Misc panel of Virtual Machine Settings. Then reproduce the incident and file it according to the instructions.

Mar 28 11:59:21: vcpu-0| [msg.log.monpanic.linux] To collect files to submit to VMware support, run vm-support.

Mar 28 11:59:21: vcpu-0| [msg.log.monpanic.finish] We will respond on the basis of your support entitlement.

Mar 28 11:59:21: vcpu-0| We appreciate your feedback,

Mar 28 11:59:21: vcpu-0| -- the VMware Workstation team.

Mar 28 11:59:21: vcpu-0| ----
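
For anyone comparing logs across machines, here is a minimal Python sketch that pulls the "MONITOR PANIC" entries out of a vmware.log. The regular expression is modeled only on the lines quoted in this thread, so the line layout on other builds is an assumption, and the function name `monitor_panics` is made up for illustration.

```python
import re

# Assumption: panic lines look like the ones quoted in this thread, e.g.
# "Mar 28 11:59:13: vcpu-0| MONITOR PANIC: vcpu-0:DoubleFault @ ..."
PANIC_RE = re.compile(
    r"^(?P<when>\w{3} +\d+ \d\d:\d\d:\d\d): (?P<thread>[\w-]+)\| "
    r"MONITOR PANIC: (?P<reason>.*)$"
)


def monitor_panics(lines):
    """Return (timestamp, thread, reason) for every MONITOR PANIC line."""
    found = []
    for line in lines:
        match = PANIC_RE.match(line.strip())
        if match:
            found.append(
                (match.group("when"), match.group("thread"), match.group("reason"))
            )
    return found


if __name__ == "__main__":
    # Two lines copied from the log excerpt above.
    sample = [
        "Mar 28 11:58:46: mks| MKS lost grab",
        "Mar 28 11:59:13: vcpu-0| MONITOR PANIC: vcpu-0:DoubleFault @ 0x4020:0x55c34 (0x0,0x2c4c)",
    ]
    for when, thread, reason in monitor_panics(sample):
        print(when, thread, reason)
```

To run it against a real log, pass an open file object instead of the sample list, for example `monitor_panics(open("vmware.log"))`.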


Help!

danslay
Contributor

The only thing I can suggest is going back to a 2.6.18 kernel. (Do you have a 64-bit dual-AMD system? I'm not sure if this is confined to that config or not, but that's what I have: dual Opterons in all cases, but different motherboards.) I have never had the problem with the 2.6.18 kernel, but it consistently appears with 2.6.19 and 2.6.20. Since this seems to be an unsolved mystery, I'm rolling the system I had upgraded to 2.6.20 back to 2.6.18.

Additional clues: I just had the freeze again (on a 2.6.20 kernel), while I was remotely logged in with a terminal. (someone was actively using vmware on the machine when it froze). Surprisingly the only thing that froze was X; the terminal log-in remained active, allowing me to kill vmware, which did not unfreeze things, and then reboot nicely.

I'm not at all sure if this is the normal freeze that I get. Those typically disable the network connection, but I don't know that I've ever been logged in externally when one has occurred. So perhaps something slightly different is going on this time, or perhaps having that external login active somehow affected the freeze.

Oddly, killing X (and gdm) did not bring X back; I had to reboot to get X back. On all other freezes I've encountered (those not involving vmware), killing X (and/or gdm) has always worked to restart X and bring back the graphical login screen, without a reboot.

In case this means anything to anyone, here are some log file entries from about the time of the freeze (which I think was probably about 14:51, plus or minus a minute or two). Note that the USB device mentioned in /var/log/messages was unplugged AFTER the freeze, and the USB adaptor was turned off in VMware (although it is still in the config, I think); this one is VMware Player.

vmware.log:

Mar 28 14:48:24: vmx| SCSI0:0: Command WRITE(10) took 1.239 seconds (ok)

Mar 28 14:48:24: vmx| SCSI0:0: Command WRITE(10) took 1.244 seconds (ok)

Mar 28 14:48:24: vmx| SCSI0:0: Command WRITE(10) took 1.245 seconds (ok)

Mar 28 14:48:24: vmx| SCSI0:0: Command WRITE(10) took 1.254 seconds (ok)

Mar 28 14:51:10: mks| MKS lost grab

Mar 28 14:51:25: mks| MKS lost grab

Mar 28 14:51:27: mks| MKS lost grab

Mar 28 14:51:29: mks| MKS lost grab

Mar 28 14:51:30: mks| MKS lost grab

Mar 28 14:51:32: mks| MKS lost grab

Mar 28 14:51:38: vcpu-0| DMA port 3: bad 4 byte access (2): 0xffffffff

(no entries in vmware.log after this one).

Xorg.0.log (this is probably from my attempts to kill X and gdm; this log does not have time stamps):

(EE) NVIDIA(0): The NVIDIA kernel module does not appear to be receiving

(EE) NVIDIA(0): interrupts generated by the NVIDIA graphics device

(EE) NVIDIA(0): PCI:1:0:0. Please see Chapter 5: Common Problems in the

(EE) NVIDIA(0): README for additional information.

(EE) NVIDIA(0): Failed to initialize the NVIDIA graphics device!

/var/log/messages:

Mar 28 14:43:50 yupana restorecond: Will not restore a file with more than one hard link (/etc/resolv.conf) Invalid argument

Mar 28 14:51:46 yupana kernel: NVRM: Xid (0001:00): 16, Head 00000000 Count 0088665c

Mar 28 14:51:48 yupana kernel: NVRM: Xid (0001:00): 8, Channel 00000000

Mar 28 14:51:54 yupana kernel: NVRM: Xid (0001:00): 16, Head 00000000 Count 0088665d

Mar 28 14:51:56 yupana kernel: NVRM: Xid (0001:00): 8, Channel 0000001e

Mar 28 14:52:02 yupana kernel: NVRM: Xid (0001:00): 16, Head 00000000 Count 0088665e

Mar 28 14:52:04 yupana kernel: NVRM: Xid (0001:00): 8, Channel 00000020

Mar 28 14:52:10 yupana kernel: NVRM: Xid (0001:00): 16, Head 00000000 Count 0088665f

Mar 28 14:52:12 yupana kernel: NVRM: Xid (0001:00): 8, Channel 00000020

Mar 28 14:52:18 yupana kernel: NVRM: Xid (0001:00): 16, Head 00000000 Count 00886660

Mar 28 14:52:20 yupana kernel: NVRM: Xid (0001:00): 8, Channel 0000001e

Mar 28 14:52:26 yupana kernel: NVRM: Xid (0001:00): 16, Head 00000000 Count 00886661

Mar 28 14:52:28 yupana kernel: NVRM: Xid (0001:00): 8, Channel 00000020

Mar 28 14:52:34 yupana kernel: usb 2-6: USB disconnect, address 2

Mar 28 14:52:34 yupana hald[2884]: forcibly attempting to lazy unmount /dev/sdc1 as enclosing drive was disconnected

Mar 28 14:52:34 yupana hald: unmounted /dev/sdc1 from '/media/LACIE' on behalf of uid 0

Mar 28 14:52:34 yupana kernel: NVRM: Xid (0001:00): 16, Head 00000000 Count 00886662

Mar 28 14:52:36 yupana kernel: NVRM: Xid (0001:00): 8, Channel 00000020

Mar 28 14:52:42 yupana kernel: NVRM: Xid (0001:00): 16, Head 00000000 Count 00886663

Mar 28 14:52:44 yupana kernel: NVRM: Xid (0001:00): 8, Channel 0000001e

Mar 28 14:54:15 yupana nmbd[2840]: [2007/03/28 14:54:15, 0] lib/interface.c:load_interfaces(225)

Mar 28 14:54:15 yupana nmbd[2840]: WARNING: no network interfaces found

Mar 28 14:54:48 yupana kernel: escd[6150]: segfault at 00007fff80000000 rip 00000031b2c72119 rsp 00007fff83c5b930 error 4

Mar 28 14:54:54 yupana kernel: NVRM: RmInitAdapter failed! (0x12:0x2b:1544)

Mar 28 14:54:54 yupana kernel: NVRM: rm_init_adapter(0) failed

Mar 28 14:54:55 yupana gdm[3134]: gdm_slave_xioerror_handler: Fatal X error - Restarting :0

Mar 28 14:55:04 yupana kernel: NVRM: RmInitAdapter failed! (0x12:0x2b:1544)

Mar 28 14:55:04 yupana kernel: NVRM: rm_init_adapter(0) failed

Mar 28 14:55:05 yupana gdm[8015]: gdm_slave_xioerror_handler: Fatal X error - Restarting :0

Mar 28 14:55:05 yupana kernel: 3w-xxxx: scsi6: Character ioctl (0x1f) timed out, resetting card.
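
Since the /var/log/messages excerpt above is dominated by repeated NVRM Xid lines, a small tally can make it easier to compare one freeze with another. This is only a sketch: the pattern is modeled on the lines quoted in this post, the function name `count_xid_codes` is invented here, and it makes no claim about what each Xid number means.

```python
import re
from collections import Counter

# Assumption: Xid reports look like the ones quoted above, e.g.
# "kernel: NVRM: Xid (0001:00): 16, Head 00000000 Count 0088665c"
XID_RE = re.compile(r"NVRM: Xid \([0-9a-fA-F:]+\): (\d+),")


def count_xid_codes(lines):
    """Count how many times each NVRM Xid number appears in the given lines."""
    counts = Counter()
    for line in lines:
        match = XID_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


if __name__ == "__main__":
    # Lines copied from the excerpt above.
    sample = [
        "Mar 28 14:51:46 yupana kernel: NVRM: Xid (0001:00): 16, Head 00000000 Count 0088665c",
        "Mar 28 14:51:48 yupana kernel: NVRM: Xid (0001:00): 8, Channel 00000000",
        "Mar 28 14:51:54 yupana kernel: NVRM: Xid (0001:00): 16, Head 00000000 Count 0088665d",
        "Mar 28 14:52:34 yupana kernel: usb 2-6: USB disconnect, address 2",
    ]
    for code, n in sorted(count_xid_codes(sample).items()):
        print(f"Xid {code}: {n} occurrence(s)")
```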

rdeeming
Contributor

Ditto with my x86_64 system: both XP and Win2003 Server guests hang or fail. When there is a failure, it always refers to some internal monitor error. Core files and log files have been submitted to VMware for a while now. No answers though.

pturbide
Contributor

I run Fedora Core 6 x86_64.

I have had the exact same issue since moving to 2.6.19 and later.

My logs show precisely what the others are reporting.

None of my VMs are usable at all at this point.

wpyung
Contributor

This is the same as http://www.vmware.com/community/thread.jspa?threadID=77873&tstart=0

In our case the problem only occurs on dual-core Turion AMD machines, not single-core ones.

Can anyone help?

jabberwocky
Contributor

I have this problem too... FC5 with 2.6.20 kernel... strangely I cannot find the 2.6.18 kernel RPMs ANYWHERE on the internet ...

poelcat
Contributor

I'm running Workstation 5.5.3 on FC6-i386 with the latest FC6 kernels and see constant freezes when trying to install F7 test and rawhide guests. Strangely, I don't see the same freezes if I run an already installed FC6 guest.

I think it is definitely an FC6 kernel problem. With 2.6.20-1.2933.fc6 and 2.6.20-1.2925.fc6 I see this problem repeatedly. If I go back to the kernel that FC6 shipped with at GA (2.6.18-1.2798.fc6), there are no freezes and everything works fine.
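
As a rough aid for checking several machines, the sketch below classifies a kernel release string by its base version using only what posters in this thread have reported: 2.6.18 working, 2.6.19 and 2.6.20 freezing. The sets and function names are made up for illustration, and the base version in a distribution's release string may not reflect exactly which upstream code it carries, so treat the result as a hint rather than a diagnosis.

```python
import platform
import re

# Assumption: these sets only summarize reports made in this thread.
REPORTED_STABLE = {(2, 6, 18)}
REPORTED_FREEZING = {(2, 6, 19), (2, 6, 20)}


def base_version(release):
    """Extract (major, minor, patch) from a release such as '2.6.20-1.2933.fc6'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if not match:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())


def classify(release):
    """Say whether this base version was reported as freezing or stable here."""
    version = base_version(release)
    if version in REPORTED_FREEZING:
        return "reported freezing"
    if version in REPORTED_STABLE:
        return "reported stable"
    return "not reported in this thread"


if __name__ == "__main__":
    for release in ("2.6.20-1.2933.fc6", "2.6.18-1.2798.fc6"):
        print(release, "->", classify(release))
    # platform.release() returns the running kernel's release string on Linux.
    try:
        print("this host:", platform.release(), "->", classify(platform.release()))
    except ValueError as err:
        print("this host:", err)
```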

wpyung
Contributor

I have logged a bug with bugzilla - no response as yet

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=234467

wpyung
Contributor

Can you check to see if there is a message in /var/log/messages which occurs before each freeze like this:

kernel: ioctl32(vmware-vmx:4668): Unknown cmd fd(145) cmd(40109980){00} arg(ff8e52d0) on /proc/bus/usb/003/001

with values 4668 and ff8e52d0 replaced by different ids on each occasion.

Any ideas what it means?

I only get the problem on dual-core AMD laptops, not single-core ones.
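
If it helps with checking, here is a minimal Python sketch that searches log lines for that ioctl32 message. The pattern is modeled on the single example line above, so the exact field layout on other kernels is an assumption, and `find_ioctl32_warnings` is just an illustrative name.

```python
import re

# Assumption: the warning always has the shape of the example quoted above:
# "ioctl32(vmware-vmx:<pid>): Unknown cmd fd(<fd>) cmd(<hex>)..."
IOCTL32_RE = re.compile(
    r"ioctl32\(vmware-vmx:(?P<pid>\d+)\): Unknown cmd "
    r"fd\((?P<fd>\d+)\) cmd\((?P<cmd>[0-9a-fA-F]+)\)"
)


def find_ioctl32_warnings(lines):
    """Return (pid, cmd) for each vmware-vmx ioctl32 'Unknown cmd' line."""
    hits = []
    for line in lines:
        match = IOCTL32_RE.search(line)
        if match:
            hits.append((int(match.group("pid")), match.group("cmd")))
    return hits


if __name__ == "__main__":
    sample = [
        "kernel: ioctl32(vmware-vmx:4668): Unknown cmd fd(145) cmd(40109980){00} "
        "arg(ff8e52d0) on /proc/bus/usb/003/001",
        "kernel: NVRM: Xid (0001:00): 16, Head 00000000 Count 0088665c",
    ]
    for pid, cmd in find_ioctl32_warnings(sample):
        print(f"vmware-vmx pid {pid} issued unknown ioctl cmd {cmd}")
```

To check a real machine, pass an open file object for /var/log/messages (reading it usually requires root) in place of the sample list.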

danslay
Contributor

As for the 2.6.18 kernels, I found and installed 2.6.18-1.2798 two days ago from here:

http://distro.ibiblio.org/pub/linux/distributions/fedora/linux/core/6/x86_64/os/Fedora/RPMS/kernel-2...

Note that the "updates" channel has only 2.6.19 and later kernels.

Our two other machines with 2.6.18 kernels have 2.6.18-1.2869.fc6, but I could not find rpms for that specific variant (2869) anywhere.

I could not figure out how to get yum to install an old kernel, so I simply used rpm with the force option.

Also, if you have any kmod packages or nvidia drivers, you will probably need to manually remove those (with either yum or rpm) and install the correct version for the 2.6.18 kernel. I just downloaded the rpms for the 2.6.18 kernel from the livna website, and installed them manually with rpm.

aburch
Contributor

I wanted to give a little bit more information of what I have done to try and solve the problem and some information about specifics of this machine:

The first time I noticed this problem was a little before 02/28/07, shortly after kernel 2.6.19 was installed on this machine. When VMware is running with Windows XP booted, the machine will freeze after some random time, usually a few minutes. The mouse cursor and keyboard stop functioning, and the display is frozen. The hard drive does not seem to be doing much and neither is the processor. You have to physically press the power button and hold it in for 5 seconds for the machine to shut off. Once it is off, you can then power it back up. Occasionally, right before the machine freezes, VMware will pop up an error message on the screen stating "VMware Workstation internal monitor error vcpu-0:DoubleFault @ 0x4020:0x55c34 (0x0,0x2c4c)".

I removed the tarball installation and downloaded the latest RPM of VMware, version 5.5.3-34685. I booted up the VMware session and installed the latest VMware Tools in Windows. At some point the machine froze again. So I tried creating a new VMware session, creating the new virtual hard drives from the existing VMware drives of the previous session. The changes I made seemed to work until recently, when the 2.6.20 kernel was released and I started having the problems again.

This time I have tried reinstalling VMware and installed the latest version of the vmware-any-any-update108 patch, which allowed vmware-config.pl to finish properly after the new installation. I tried changing VMware preferences. I updated the BIOS firmware to the latest version from HP. I disabled the ACPI power settings in the BIOS. When booting up the VMware session, I went into the VMware BIOS and changed the memory settings for ECC. I looked to see if there was a power-saving module loaded, but could not find one. I created a new VMware session using the existing VMware drives, as I had done a month ago. I removed the files and directories related to VMware in the /tmp directory.

None of this has helped this time.

The Desktop is a HP xw9300 Workstation.

Dual Processor AMD Opteron(tm) 248

Fedora Core 5 x86_64 2.6.20-1.2307.fc5

I just wanted to give as much information as possible, if it helps.

pturbide
Contributor

Fedora Core 6 x86_64

Tyan Board kw8e (S895)

4 GB Ram (1024 allocated to vm)

Running on Dual Opteron 280

Kernel: 2.6.20-1.2933.fc6 #1 SMP Mon Mar 19 11:00:19 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux

dual nvidia quadro 3450

xorg-x11-server-Xorg-1.1.1-47.7.fc6 (xinerama on)

kmod-nvidia-1.0.9755-2.2.6.20_1.2933.fc6

Guest OS: Windows 2000 Advanced Server

The freeze sometimes occurs after a few minutes of usage, sometimes after hours.

Mar 28 10:23:03: vcpu-0| Guest OS = 0x5007

Mar 28 10:23:03: vmx| SCSI0:1 CDROM: CMD 0x25 (READ CAPACITY) FAILED (key 0x2 asc 0x3a ascq 0x2)

Mar 28 10:23:03: vmx| SCSI0:1 CDROM: CMD 0xad (*UNKNOWN (0xad)*) FAILED (key 0x2 asc 0x3a ascq 0x2)

Mar 28 10:23:03: vcpu-0| VNET: Notification enabled for Ethernet0

Mar 28 10:23:09: mks| Ignoring update request in VGA_Expose (mode change pending).

Mar 28 10:23:20: mks| Ignoring update request in VGA_Expose (mode change pending).

Mar 28 10:23:24: mks| Ignoring update request in VGA_Expose (mode change pending).

Mar 28 10:23:25: mks| SVGA: Using extended FIFO: Caps 0x00000007, Flags 0x00000000

Mar 28 10:23:38: mks| HostOps hideCursor before defineCursor!

Mar 28 10:25:08: vcpu-0| MONITOR PANIC: vcpu-0:VMM fault: regs=0xc0f0c, exc=14, eip=0x118ebd

Mar 28 10:25:08: vcpu-0| Core dump with build build-34685

Mar 28 10:25:08: vcpu-0| Writing monitor corefile "/home/patrick/vmware/win2000AdvServ/vmware-core.gz"

Mar 28 10:25:08: vcpu-0| Beginning monitor coredump

Mar 28 10:25:09: vcpu-0| End monitor coredump

Mar 28 10:25:09: vcpu-0| Writing anonymous pages at pos: 401000

Mar 28 10:25:11: vcpu-0| Msg_Post: Error

Mar 28 10:25:11: vcpu-0| [msg.log.monpanic] *** VMware Workstation internal monitor error ***

Mar 28 10:25:11: vcpu-0| vcpu-0:VMM fault: regs=0xc0f0c, exc=14, eip=0x118ebd

Mar 28 10:25:11: vcpu-0| Please report this problem by selecting menu item Help > VMware on the Web > Request Support, or by going to the Web page "http://www.vmware.com/info?id=8&logFile=%2fhome%2fpatrick%2fvmware%2fwin2000AdvServ%2fvmware%2elog&coreLocation=%2fhome%2fpatrick%2fvmware%2fwin2000AdvServ%2fvmware%2dcore%2egz". Please provide us with the log file (/home/patrick/vmware/win2000AdvServ/vmware.log) and the core file (/home/patrick/vmware/win2000AdvServ/vmware-core.gz).

Mar 28 10:25:11: vcpu-0| [msg.log.monpanic.linuxdebug] If the problem is repeatable, please set 'Logging level' to 'Debug' in the Misc panel of Virtual Machine Settings. Then reproduce the incident and file it according to the instructions.

Mar 28 10:25:11: vcpu-0| [msg.log.monpanic.linux] To collect files to submit to VMware support, run vm-support.

Mar 28 10:25:11: vcpu-0| [msg.log.monpanic.finish] We will respond on the basis of your support entitlement.

Mar 28 10:25:11: vcpu-0| We appreciate your feedback,

Mar 28 10:25:11: vcpu-0| -- the VMware Workstation team.

Mar 28 10:25:11: vcpu-0| ----


petr
VMware Employee

I have no idea about 2.6.20 and older kernels, but if you are using 2.6.21 (which could be your 2.6.20-1.3xxx, as Red Hat uses non-conforming version numbering), then you may be unable to run WS on your box until WS6 RC2 comes out.

If you unpack the attached dumpirq.tar.gz and build it (hopefully it will build), it should create a dumpirq.ko kernel module. If you insmod that module, then besides getting an ENODEV error from insmod you should also get a lot of messages in dmesg. If those messages say that the IOAPIC is configured to use a lot of IRQs in the range 0x30-0x3F, then you are in deep trouble...

If you want to run WS6.0RC1 or older on your kernel, you either have to revert http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=610142927b5bc149da92... from your tree, or you have to change

#define IRQ0_VECTOR FIRST_EXTERNAL_VECTOR + 0x10

in include/asm-x86_64/hw_irq.h into

#define IRQ0_VECTOR FIRST_EXTERNAL_VECTOR + 0x08

(any value between 0x01 and 0x08 should be fine)

f5len
Contributor

Hi,

I experienced the same trouble with FC6 2.6.18-1-2849 on Dell PE1800, PE830 and PE840. I fixed it with acpi=off as a kernel boot option.

Hope it helps.

pichwo
Contributor

all the same old story.

about a hundred times i have regretted ever using (recent) linux x86_64 kernels.

those freeze-problems on amd-x2 are old, and with vmware i have also always had troubles, and NO ONE ever CARED.

i cannot say how much i am fed up - and all those funny questions "did you turn on the power" when serious bug-reports are delivered don't help either.

i am in linux now for more than a decade and have built cloned distros etc. -

kernel development is just DECAY - imo.

regards

w.
