Nazgulled
Contributor

Arch Linux Guest Crashes/Freezes Windows Vista Host - PLEASE HELP!

Hi,

I just installed Arch Linux in VMware Workstation 6.5, and from the first boot right after the installation (meaning nothing but the base system was installed/configured) I can barely use the virtual machine without the host freezing. I'm just sitting at the console and everything freezes without me doing anything. Most of the time I just type "reboot" and the host machine freezes. Other times I can type other things, like "ls", or "cd <dir>" and then "cd ..", and after a few seconds, boom, the host freezes.

By freezes I mean I have to press the power off button for a few seconds to turn off the computer and reboot (laptop, no reset button).

I've been searching Google for the past few hours and couldn't find the cause of or a solution for the problem; this is driving me nuts. I installed VirtualBox just to test whether the same thing happened and it didn't. It works fine on VirtualBox, but due to other issues I can't switch to VirtualBox at the moment.

Please, someone, help me fix this issue in VMware. I can't understand why it crashes the host system; there's no reason for that to happen...

Message was edited by: oreeh

Reason: removed the emoticon from the subject

Oliver Reeh

VMware Communities User Moderator

0 Kudos
69 Replies
continuum
Immortal

Have you tried different virtual guest settings? "Other", for example...

___________________________________

description of vmx-parameters:

VMware-liveCD:

Do you need support with a recovery problem ? - send a message via skype "sanbarrow"
0 Kudos
Nazgulled
Contributor

Just tried that, same results...

I'm not liking this at all. I'm uninstalling software piece by piece and it still crashes every time... I'm afraid I'll do a clean install of Vista with nothing but VMware, only to find out the problem persists, which is not good...

0 Kudos
continuum
Immortal

Can you upgrade to XP when you want to reinstall the host ?

0 Kudos
Nazgulled
Contributor

That would be a downgrade not an "upgrade".

I could install XP but I'm not going to do it, I don't care that it works on XP. No matter what people say, Vista is better and I'm not going back to XP.

I need it to work on Vista, that's my main work OS.

0 Kudos
Nazgulled
Contributor

I decided to install XP anyway just for testing purposes and the same thing happened.

Clean install of XP SP2 without any driver updates at all and no Windows updates. It still crashes the host. Right now, all drivers are updated and I'm installing all the latest Windows updates and SP3 to see if it helps. But I'm afraid it will crash again...

Which leads me to... This can only be a problem with my laptop, right? Something wrong with my hardware that, for some reason, VMware can't handle and VirtualBox can? What might it be? RAM? HDD? Graphics card? What else? I don't understand... If this really is the case, I guess I have no choice but to use VirtualBox; I can't afford a new laptop just to run VMware...

This sucks...

0 Kudos
continuum
Immortal

Do you have any energy-saving functions in BIOS like speed-step ... ?

Try to disable them.

0 Kudos
Scissor
Virtuoso

Try checking your laptop memory for problems. Since you are running Vista, click Start -> type : mdsched.exe to start the Windows Memory Diagnostic program, and follow the prompts to check your RAM.

0 Kudos
Nazgulled
Contributor

>Do you have any energy-saving functions in BIOS like speed-step ... ?

>Try to disable them.

Nope, there's no such feature in my BIOS...

>Try checking your laptop memory for problems. Since you are running Vista, click Start -> type : mdsched.exe to start the Windows Memory Diagnostic program, and follow the prompts to check your RAM.

I actually don't have Vista right now, since I did a clean install of XP... But since the problem is the same, I'm going to restore the Vista backup I made, because I need a working system for tomorrow. When the backup is fully restored I'll check for memory problems.

0 Kudos
Scissor
Virtuoso

You can download the Windows Memory Diagnostics standalone program here and run it to create a bootable CD-ROM: http://oca.microsoft.com/en/windiag.asp

0 Kudos
Nazgulled
Contributor

>Try checking your laptop memory for problems. Since you are running Vista, click Start -> type : mdsched.exe to start the Windows Memory Diagnostic program, and follow the prompts to check your RAM.

I did this and Windows did not report any problems with memory... However, I'm going back home tomorrow, and there I have the original RAM that came with my laptop, 2x512 MB. Currently I have 2x1024 MB, which I upgraded to about 2 years ago, I guess. And now that I think of it, I believe I bought these 2 GB when Vista came out, and that was when I decided to try Ubuntu on VMware and got the same freezes as the ones I'm having now. This leads me to two questions:

1) Let's say I test with my original RAM and it works. What does this mean? Windows does not report memory errors on my current RAM, so how can the problem be in the RAM?

2) If the original RAM doesn't solve anything, what else could the problem be? I've already tested everything I could think of and the problem must be in my hardware, somehow...

0 Kudos
Scissor
Virtuoso

If the Windows Memory Diagnostics didn't show a problem, then your RAM is fine. Swapping out the RAM most likely won't fix anything.

This is an interesting problem now that you tried rebuilding your laptop! I wish there was more information available online about your laptop model -- LG's website doesn't even mention the existence of your laptop.

Try resetting your BIOS settings back to default settings. Do you have any expansion cards installed in your laptop that you can remove?

0 Kudos
Nazgulled
Contributor

>If the Windows Memory Diagnostics didn't show a problem, then your RAM is fine. Swapping out the RAM most likely won't fix anything.

I have nothing to lose but a little bit of time...

>This is an interesting problem now that you tried rebuilding your laptop! I wish there was more information available online about your laptop model -- LG's website doesn't even mention the existence of your laptop.

You and me both, LG sucks a lot in support.

>Try resetting your BIOS settings back to default settings. Do you have any expansion cards installed in your laptop that you can remove?

I don't think that will do anything, since I didn't change anything in the BIOS, but I can try... No, I don't have any expansion cards...

0 Kudos
Nazgulled
Contributor

Resetting the BIOS to defaults didn't help... :(

0 Kudos
ralish
Enthusiast

Hello,

A devoted Arch user who's a friend of mine pointed me to your post on the Arch forum. I'm the VMware/Windows-loving friend who gave you that apparently not so useful advice. Coincidentally, I woke up this morning, remembered your problem, did a quick search, and found this moderately sized thread of yours on the VMware forums. I happen to have an account on these forums, so it's only decent that I try to offer some (better) advice!

It's pretty rare to be in a situation where getting a BSoD would be a better outcome than the results you're actually getting, but you seem to be in exactly that situation; at least with a BSoD you'd have some crash data to sift through that has a reasonable chance of leading you to the culprit. I've read the thread, and you seem to have tried all the basics and some of the more advanced stuff with the help of the VMware community. The way the system hard-locks makes me think something is going on in kernel mode that is definitely not good (even if whatever it is is being prompted by some mundane user-mode code). Further, you can't really even say that VMware code is the cause; it may merely be the catalyst. Kernels are complex beasts, and misbehaving code in one completely unrelated module can have ramifications for other code that at first seems to have no link to it at all.

Seeing as the machine is not blue-screening, I'd suggest trying to MAKE it bluescreen. Follow these instructions to prepare your system:

Once completed, do a test to see if it works. If successful, I'd reproduce your VMware crash and see if you can force a memory dump using the above procedure. This, if it works, could provide valuable insight into exactly what is going on inside the kernel at the time of the freeze. It may be ambiguous at first, in which case, making several memory dumps and analyzing each, to see which similarities they have, particularly the last instructions on the stack, will help to determine if the code running before the crash is consistently similar or even identical. The driver responsible for calling the memory dump runs at a very high priority, so while the system may have frozen, elements of the system may still be responsive, even if they can't typically be interacted with directly by the user, which is why I'd hold some hope that this may work.
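To make the "compare several dumps" idea a bit more concrete, here is a minimal Python sketch. It assumes you have already copied the stack text out of each dump's analysis by hand; the frame names are hypothetical placeholders, not output from any real dump, and the helper is just an illustration rather than anything WinDbg provides.

```python
def common_frames(stacks):
    """Return the frames that appear in every stack, in first-stack order."""
    if not stacks:
        return []
    # Frames present in all stacks, regardless of position.
    shared = set(stacks[0]).intersection(*(set(s) for s in stacks[1:]))
    seen = set()
    result = []
    for frame in stacks[0]:
        # Keep first-stack order and skip repeated frames.
        if frame in shared and frame not in seen:
            seen.add(frame)
            result.append(frame)
    return result


# Hypothetical stacks (made-up names), most recent call first.
dump_a = ["driverX!Send", "netstack!Transmit", "kernel!Wait"]
dump_b = ["driverX!Send", "netstack!Queue", "kernel!Wait"]
dump_c = ["other!Poll", "driverX!Send", "kernel!Wait"]

print(common_frames([dump_a, dump_b, dump_c]))
```

Frames that survive across every dump are the ones worth looking at first; frames that show up only once are more likely to be incidental.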

If it doesn't, the next step may well be to run Driver Verifier: http://support.microsoft.com/kb/244617

I'd simply create standard settings when setting it up, reboot, and once again try to reproduce the crash. This will enable a significant number of additional checks on kernel code. The result will be a significant performance penalty (which is why you shouldn't run Driver Verifier on production machines, but only for debugging bad code); however, it may well catch the fault as a result. In fact, it may well catch the fault well before the system usually crashes. Note that if it catches a fault, it'll BSoD the system, but you'll then have a memory dump to analyze. This dump may well be more useful than prior dumps, as DV is likely to have caught what was going wrong at the precise time it happened, rather than giving you a memory dump of what was going on well after the damage had been done.

In any case, you'll need Debugging Tools for Windows. You should set up WinDbg to use the Microsoft Symbol Server to pull down debugging information on demand. You can then open the crash dumps (typically saved in %SystemRoot%\MEMORY.DMP) in WinDbg and, once it has loaded them, run !analyze -v to perform a thorough automated analysis. This will often reveal the culprit without any further forensics, which is what you want, as manual forensics takes a lot of time and skill. There are various guides out there that illustrate the basics of using WinDbg to analyze kernel memory dumps.
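As a rough sketch of the setup, the symbol path WinDbg expects has the form SRV*<local cache>*<server URL>. The small Python helper below only builds that string and lists the debugger commands you would then type into WinDbg by hand; the cache directory c:\symbols is an assumption for illustration, and the helper names are made up.

```python
MS_SYMBOL_SERVER = "http://msdl.microsoft.com/download/symbols"


def build_symbol_path(cache_dir, server_url=MS_SYMBOL_SERVER):
    """Build a WinDbg symbol path of the form SRV*<cache>*<server>."""
    if "*" in cache_dir or "*" in server_url:
        raise ValueError("'*' separates the fields and cannot appear inside one")
    return "SRV*{}*{}".format(cache_dir, server_url)


def analysis_commands(cache_dir):
    """Return the WinDbg commands to set symbols, reload, and analyze."""
    return [
        ".sympath " + build_symbol_path(cache_dir),  # point WinDbg at the symbol server
        ".reload",                                   # reload symbols with the new path
        "!analyze -v",                               # verbose automated analysis
    ]


# The cache directory is an assumed example location.
for command in analysis_commands(r"c:\symbols"):
    print(command)
```

The same path can also be entered through WinDbg's symbol path dialog instead of the .sympath command.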

While this may all sound very intimidating, a basic analysis of a kernel crash dump is not at all difficult to do, and is in fact mostly automated, requiring very little programming expertise, if any. Further, I think the data gleaned from it in your case may well be invaluable. Let us know how it goes, and attach any WinDbg analyses you make :)

0 Kudos
ralish
Enthusiast

Some additional notes:

1. Ensure you have correctly configured the kernel memory dump settings; this can be done through the Advanced tab of the System applet in Control Panel. You want a kernel memory dump at a minimum, but if you have space to spare, feel free to go for a complete memory dump, which may yield more useful data. Keep in mind that a complete memory dump requires a dump file equal in size to your system RAM.

2. Each subsequent crash will overwrite the prior memory dump (if it exists). So if, as I suggest above, you are going to compare memory dumps, make sure you copy the previous one to a new location for safekeeping before the next crash if at all possible.

3. Manually initiating a memory dump shouldn't be needed with Driver Verifier. The whole point is for Driver Verifier to catch any illegal operation performed in kernel mode as it occurs, so if a BSoD does not occur during reproduction while DV is enabled, it's unlikely in my view that a manually initiated memory dump will yield much more useful data than one taken without DV. Still, if a manual memory dump with DV enabled turns out to be necessary, it won't do any harm to take a look at it.
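For point 2, a small script can do the copying so that no dump gets overwritten by accident. This is only a sketch: the archive directory name is a hypothetical example, and the dump location is taken from the %SystemRoot%\MEMORY.DMP default mentioned earlier (adjust it if you configured a different path).

```python
import os
import shutil
import time
from pathlib import Path
from typing import Optional


def archive_dump(dump_path: Path, archive_dir: Path) -> Optional[Path]:
    """Copy a dump file into archive_dir under a timestamped name.

    Returns the new path, or None if there is no dump to copy.
    """
    if not dump_path.is_file():
        return None
    archive_dir.mkdir(parents=True, exist_ok=True)
    # Name the copy after the dump's last-modified time, so each crash
    # gets its own file.
    modified = time.localtime(dump_path.stat().st_mtime)
    stamp = time.strftime("%Y%m%d-%H%M%S", modified)
    target = archive_dir / "{}-{}{}".format(dump_path.stem, stamp, dump_path.suffix)
    shutil.copy2(dump_path, target)
    return target


if __name__ == "__main__":
    system_root = os.environ.get("SystemRoot", r"C:\Windows")
    # "dump-archive" is a made-up directory name for illustration.
    saved = archive_dump(Path(system_root) / "MEMORY.DMP", Path("dump-archive"))
    print(saved if saved else "no dump found")
```

Run it after each reboot following a crash, before trying to reproduce the freeze again.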

0 Kudos
Nazgulled
Contributor

Hi, thanks for helping out...

>Once completed, do a test to see if it works. If successful, I'd reproduce your VMware crash and see if you can force a memory dump using the above procedure. This, if it works, could provide valuable insight into exactly what is going on inside the kernel at the time of the freeze. It may be ambiguous at first, in which case, making several memory dumps and analyzing each, to see which similarities they have, particularly the last instructions on the stack, will help to determine if the code running before the crash is consistently similar or even identical. The driver responsible for calling the memory dump runs at a very high priority, so while the system may have frozen, elements of the system may still be responsive, even if they can't typically be interacted with directly by the user, which is why I'd hold some hope that this may work.

The manual memory dump works in normal use: when I press the respective shortcut, the dump file is created. But it doesn't work when the system freezes because of my problem. When that happens, the keyboard is totally unresponsive. The wireless LED still blinks a little, but it stops (and stays lit) after a while; the keyboard does not respond even while the LED is still blinking.

>I'd simply create standard settings when setting it up, reboot, and once again, try and reproduce the crash. This will enable a significant number of additional checks on kernel code. The result will be a significant performance penalty (which is why you shouldn't run Driver Verifier on production machines, but purely for debugging bad code), however, it may well catch the fault as a result. In fact, it may well catch the fault well before the system usually crashes. Note, that when if it catches a fault, it'll BSoD the system, however, you'll have a memory dump to analyze. This dump may well be more useful than prior dumps, as DV is likely to have caught what was going wrong at the precise time it happened, and not a memory dump of what was going on well after the damage had been done.

I had to choose between "unsigned drivers", "signed drivers" or "all drivers"; I selected "all drivers". This caused Windows to BSoD before I had the chance to try to freeze the system with VMware. As soon as the password prompt appeared, I started typing my user password and the system crashed with a BSoD. I have the memory dump, of course. But I'm having problems with the symbols...

I've attached a log of what's happening in the debugger...

0 Kudos
ralish
Enthusiast

Hello again,

Looking at what you pasted:

"BugCheck C4, {1001, 9f876ff8, 8f391330, 0}" -> DRIVER_VERIFIER_DETECTED_VIOLATION -> http://msdn.microsoft.com/en-us/library/ms796113.aspx

A 0xC4 bugcheck is a Driver Verifier specific bugcheck that only occurs when DV is active and detects a violation in kernel mode. This confirms my suspicion: something is going badly wrong in kernel mode.

Looking at the handy provided table, the first parameter (0x1001) indicates that this bugcheck is specific to kernel deadlocks, which makes a lot of sense, as a deadlock occurs when two actions are waiting for each other to finish, and thus, neither ever does (this is all a simplification). You can pretty easily understand how this can translate to a complete system lockup when this happens in the kernel. This is also exactly what you are witnessing, which suggests we are definitely looking in the right direction.
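To illustrate the idea in the simplest possible terms, here is a toy Python model of lock-order checking. It is not how Driver Verifier is implemented; it only shows why two threads that take the same two locks in opposite orders can end up waiting on each other forever. The thread and lock names are made up.

```python
from collections import defaultdict


def build_order_graph(held_sequences):
    """Map each lock to the locks acquired while it was already held."""
    graph = defaultdict(set)
    for locks in held_sequences.values():
        for i, earlier in enumerate(locks):
            for later in locks[i + 1:]:
                graph[earlier].add(later)
    return graph


def find_cycle(graph):
    """Return one lock-order cycle as a list of locks, or None."""
    unvisited, in_progress, done = 0, 1, 2
    state = defaultdict(int)
    path = []

    def visit(node):
        state[node] = in_progress
        path.append(node)
        for nxt in sorted(graph.get(node, ())):
            if state[nxt] == in_progress:
                # We came back to a lock on the current path: a cycle.
                return path[path.index(nxt):] + [nxt]
            if state[nxt] == unvisited:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        state[node] = done
        return None

    for node in sorted(graph):
        if state[node] == unvisited:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Hypothetical example: T1 takes A then B, T2 takes B then A.
conflicting = {"T1": ["A", "B"], "T2": ["B", "A"]}
consistent = {"T1": ["A", "B"], "T2": ["A", "B"]}

print(find_cycle(build_order_graph(conflicting)))
print(find_cycle(build_order_graph(consistent)))
```

In the conflicting case each thread can end up holding one lock while waiting for the other, which is the kind of situation that freezes everything when it happens inside the kernel.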

The second parameter (0x9f876ff8) points to the memory address of the resource that was the final cause of the deadlock.

The third and fourth parameters are apparently reserved, meaning they're not in use, despite the fact the 3rd parameter in your data has a value, which might suggest the documentation is out of date or the parameter is undocumented.
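If you end up collecting several dumps, pulling the bugcheck code and its four parameters out of the debugger's "BugCheck" line makes them easier to compare. A minimal Python sketch, assuming the line has the same shape as the one pasted above (the function and table names are made up for illustration):

```python
import re

# Matches lines such as: BugCheck C4, {1001, 9f876ff8, 8f391330, 0}
BUGCHECK_RE = re.compile(r"BugCheck\s+([0-9A-Fa-f]+),\s*\{([^}]*)\}")

# Only the code discussed in this thread; extend as needed.
KNOWN_CODES = {0xC4: "DRIVER_VERIFIER_DETECTED_VIOLATION"}


def parse_bugcheck(line):
    """Return (code, [parameters]) parsed as hex integers, or None."""
    match = BUGCHECK_RE.search(line)
    if match is None:
        return None
    code = int(match.group(1), 16)
    params = [int(field.strip(), 16) for field in match.group(2).split(",")]
    return code, params


code, params = parse_bugcheck("BugCheck C4, {1001, 9f876ff8, 8f391330, 0}")
print(hex(code), KNOWN_CODES.get(code, "unknown"))
print([hex(p) for p in params])
```

If the first two parameters come out the same across dumps, that is a good sign you are hitting the same deadlock each time.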

I would try running !deadlock and then "!deadlock 1" on your crash dump in WinDbg, and paste the results here.

The NETw2v32.sys error can be ignored, this is an Intel networking driver, and as such, the debugging symbols for it can't be pulled from the MS Symbol Server.

I would suggest changing your dump file format to a complete memory dump, as it is highly probable that this exact same deadlock is the cause of all of your crashes, and having more data to work with may be useful.

Finally, you should be running !analyze -v (for verbose output). Are you sure all the output is in your paste? It might have been trimmed by the forums?

0 Kudos
ralish
Enthusiast

Although I wouldn't call the crash dump in any way conclusive, it is beginning to point at the NETw2v32.sys driver, which is an Intel wireless networking driver. Updating it shouldn't do any harm, and is certainly worth a shot.

You said that LG themselves hadn't provided any driver updates, so I'd suggest just getting it direct from Intel. That shouldn't cause any problems, as the wireless chipset will be straight from Intel anyway.

http://downloadcenter.intel.com/Product_Filter.aspx?ProductID=1847&lang=eng <- Download link for Intel® PRO/Wireless 2915ABG Network Connection, which is what I believe your laptop has, judging from some specs I pulled off the web.

0 Kudos
Nazgulled
Contributor

>The NETw2v32.sys error can be ignored, this is an Intel networking driver, and as such, the debugging symbols for it can't be pulled from the MS Symbol Server.

I'm already using the MS Symbol Server and I'm getting that error...

In my "symbol path" I have this: SRV*c:\symbols*http://msdl.microsoft.com/download/symbols. And it's still reporting that error for that file...

>Finally, you should be running !analyze -v (for verbose output). Are you sure all the output is in your paste? It might have been trimmed by the forums?

I didn't get to run that command yet because the symbols weren't loaded. The log I posted is from just after the memory dump was loaded, nothing else.

I have attached the "deadlock" and "analysis" log files...

>Although I wouldn't call the crash dump in anyway conclusive, it is beginning to point at the NETw2v32.sys driver, which is a Intel wireless networking driver. Updating it shouldn't do any harm, and is certainly worth a shot.

>You said that LG themselves hadn't provided any driver updates, so I'd suggest just getting it direct from Intel, which shouldn't cause any problems, the wireless chipset will be straight from Intel anyway.

>Download link for Intel® PRO/Wireless 2915ABG Network Connection which is what I believe your laptop has judging from some specs I pulled off the web.

As far as I know, I'm already running the latest drivers directly from Intel and I believe my NIC is the Intel PRO/Wireless 2200BG, not the 2915ABG. But I'm going to open the laptop and check what's written on the network card itself.

All the logs posted here are from the current kernel memory dump, I'll provide new ones with full dump in a few minutes...

0 Kudos
ralish
Enthusiast

You misinterpret me :)

The symbols are set up fine. That error about not finding symbols for netw2v32.sys occurs because it is not a Microsoft driver (it's from Intel), and as such the debugging symbols for it can't be found on the Microsoft Symbol Server. Bottom line: it can be safely ignored.

I'm looking at your newest attached files, and this is absolutely, positively, definitely networking related. The stack traces for the two threads engaged in the deadlock are entirely full of networking system calls. You should make sure everything networking related is updated to the latest version, direct from the manufacturer if LG can't provide recent drivers. So, ensure all Windows Updates are applied and all networking equipment is running the latest drivers. Number one on the list to check should be the Intel networking drivers.

EDIT:

I should point out that there is the (very unlikely) possibility this isn't even related to your VMware problem. There's no doubt that there are some very badly behaving drivers on your system, but that doesn't mean these are the ones causing the VMware problem. Regardless, you won't know until you fix this particular problem ;)

0 Kudos