Hello all,
I have a brand new Dell Latitude E6540 notebook with a Core i7-4800MQ CPU @ 2.7 GHz running Windows 7 Professional 64-bit. All drivers and Windows patches are up to date.
Whenever I run VMware Workstation 10 (I develop software inside a virtual machine), I get at least one error message *per minute* in the Windows event log of the host machine.
The error message reads "Event 19, WHEA-Logger, a corrected hardware error has occurred. Reported by component: Processor Core. Error source: Corrected Machine Check. Error type: Internal parity error. Processor ID: 0". (The Processor ID in the error message varies.)
The error *only* happens when I work inside a virtual machine. Otherwise, the notebook is perfectly stable. I ran the Intel Processor Diagnostic Tool, which did a two-hour burn-in stress test with no errors. Nothing is overclocked and the processor doesn't overheat either. The RAM tests fine.
I have contacted Dell Pro Support. They first advised me to do a factory restore of the operating system. I did that (it took me all day to install my apps again) but the error persisted. A BIOS update didn't change anything either. All drivers and service packs are up to date too. Then they sent me a technician. First the mainboard was replaced. Then the CPU. Then the RAM. But the error still persists.
Given that all hardware has been replaced and the software is up to date, I am at my wits' end. I no longer believe that the hardware is faulty. Do I have to live with this error message forever?
Thanks for your attention,
Arthur
This sounds like Intel erratum HSM102: Processor May Experience a Spurious LLC-Related Machine Check During Periods of High Activity. See http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/4th-gen-core-famil....
I forgot to mention: the "MciStat" field in the WHEA error message contains the value 0x90000040000f0005.
Does the WHEA error message indicate which bank this MCI status is from?
I am seeing the same error on a Dell Precision M4800 mobile workstation. I have the same CPU, an i7-4800MQ @ 2.70 GHz. I am on Windows 7 Professional 64-bit running VMware Player 6.0.1 build 1379776.
I too have run several hours' worth of stress diagnostics and was unable to reproduce the error except when running VMware Player (sometimes with it just running in the background with an "idle" Ubuntu VM).
When I've tried to trigger the error by running a lot of video sessions in the VM, it does not happen. It seems random. Some days I go without any of these errors, other days I'll have 1 or 2, and on other days they'll appear in clusters of several, so it does not really seem related to "high activity" for me.
I have not noticed any ill effects such as freezing or performance hits. I found some other forum chatter suggesting that this is an overclocking issue?
I noticed the MCABank is always 0, but the processor IDs are different each time.
Any clues would be appreciated. Thanks!
Here is the full error:
Log Name: System
Source: Microsoft-Windows-WHEA-Logger
Date: 3/3/2014 3:40:17 PM
Event ID: 19
Task Category: None
Level: Warning
Keywords:
User: LOCAL SERVICE
Computer: PrecisionM4800
Description:
A corrected hardware error has occurred.
Reported by component: Processor Core
Error Source: Corrected Machine Check
Error Type: Internal parity error
Processor ID: 1
The details view of this entry contains further information.
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="Microsoft-Windows-WHEA-Logger" Guid="{C26C4F3C-3F66-4E99-8F8A-39405CFED220}" />
<EventID>19</EventID>
<Version>0</Version>
<Level>3</Level>
<Task>0</Task>
<Opcode>0</Opcode>
<Keywords>0x8000000000000000</Keywords>
<TimeCreated SystemTime="2014-03-03T23:40:17.382291400Z" />
<EventRecordID>49948</EventRecordID>
<Correlation ActivityID="{DF8A3291-FF9B-438C-80AE-652D7D1918CD}" />
<Execution ProcessID="2204" ThreadID="16684" />
<Channel>System</Channel>
<Computer>PrecisionM4800</Computer>
<Security UserID="S-1-5-19" />
</System>
<EventData>
<Data Name="ErrorSource">1</Data>
<Data Name="ApicId">1</Data>
<Data Name="MCABank">0</Data>
<Data Name="MciStat">0x90000040000f0005</Data>
<Data Name="MciAddr">0x0</Data>
<Data Name="MciMisc">0x0</Data>
<Data Name="ErrorType">12</Data>
<Data Name="TransactionType">256</Data>
<Data Name="Participation">256</Data>
<Data Name="RequestType">256</Data>
<Data Name="MemorIO">256</Data>
<Data Name="MemHierarchyLvl">256</Data>
<Data Name="Timeout">256</Data>
<Data Name="OperationType">256</Data>
<Data Name="Channel">256</Data>
<Data Name="Length">864</Data>
Have a look at my post here and see if it helps.
- | System |
- | EventData |
ErrorSource | 1 |
ApicId | 7 |
MCABank | 0 |
MciStat | 0x90000040000f0005 |
MciAddr | 0x0 |
MciMisc | 0x0 |
ErrorType | 12 |
TransactionType | 256 |
Participation | 256 |
RequestType | 256 |
MemorIO | 256 |
MemHierarchyLvl | 256 |
Timeout | 256 |
OperationType | 256 |
Channel | 256 |
Length | 864 |
RawData | 435045521002FFFFFFFF0300020000000200000060030000312D07000A030E140000000000000000000000000000000000000000000000000000000000000000BDC407CF89B7184EB3C41F732CB57131B18BCE2DD7BD0E45B9AD9CF4EBD4F890644E8DEC333CCF0100000000000000000000000000000000000000000000000058010000C00000000102000001000000ADCC7698B447DB4BB65E16F193C4F3DB0000000000000000000000000000000002000000000000000000000000000000000000000000000018020000400000000102000000000000B0A03EDC44A19747B95B53FA242B6E1D0000000000000000000000000000000002000000000000000000000000000000000000000000000058020000080100000102000000000000011D1E8AF94257459C33565E5CC3F7E80000000000000000000000000000000002000000000000000000000000000000000000000000000057010000000000000002080000000000C30603000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000700000000000000000000000000000000000000000000000000000000000000000000000000000003000000000000000700000000000000C306030000081007FFFBFA7FFFFBEBBF000000000000000000000000000000000000000000000000000000000000000001000000010000005DF537C1343CCF0107000000000000000000000000000000000000000000000005000F0040000090000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 |
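For anyone wondering what the RawData blob is: a minimal sketch, assuming the record follows the UEFI Common Platform Error Record layout and stores its values little-endian. It decodes two short slices copied from the dump above (the first four bytes, and eight bytes found near the end). The variable names are my own, and I am not claiming anything about the offsets of other fields.

```python
import struct

# Short slices copied from the RawData dump above (not the full record).
signature_hex = "43504552"       # first four bytes of RawData
status_hex = "05000F0040000090"  # eight bytes found near the end of RawData

# The first four bytes are plain ASCII.
signature = bytes.fromhex(signature_hex).decode("ascii")

# Assumption: the eight-byte slice is a little-endian unsigned 64-bit value.
(mci_status,) = struct.unpack("<Q", bytes.fromhex(status_hex))

print(signature)
print(hex(mci_status))
```

Read that way, the eight-byte slice is the same value that the event shows in the MciStat field.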
I noticed something peculiar: the error only seems to occur when I run 32-bit VMs, not 64-bit ones. I also believe that the errors start after VMware Tools has been installed inside the VM.
I was testing a lengthy (4-hour) unattended network-based 32-bit Windows installation with lots of applications in a VM. The installation of each application in the VM is logged (with a timestamp) in a log file.
The timestamp of the first WHEA error on the host machine coincides *exactly* with the timestamp in my log file where VMware Tools finished installing in the VM and made the VM reboot.
I'm experiencing the same issue with Workstation 10.0.4 on a Windows 7 x64 host (fresh install with the latest BIOS, drivers and updates). The PC is a new HP EliteDesk 800 Mini with an Intel i7-4785T CPU. So far I have only worked with Windows XP 32-bit guests.
Based on what I have read here so far, I will do some further tests and report back.
Excerpt from the Event Viewer:
<System>
<Provider Name="Microsoft-Windows-WHEA-Logger" Guid="{C26C4F3C-3F66-4E99-8F8A-39405CFED220}" />
<EventID>19</EventID>
<Version>0</Version>
<Level>3</Level>
<Task>0</Task>
<Opcode>0</Opcode>
<Keywords>0x8000000000000000</Keywords>
<TimeCreated SystemTime="2014-11-04T21:36:20.230309100Z" />
<EventRecordID>9680</EventRecordID>
<Correlation ActivityID="{EAE38998-3319-44CB-87AF-E1283FC323C5}" />
<Execution ProcessID="1536" ThreadID="2300" />
<Channel>System</Channel>
<Computer>mc1004b</Computer>
<Security UserID="S-1-5-19" />
</System>
<EventData>
<Data Name="ErrorSource">1</Data>
<Data Name="ApicId">3</Data>
<Data Name="MCABank">0</Data>
<Data Name="MciStat">0x90000040000f0005</Data>
<Data Name="MciAddr">0x0</Data>
<Data Name="MciMisc">0x0</Data>
<Data Name="ErrorType">12</Data>
<Data Name="TransactionType">256</Data>
<Data Name="Participation">256</Data>
<Data Name="RequestType">256</Data>
<Data Name="MemorIO">256</Data>
<Data Name="MemHierarchyLvl">256</Data>
<Data Name="Timeout">256</Data>
<Data Name="OperationType">256</Data>
<Data Name="Channel">256</Data>
<Data Name="Length">864</Data>
</EventData>
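If you want to compare these fields across many events without reading them by eye, here is a minimal sketch that pulls MCABank and MciStat out of an EventData fragment like the one above. The function name parse_event_data is hypothetical, and the sample is a shortened copy of the excerpt. It assumes the fragment has no xmlns attribute; a full Event export carries a namespace and the tag lookup would need it.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the EventData excerpt above.
event_data_xml = """
<EventData>
  <Data Name="ErrorSource">1</Data>
  <Data Name="ApicId">3</Data>
  <Data Name="MCABank">0</Data>
  <Data Name="MciStat">0x90000040000f0005</Data>
  <Data Name="ErrorType">12</Data>
</EventData>
"""

def parse_event_data(xml_text):
    """Return a dict mapping each Data element's Name attribute to its text."""
    root = ET.fromstring(xml_text)
    # Assumption: no XML namespace on the fragment, so the plain tag name matches.
    return {d.get("Name"): (d.text or "") for d in root.iter("Data")}

fields = parse_event_data(event_data_xml)
bank = int(fields["MCABank"])        # decimal in the event
status = int(fields["MciStat"], 16)  # hex string with a 0x prefix

print(bank, hex(status))
```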
This is erratum HSM142. The events are indicating MciStat = 0x90000040000f0005, which is bang on:
HSM142: Spurious Corrected Errors May be Reported
Due to this erratum, spurious corrected errors may be logged in the IA32_MC0_STATUS register with the valid field (bit 63) set, the uncorrected error field (bit 61) not set, a Model Specific Error Code (bits [31:16]) of 0x000F, and an MCA Error Code (bits [15:0]) of 0x0005. If CMCI is enabled, these spurious corrected errors also signal interrupts.
When this erratum occurs, software may see corrected errors that are benign. These corrected errors may be safely ignored.
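To check an event against that description, here is a minimal sketch that splits an MciStat value into the four fields the erratum text names and tests them, together with the bank number (IA32_MC0_STATUS is bank 0). The function names are hypothetical, and the check covers only the fields quoted above, nothing else in the register.

```python
def decode_mci_status(status):
    """Extract the fields named in the erratum text from a 64-bit status value."""
    return {
        "valid": bool((status >> 63) & 1),        # bit 63
        "uncorrected": bool((status >> 61) & 1),  # bit 61
        "mscod": (status >> 16) & 0xFFFF,         # bits 31:16, model specific error code
        "mcacod": status & 0xFFFF,                # bits 15:0, MCA error code
    }

def matches_hsm142(bank, status):
    """True if the bank and status fit the signature quoted from the erratum."""
    f = decode_mci_status(status)
    return bool(
        bank == 0
        and f["valid"]
        and not f["uncorrected"]
        and f["mscod"] == 0x000F
        and f["mcacod"] == 0x0005
    )

# Values reported in this thread: MCABank 0, MciStat 0x90000040000f0005.
print(decode_mci_status(0x90000040000F0005))
print(matches_hsm142(0, 0x90000040000F0005))
```

A status with bit 61 set, or one reported from a bank other than 0, would not match and should not be written off on the basis of this erratum.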
The same issue exists on 4th generation desktop processors (erratum HSD131). It looks like Workstation exposes the erratum quite nicely under some circumstances. I wouldn't be surprised if other VM software does the same.
M