Saturday 21 September 2019

bsod - WHEA_UNCORRECTABLE_ERROR, what now?


Ever since I built my own PC running Windows 10, it has randomly stopped responding. Each time, I had to perform a cold reboot, because Windows would not shut down or restart when told to. In two cases, I was confronted with a WHEA_UNCORRECTABLE_ERROR blue screen instead. I suspect that the issues are related, so I'd like to fix the BSOD.


Apparently, WHEA_UNCORRECTABLE_ERROR appears when there is faulty hardware or a buggy driver. Unfortunately, the error itself isn't very descriptive, so I don't have much information to work with.


I discovered that Windows creates a .dmp file whenever a BSOD occurs, and that the files are saved in C:\Windows\Minidump. Unfortunately, I have no idea what to do with these files. Whenever I try to open a .dmp file with Notepad++, I get:



Can not open file



According to https://support.microsoft.com/en-us/kb/315271, I should use Dumpchk.exe, but it doesn't seem to be installed on my PC and the only relevant-looking link has me download the Windows Driver Kit and the Debugging Tools for Windows together with Visual Studio. I'm not interested in debugging Windows or developing drivers. I just want to see what's written in the .dmp file, so I know which driver needs to be reinstalled/updated, or which component needs to be exchanged.


How do I open a .dmp file?



Answer



To see more details when you get a Bug Check 0x124: WHEA_UNCORRECTABLE_ERROR, open the .dmp file in WinDbg (windbg.exe), which is part of the Debugging Tools for Windows, which in turn ships with the Windows 10 SDK.


Now set up the debug symbols in WinDbg (for example, with .symfix followed by .reload) and run the !errrec command with the address from the second bugcheck argument (Arg2):


*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

WHEA_UNCORRECTABLE_ERROR (124)
A fatal hardware error has occurred. Parameter 1 identifies the type of error
source that reported the error. Parameter 2 holds the address of the
WHEA_ERROR_RECORD structure that describes the error conditon.
Arguments:
Arg1: 0000000000000000, Machine Check Exception
Arg2: ffff8905a872c028, Address of the WHEA_ERROR_RECORD structure.
Arg3: 00000000fe000000, High order 32-bits of the MCi_STATUS value.
Arg4: 0000000000801136, Low order 32-bits of the MCi_STATUS value.

6: kd> !errrec ffff8905a872c028
===============================================================================
Common Platform Error Record @ ffff8905a872c028
-------------------------------------------------------------------------------
Record Id : 01d24ff887f68558
Severity : Fatal (1)
Length : 928
Creator : Microsoft
Notify Type : Machine Check Exception
Timestamp : 12/11/2016 10:04:07 (UTC)
Flags : 0x00000000

===============================================================================
Section 0 : Processor Generic
-------------------------------------------------------------------------------
Descriptor @ ffff8905a872c0a8
Section @ ffff8905a872c180
Offset : 344
Length : 192
Flags : 0x00000001 Primary
Severity : Fatal

Proc. Type : x86/x64
Instr. Set : x64
Error Type : Cache error
Operation : Data Read
Flags : 0x00
Level : 2
CPU Version : 0x00000000000506e3
Processor ID : 0x0000000000000006

===============================================================================
Section 1 : x86/x64 Processor Specific
-------------------------------------------------------------------------------
Descriptor @ ffff8905a872c0f0
Section @ ffff8905a872c240
Offset : 536
Length : 128
Flags : 0x00000000
Severity : Fatal

Local APIC Id : 0x0000000000000006
CPU Id : e3 06 05 00 00 08 10 06 - bf fb fa 7f ff fb eb bf
00 00 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00

Proc. Info 0 @ ffff8905a872c240

===============================================================================
Section 2 : x86/x64 MCA
-------------------------------------------------------------------------------
Descriptor @ ffff8905a872c138
Section @ ffff8905a872c2c0
Offset : 664
Length : 264
Flags : 0x00000000
Severity : Fatal

Error : DCACHEL2_DRD_ERR (Proc 6 Bank 9)
Status : 0xfe00000000801136
Address : 0x00000000b3800000
Misc. : 0x00000030e5000086

Here you can see that the CPU reported an error while reading data from its L2 cache:


Error Type    : Cache error
Operation : Data Read
Error : DCACHEL2_DRD_ERR
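
The mnemonic above can also be derived by hand from the bugcheck arguments. The following is a minimal sketch, assuming the IA32_MCi_STATUS layout from the Intel Software Developer's Manual (status flags in bits 63 to 57, MCA error code in bits 15 to 0, cache hierarchy errors encoded as 000F 0001 RRRR TTLL). The function names are made up for illustration and are not part of WinDbg; only the Arg3 and Arg4 values come from the dump above.

```python
# Values copied from the bugcheck arguments shown in the dump.
ARG3_HIGH = 0x00000000FE000000  # high-order 32 bits of MCi_STATUS
ARG4_LOW = 0x0000000000801136   # low-order 32 bits of MCi_STATUS

# Architectural flag bits of IA32_MCi_STATUS (assumed from the Intel SDM).
STATUS_FLAGS = [(63, "VAL"), (62, "OVER"), (61, "UC"), (60, "EN"),
                (59, "MISCV"), (58, "ADDRV"), (57, "PCC")]

# Sub-fields of a cache hierarchy error code: 000F 0001 RRRR TTLL.
REQUESTS = {0: "ERR", 1: "RD", 2: "WR", 3: "DRD", 4: "DWR",
            5: "IRD", 6: "PREFETCH", 7: "EVICT", 8: "SNOOP"}
TRANSACTIONS = {0: "I", 1: "D", 2: "G"}
LEVELS = {0: "L0", 1: "L1", 2: "L2", 3: "LG"}


def combine_status(high, low):
    """Join the two 32-bit halves into the 64-bit MCi_STATUS value."""
    return ((high & 0xFFFFFFFF) << 32) | (low & 0xFFFFFFFF)


def status_flags(status):
    """Return the names of the flag bits that are set."""
    return [name for bit, name in STATUS_FLAGS if (status >> bit) & 1]


def decode_cache_error(status):
    """Return a mnemonic for a cache hierarchy error, or None otherwise."""
    code = status & 0xFFFF
    # Ignore the F bit (bit 12) and require the 0000 0001 prefix.
    if (code & 0xEF00) != 0x0100:
        return None
    request = REQUESTS.get((code >> 4) & 0xF, "UNKNOWN")
    transaction = TRANSACTIONS.get((code >> 2) & 0x3, "?")
    level = LEVELS[code & 0x3]
    return f"{transaction}CACHE{level}_{request}_ERR"


status = combine_status(ARG3_HIGH, ARG4_LOW)
print(hex(status))                 # matches the Status line in the MCA section
print(status_flags(status))
print(decode_cache_error(status))
```

For this dump, the combined value equals the Status field reported by !errrec, and the decoded mnemonic matches the Error field. The UC and PCC flags being set is consistent with the record's Fatal severity.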

Running !sysinfo machineid shows that you are using an older BIOS/UEFI version:


BiosVersion = 1805
BiosReleaseDate = 05/13/2016
BaseBoardManufacturer = ASUSTeK COMPUTER INC.
BaseBoardProduct = Z170 PRO GAMING

So update the BIOS/UEFI to version 3016, because it should improve system stability.


If you still have issues, run a CPU stress test to check whether your CPU is damaged. Also make sure you are not undervolting the CPU.
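
When looking up BIOS notes or stress-test settings, it can help to know exactly which CPU the dump came from. The "CPU Version" field in the Processor Generic section (0x506e3) is a CPUID signature. Here is a small sketch of how it could be split into family, model and stepping, assuming the standard CPUID leaf 1 EAX layout used by Intel. The function name is hypothetical.

```python
def decode_cpuid_signature(signature):
    """Split a CPUID leaf 1 EAX value into display family, model, stepping."""
    stepping = signature & 0xF
    model = (signature >> 4) & 0xF
    family = (signature >> 8) & 0xF
    ext_model = (signature >> 16) & 0xF
    ext_family = (signature >> 20) & 0xFF

    # The extended fields only apply to certain base family values.
    display_family = family + ext_family if family == 0xF else family
    if family in (0x6, 0xF):
        display_model = (ext_model << 4) | model
    else:
        display_model = model
    return display_family, display_model, stepping


# "CPU Version" value taken from the !errrec output.
family, model, stepping = decode_cpuid_signature(0x506E3)
print(f"family {family}, model 0x{model:X}, stepping {stepping}")
```

For the value in this dump, the result is family 6, model 0x5E, stepping 3, which can then be matched against the CPU support list of the motherboard.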

