Strange errors in dmesg - Dell XPS 13 developer edition / Project Sputnik Feedback - OS and Applications - Dell Community

I'm seeing the following errors in dmesg. My laptop does not feel hot, the fans are not running, and the CPU sensors report ~42 °C. I am running:

$ dmesg|grep BIOS

Dell Inc. XPS 13 9360/0839Y6, BIOS 2.1.0 08/02/2017

$ lsb_release -a

No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 17.04
Release: 17.04
Codename: zesty

$ uname -a
Linux 4.10.0-33-generic #37-Ubuntu SMP Fri Aug 11 10:55:28 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

The errors:

......

[ 371.258220] CPU0: Core temperature above threshold, cpu clock throttled (total events = 1)
[ 371.258221] CPU2: Core temperature above threshold, cpu clock throttled (total events = 1)
[ 371.258222] CPU1: Package temperature above threshold, cpu clock throttled (total events = 1)
[ 371.258223] CPU3: Package temperature above threshold, cpu clock throttled (total events = 1)
[ 371.258225] CPU2: Package temperature above threshold, cpu clock throttled (total events = 1)
[ 371.258227] mce: [Hardware Error]: Machine check events logged
[ 371.258229] CPU0: Package temperature above threshold, cpu clock throttled (total events = 1)
[ 371.258230] mce: [Hardware Error]: Machine check events logged
[ 371.258235] mce: [Hardware Error]: CPU 2: Machine Check: 0 Bank 128: 0000000088022803
[ 371.258236] mce: [Hardware Error]: TSC 108eb7f8e87
[ 371.258238] mce: [Hardware Error]: PROCESSOR 0:806e9 TIME 1503613896 SOCKET 0 APIC 1 microcode 62
[ 371.258239] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 128: 0000000088022803
[ 371.258240] mce: [Hardware Error]: TSC 108eb7fd739
[ 371.258241] mce: [Hardware Error]: PROCESSOR 0:806e9 TIME 1503613896 SOCKET 0 APIC 0 microcode 62
[ 371.259235] CPU2: Core temperature/speed normal
[ 371.259236] CPU0: Core temperature/speed normal
[ 371.259236] CPU3: Package temperature/speed normal
[ 371.259237] CPU1: Package temperature/speed normal
[ 371.259238] CPU0: Package temperature/speed normal
[ 371.259239] CPU2: Package temperature/speed normal
[ 371.259247] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 128: 0000000088032802
[ 371.259249] mce: [Hardware Error]: TSC 108ebac8472
[ 371.259252] mce: [Hardware Error]: PROCESSOR 0:806e9 TIME 1503613896 SOCKET 0 APIC 0 microcode 62
[ 371.259253] mce: [Hardware Error]: CPU 2: Machine Check: 0 Bank 128: 0000000088032802
[ 371.259254] mce: [Hardware Error]: TSC 108ebac9ba9
[ 371.259255] mce: [Hardware Error]: PROCESSOR 0:806e9 TIME 1503613896 SOCKET 0 APIC 1 microcode 62

......

[ 4709.006144] CPU3: Core temperature above threshold, cpu clock throttled (total events = 13)
[ 4709.006145] CPU1: Core temperature above threshold, cpu clock throttled (total events = 13)
[ 4709.006146] CPU0: Package temperature above threshold, cpu clock throttled (total events = 61)
[ 4709.006147] CPU2: Package temperature above threshold, cpu clock throttled (total events = 61)
[ 4709.006149] CPU1: Package temperature above threshold, cpu clock throttled (total events = 61)
[ 4709.006151] mce_notify_irq: 1 callbacks suppressed
[ 4709.006152] mce: [Hardware Error]: Machine check events logged
[ 4709.006154] CPU3: Package temperature above threshold, cpu clock throttled (total events = 61)
[ 4709.006155] mce: [Hardware Error]: Machine check events logged
[ 4709.006161] mce: [Hardware Error]: CPU 1: Machine Check: 0 Bank 128: 0000000088022803
[ 4709.006162] mce: [Hardware Error]: TSC c7dd87a7317
[ 4709.006164] mce: [Hardware Error]: PROCESSOR 0:806e9 TIME 1503618234 SOCKET 0 APIC 2 microcode 62
[ 4709.006167] mce: [Hardware Error]: CPU 3: Machine Check: 0 Bank 128: 0000000088022803
[ 4709.006171] mce: [Hardware Error]: TSC c7dd87ac253
[ 4709.006177] mce: [Hardware Error]: PROCESSOR 0:806e9 TIME 1503618234 SOCKET 0 APIC 3 microcode 62
[ 4709.013152] CPU1: Core temperature/speed normal
[ 4709.013153] CPU3: Core temperature/speed normal
[ 4709.013154] CPU0: Package temperature/speed normal
[ 4709.013154] CPU2: Package temperature/speed normal
[ 4709.013156] CPU3: Package temperature/speed normal
[ 4709.013156] CPU1: Package temperature/speed normal
[ 4709.013190] mce: [Hardware Error]: CPU 3: Machine Check: 0 Bank 128: 0000000088032802
[ 4709.013191] mce: [Hardware Error]: TSC c7dd9b0fc4f
[ 4709.013193] mce: [Hardware Error]: PROCESSOR 0:806e9 TIME 1503618234 SOCKET 0 APIC 3 microcode 62
[ 4709.013195] mce: [Hardware Error]: CPU 1: Machine Check: 0 Bank 128: 0000000088032802
[ 4709.013195] mce: [Hardware Error]: TSC c7dd9b11418
[ 4709.013197] mce: [Hardware Error]: PROCESSOR 0:806e9 TIME 1503618234 SOCKET 0 APIC 2 microcode 62
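
The two status values in the log above (0000000088022803 and 0000000088032802) can be unpacked field by field. The sketch below assumes that "Bank 128" is the pseudo-bank the kernel of this era uses for thermal events, and that the logged status is a copy of the CPU's thermal status register (IA32_THERM_STATUS, or its package-level counterpart), whose bit layout is taken from Intel's documentation. The TjMax of 100 °C is an assumption for this processor and is not something the log states; the function and field names are illustrative only.

```python
# Minimal sketch: decode a thermal status word as logged in "Bank 128" MCE lines.
# Assumption: the value is a copy of IA32_THERM_STATUS (bit layout per Intel's docs).
# Assumption: TjMax is 100 degrees C for this CPU; the log itself does not say so.
TJMAX_C = 100

# Status/log flag bits in the low 16 bits of the register.
FLAGS = {
    0: "thermal_status",
    1: "thermal_status_log",
    2: "prochot_event",
    3: "prochot_log",
    4: "critical_temp_status",
    5: "critical_temp_log",
    6: "threshold1_status",
    7: "threshold1_log",
    8: "threshold2_status",
    9: "threshold2_log",
    10: "power_limit_status",
    11: "power_limit_log",
    12: "current_limit_status",
    13: "current_limit_log",
    14: "cross_domain_limit_status",
    15: "cross_domain_limit_log",
}


def decode_therm_status(value, tjmax_c=TJMAX_C):
    """Split a thermal status word into its flags and temperature readout."""
    flags = [name for bit, name in sorted(FLAGS.items()) if (value >> bit) & 1]
    readout = (value >> 16) & 0x7F      # bits 22:16, degrees below TjMax
    resolution = (value >> 27) & 0xF    # bits 30:27, sensor resolution in degrees C
    valid = bool((value >> 31) & 1)     # bit 31, readout is valid
    return {
        "flags": flags,
        "below_tjmax_c": readout,
        "resolution_c": resolution,
        "reading_valid": valid,
        "approx_temp_c": tjmax_c - readout if valid else None,
    }


if __name__ == "__main__":
    for raw in ("0000000088022803", "0000000088032802"):
        print(raw, decode_therm_status(int(raw, 16)))
```

Under those assumptions, the first value has the thermal status bit set with a readout 2 °C below TjMax, and the second has that bit cleared with a readout 3 °C below TjMax. That matches the "above threshold" and "temperature/speed normal" pairs in the log and suggests these entries are thermal throttling records rather than faults in the CPU itself.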

All Replies
  • Also seeing these, although I'm not concerned about the ath10k messages:

    [    2.276604] EXT4-fs (nvme0n1p2): re-mounted. Opts: errors=remount-ro
    [    2.397185] int3403 thermal: probe of INT3403:03 failed with error -22
    [    2.795411] ath10k_pci 0000:3a:00.0: Direct firmware load for ath10k/pre-cal-pci-0000:3a:00.0.bin failed with error -2
    [    2.795431] ath10k_pci 0000:3a:00.0: Direct firmware load for ath10k/cal-pci-0000:3a:00.0.bin failed with error -2
    [    2.796391] ath10k_pci 0000:3a:00.0: Direct firmware load for ath10k/QCA6174/hw3.0/firmware-5.bin failed with error -2

  • Testing with a Fedora live image and the Intel Processor Diagnostic Tool (IPDT) showed no issues, but I'd still like to know what the machine check exceptions are about.

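  • It most likely means that your CPU briefly reached its thermal throttling threshold (around 100 °C) and then recovered. In both excerpts the "temperature/speed normal" messages follow the throttling messages within a few milliseconds.

    You should monitor the temperature and the running processes to see what sends it through the roof.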
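
One way to follow the advice about monitoring is a small script that samples the thermal zones and reports which processes used the most CPU time whenever a zone runs hot. This is only a sketch: it assumes a Linux system that exposes `/sys/class/thermal/thermal_zone*/temp` (in millidegrees Celsius) and `/proc/<pid>/stat`, and the 80 °C alert level and all function names are hypothetical choices, not values taken from this thread.

```python
import glob
import os
import time

ALERT_C = 80.0  # hypothetical alert level, chosen for illustration only


def parse_temp(text):
    """Convert a sysfs 'temp' value (millidegrees C) to degrees C."""
    return int(text.strip()) / 1000.0


def read_zones(base="/sys/class/thermal"):
    """Return a list of (zone_type, temp_c) for every readable thermal zone."""
    zones = []
    for zone in sorted(glob.glob(os.path.join(base, "thermal_zone*"))):
        try:
            with open(os.path.join(zone, "type")) as f:
                kind = f.read().strip()
            with open(os.path.join(zone, "temp")) as f:
                zones.append((kind, parse_temp(f.read())))
        except (OSError, ValueError):
            continue  # some zones cannot be read; skip them
    return zones


def parse_stat(stat):
    """Return (name, utime + stime in clock ticks) from a /proc/<pid>/stat line."""
    left, right = stat.find("("), stat.rfind(")")
    if left < 0 or right < 0:
        return None
    fields = stat[right + 2:].split()
    # fields[0] is 'state' (field 3), so utime (14) and stime (15) are 11 and 12.
    return stat[left + 1:right], int(fields[11]) + int(fields[12])


def cpu_ticks_by_pid(proc="/proc"):
    """Snapshot the accumulated CPU ticks of every process."""
    ticks = {}
    try:
        entries = os.listdir(proc)
    except OSError:
        return ticks
    for entry in entries:
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                parsed = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited or the line was malformed
        if parsed:
            ticks[int(entry)] = parsed
    return ticks


def top_consumers(before, after, limit=3):
    """Return up to `limit` (pid, name, ticks_used) between two snapshots."""
    deltas = [(t - before[pid][1], pid, name)
              for pid, (name, t) in after.items() if pid in before]
    deltas.sort(reverse=True)
    return [(pid, name, d) for d, pid, name in deltas[:limit] if d > 0]


def monitor(samples=60, interval=5.0, alert_c=ALERT_C):
    """Print the hottest zone each interval, plus top processes when it is hot."""
    before = cpu_ticks_by_pid()
    for _ in range(samples):
        time.sleep(interval)
        after = cpu_ticks_by_pid()
        zones = read_zones()
        if zones:
            kind, temp = max(zones, key=lambda z: z[1])
            print(f"{time.strftime('%H:%M:%S')} {kind}: {temp:.1f} C")
            if temp >= alert_c:
                print("  busiest:", top_consumers(before, after))
        before = after
```

Calling `monitor()` from a Python prompt and leaving it running while using the laptop normally should show whether any process lines up with the temperature spikes. Because the throttling events in the log last only a few milliseconds, a sampling loop like this may miss the peak itself and is better at showing sustained load.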