I'm on the latest Ubuntu 16.04, fully up to date, on an x64 Acer Aspire E1-572G. The graphics drivers are the open-source ones: radeon for the discrete card and i915 for the integrated card (which is used by everything unless I set DRI_PRIME=1, which switches graphics processing to radeon).
I also use tlp for power management, and here is my current tlp configuration: https://pastebin.com/dpCuTF1b
The BIOS is in UEFI (default) mode with Secure Boot turned off. systemctl --failed reports "0 loaded units listed", which indicates that all services started successfully.
Here is the weird behavior that's going on:
Sometimes when I put the computer to sleep (suspend), it just turns off. It doesn't even shut down; it powers off as if I had pulled the battery out of the laptop. I'm not sure whether this happens only when I close the lid, but I think it also happens when I suspend without closing the lid (by clicking the suspend button). I can't remember right now, so I'll report back if it ever happens with the lid open.
When this happens, the background wallpaper is no longer displayed on the lock screen on later system restarts, until I change the wallpaper again.
/var/crash/ is empty, and /var/log/boot.log has a report indicating that fsck is fixing some orphaned inodes. Here are some logs:
- /var/log/boot.log: https://pastebin.com/LTYK9FJE
- /var/log/syslog: https://pastebin.com/Q2PLLZA5
- /var/log/kern.log: https://pastebin.com/jWuTSqe6
I had all versions of Windows (7, 8, 8.1, 10) installed on this laptop before and never had this issue. I also ran memtest86 tests and they all passed.
This crash is very annoying because every time it happens I use debsums and diff to make sure nothing is corrupted.
I'm ready to share more info upon request. Thank you!
sudo blkid output:
/dev/sda1: UUID="5331-7707" TYPE="vfat" PARTLABEL="EFI System Partition" PARTUUID="3f8152ad-fccf-4675-9e9d-8ad5ad225726"
/dev/sda2: UUID="715420bf-c241-4f2d-bec4-01a73dbb9806" TYPE="ext4" PARTUUID="0f05b77f-4b96-4f0c-a5cb-eb425a230467"
/dev/sda3: UUID="ffaf844c-98b3-4b94-89c7-c0e69a456921" TYPE="swap" PARTUUID="5c622918-0497-4354-b187-15f08cd35fc3"
cat /etc/fstab output:
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point> <type> <options> <dump> <pass>
# / was on /dev/sda2 during installation
UUID=715420bf-c241-4f2d-bec4-01a73dbb9806 / ext4 errors=remount-ro 0 1
# /boot/efi was on /dev/sda1 during installation
UUID=5331-7707 /boot/efi vfat umask=0077 0 1
# swap was on /dev/sda3 during installation
UUID=ffaf844c-98b3-4b94-89c7-c0e69a456921 none swap sw 0 0
sudo fdisk -l output:
Disk /dev/sda: 698.7 GiB, 750156374016 bytes, 1465149168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 82BDC3DF-6864-454F-AA6D-F59B05865FEF
Device Start End Sectors Size Type
/dev/sda1 2048 1050623 1048576 512M EFI System
/dev/sda2 1050624 1448617983 1447567360 690.3G Linux filesystem
/dev/sda3 1448617984 1465147391 16529408 7.9G Linux swap
ls -alt /var/crash output:
total 8
drwxrwsrwt 2 root whoopsie 4096 Aug 21 09:35 .
drwxr-xr-x 14 root root 4096 Aug 1 14:34 ..
grep -i Temperature_Celsius /var/log/syslog output:
Aug 21 09:35:49 AdamPC smartd[1032]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 109 to 115
Aug 21 09:57:55 AdamPC smartd[1032]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 115 to 109
Aug 21 15:33:14 AdamPC smartd[1032]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 109 to 111
Aug 21 15:57:54 AdamPC smartd[1032]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 111 to 107
Aug 21 16:27:54 AdamPC smartd[1032]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 107 to 106
Aug 21 16:57:54 AdamPC smartd[1032]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 106 to 105
Aug 21 18:21:06 AdamPC smartd[1024]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 105 to 111
Aug 21 18:51:06 AdamPC smartd[1024]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 111 to 109
Aug 21 19:21:07 AdamPC smartd[1024]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 109 to 106
Aug 21 20:26:28 AdamPC smartd[1024]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 106 to 113
Aug 21 21:07:26 AdamPC smartd[998]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 113 to 114
[Screenshot showing that my disk isn't burning]
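For anyone who wants to skim the smartd entries above without reading each line, here is a minimal sketch that reduces them to "timestamp old new". The function name smart_temp_changes is made up for illustration, and it assumes the log lines have exactly the shape shown in the grep output. It prints whatever numbers smartd logged; whether those are degrees Celsius or a normalized attribute value depends on the drive and the smartd configuration.

```shell
#!/bin/sh
# Hypothetical helper: reads syslog-formatted text on stdin and prints
# "Mon DD HH:MM:SS old new" for each attribute 194 change logged by smartd.
smart_temp_changes() {
    awk '/194 Temperature_Celsius changed from/ {
        # The line ends with: ... changed from <old> to <new>
        print $1, $2, $3, $(NF-2), $NF
    }'
}
```

Usage would be something like: smart_temp_changes < /var/log/syslog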
Uninstalling tlp did not solve this problem. My computer suddenly turned off again. This is the syslog output before suspend:
Aug 26 14:03:45 AdamPC NetworkManager[1012]: <info> [1503745425.2165] manager: sleep requested (sleeping: no enabled: yes)
Aug 26 14:03:45 AdamPC NetworkManager[1012]: <info> [1503745425.2166] manager: sleeping...
Aug 26 14:03:45 AdamPC NetworkManager[1012]: <info> [1503745425.2166] device (wlp2s0): state change: disconnected -> unmanaged (reason 'sleeping') [30 10 37]
Aug 26 14:03:45 AdamPC NetworkManager[1012]: <info> [1503745425.2314] manager: NetworkManager state is now ASLEEP
Aug 26 14:03:45 AdamPC wpa_supplicant[1172]: nl80211: deinit ifname=wlp2s0 disabled_11b_rates=0
Aug 26 14:03:46 AdamPC systemd[1]: Reached target Sleep.
Aug 26 14:03:46 AdamPC systemd[1]: Starting Suspend...
Aug 26 14:03:46 AdamPC systemd-sleep[6098]: Failed to connect to non-global ctrl_ifname: (nil) error: No such file or directory
Aug 26 14:03:46 AdamPC systemd-sleep[6099]: /lib/systemd/system-sleep/wpasupplicant failed with error code 255.
Aug 26 14:03:46 AdamPC systemd-sleep[6098]: Suspending system...
Aug 26 14:28:55 AdamPC rsyslogd: [origin software="rsyslogd" swVersion="8.16.0" x-pid="1036" x-info="http://www.rsyslog.com"] start
The first line of the excerpt is from when I closed the lid (which triggered the suspend). The last line is from when I powered the laptop back on after the incident (note the timestamps). The weird thing is that I get the same log for a normal suspend. Complete syslog: http://pasted.co/7354277e
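The pattern in the excerpt (a "Suspending system..." line followed directly by an rsyslogd start line) can be searched for mechanically. This is only a sketch: the function name suspends_followed_by_boot is made up, and it assumes that a successful resume logs at least one other line before rsyslogd next starts, which I have not verified for every setup.

```shell
#!/bin/sh
# Hypothetical helper: reads syslog text on stdin and prints every
# "Suspending system..." line whose very next line is an rsyslogd start,
# i.e. the next thing logged was a fresh boot rather than a resume.
suspends_followed_by_boot() {
    awk '
        /rsyslogd:.*\] start/ && prev ~ /systemd-sleep\[[0-9]+\]: Suspending system\.\.\./ {
            print prev
        }
        { prev = $0 }   # remember the current line for the next comparison
    '
}
```

Usage would be something like: suspends_followed_by_boot < /var/log/syslog
Each line printed is a suspend attempt that ended in a cold start, so the timestamps show how often the problem occurred.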
Important Update
I reinstalled tlp
after finding that it is not the cause of the problem. Now I noticed something that might help us scope down the problem. I can replicate the accident by suspending the system and while suspending it, I attach/detach my mouse repeatedly (or any USB device). Once I do that, the accident happens immediately. This can only be replicated while tlp
is installed. I guess tlp
makes it happen more consistently?
The Update That Solved The Problem
I resumed debugging today. I launched my eyeballs towards /var/log/syslog until I finally found something noteworthy:
Sep 5 20:19:20 AdamPC kernel: [ 167.044965] pci_pm_suspend(): hcd_pci_suspend+0x0/0x30 returns -16
Sep 5 20:19:20 AdamPC kernel: [ 167.044971] dpm_run_callback(): pci_pm_suspend+0x0/0x150 returns -16
Sep 5 20:19:20 AdamPC kernel: [ 167.044973] PM: Device 0000:00:14.0 failed to suspend: error -16
Sep 5 20:19:20 AdamPC kernel: [ 167.044975] PM: Some devices failed to suspend, or early wake event detected
Device 0000:00:14.0? Let's run lspci:
00:14.0 USB controller: Intel Corporation 8 Series USB xHCI HC (rev 04)
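The two steps above (find the failing device in the log, then look it up) can be sketched as a small filter. The function name failed_suspend_devices is made up for illustration, and the pattern assumes the kernel message has exactly the wording shown in the excerpt. On Linux, error -16 corresponds to EBUSY.

```shell
#!/bin/sh
# Hypothetical helper: reads kernel log text on stdin and prints one
# "<pci-address> <error>" pair per device that failed to suspend.
failed_suspend_devices() {
    sed -n 's/.*PM: Device \([0-9a-fA-F:.]*\) failed to suspend: error \(-[0-9][0-9]*\).*/\1 \2/p' |
        sort -u   # the same device may fail on several suspend attempts
}
```

Usage would be something like: failed_suspend_devices < /var/log/syslog
The address in the first column can then be passed to lspci -s (for example lspci -s 0000:00:14.0) to see which device it is.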
I did some searching on Google and found the solution on the ArchWiki. See my answer for the solution.
Best Answer
From the comments...
We uninstalled tlp and also installed intel-microcode.
Update #1:
To check the file system on your Ubuntu partition...
sudo fsck -f /
fsck command if there were errors
reboot