Ubuntu – My laptop sometimes turns off when suspending it

16.04, power-management, suspend

I'm running a fully up-to-date Ubuntu 16.04 (x64) on an Acer Aspire E1-572G. The graphics drivers are the open-source ones: radeon for the discrete card and i915 for the integrated card (which handles everything unless I set DRI_PRIME=1, which switches rendering to radeon).

I also use tlp for power management and here is my current tlp configuration: https://pastebin.com/dpCuTF1b

The BIOS is in UEFI (default) mode with Secure Boot turned off. systemctl --failed reports "0 loaded units listed", which means no units are in a failed state.

Here is the weird behavior that's going on:

Sometimes when I put the computer to sleep (suspend), it just turns off. It doesn't even shut down; it powers off as if I had pulled the battery out of the laptop. I'm not sure whether this happens only when I close the lid, but I think it also happens when I suspend without closing the lid (by clicking the suspend button). I can't remember for certain right now, so I will report back if it ever happens with the lid open.

When this happens, the background wallpaper is no longer displayed on the lock screen on later system restarts, until I change the wallpaper again.

/var/crash/ is empty, and /var/log/boot.log shows fsck fixing some orphaned inodes. Command output and logs are included further down.

I had all versions of Windows (7, 8, 8.1, 10) installed on this laptop before and I never had this issue. I also performed memtest86 tests and they all succeeded.

This crash is very annoying because every time it happens I use debsums and diff to make sure nothing is corrupted.
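
For context, debsums compares installed files against the MD5 sums shipped with each package (dpkg keeps them in /var/lib/dpkg/info/*.md5sums). The sketch below does the same kind of check by hand with md5sum. The function name verify_md5sums and its two arguments are illustrative only and are not part of debsums or dpkg.

```shell
#!/bin/sh
# Verify files under a root directory against a dpkg-style checksum list,
# i.e. lines of "<md5>  <path relative to the root>".
# Prints one "<path>: FAILED" line per mismatching file, nothing if all match.
# verify_md5sums is a hypothetical helper name used for illustration.
verify_md5sums() {
    # Resolve the list to an absolute path, because we cd before checking.
    list=$(cd "$(dirname "$1")" && pwd)/$(basename "$1")
    root=${2:-/}
    ( cd "$root" && md5sum -c --quiet "$list" 2>/dev/null )
}
```

For example, verify_md5sums /var/lib/dpkg/info/coreutils.md5sums / would check the files of one package. For the whole system, debsums itself is the simpler route (its --changed option lists modified files).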

I'm happy to share more information on request. Thank you!


sudo blkid output:

/dev/sda1: UUID="5331-7707" TYPE="vfat" PARTLABEL="EFI System Partition" PARTUUID="3f8152ad-fccf-4675-9e9d-8ad5ad225726"
/dev/sda2: UUID="715420bf-c241-4f2d-bec4-01a73dbb9806" TYPE="ext4" PARTUUID="0f05b77f-4b96-4f0c-a5cb-eb425a230467"
/dev/sda3: UUID="ffaf844c-98b3-4b94-89c7-c0e69a456921" TYPE="swap" PARTUUID="5c622918-0497-4354-b187-15f08cd35fc3"

cat /etc/fstab output:

# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
# / was on /dev/sda2 during installation
UUID=715420bf-c241-4f2d-bec4-01a73dbb9806 /               ext4    errors=remount-ro 0       1
# /boot/efi was on /dev/sda1 during installation
UUID=5331-7707  /boot/efi       vfat    umask=0077      0       1
# swap was on /dev/sda3 during installation
UUID=ffaf844c-98b3-4b94-89c7-c0e69a456921 none            swap    sw              0       0

sudo fdisk -l output:

Disk /dev/sda: 698.7 GiB, 750156374016 bytes, 1465149168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 82BDC3DF-6864-454F-AA6D-F59B05865FEF

Device          Start        End    Sectors   Size Type
/dev/sda1        2048    1050623    1048576   512M EFI System
/dev/sda2     1050624 1448617983 1447567360 690.3G Linux filesystem
/dev/sda3  1448617984 1465147391   16529408   7.9G Linux swap

ls -alt /var/crash output:

total 8
drwxrwsrwt  2 root whoopsie 4096 Aug 21 09:35 .
drwxr-xr-x 14 root root     4096 Aug  1 14:34 ..

grep -i Temperature_Celsius /var/log/syslog output:

Aug 21 09:35:49 AdamPC smartd[1032]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 109 to 115
Aug 21 09:57:55 AdamPC smartd[1032]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 115 to 109
Aug 21 15:33:14 AdamPC smartd[1032]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 109 to 111
Aug 21 15:57:54 AdamPC smartd[1032]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 111 to 107
Aug 21 16:27:54 AdamPC smartd[1032]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 107 to 106
Aug 21 16:57:54 AdamPC smartd[1032]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 106 to 105
Aug 21 18:21:06 AdamPC smartd[1024]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 105 to 111
Aug 21 18:51:06 AdamPC smartd[1024]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 111 to 109
Aug 21 19:21:07 AdamPC smartd[1024]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 109 to 106
Aug 21 20:26:28 AdamPC smartd[1024]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 106 to 113
Aug 21 21:07:26 AdamPC smartd[998]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 113 to 114
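
A note on reading these lines: by default smartd logs the normalized value of attribute 194 rather than the raw reading, so the 105 to 115 range here is not degrees Celsius. Below is a minimal sketch for pulling both values out of smartctl -A output. It assumes the usual ten-column attribute table (ID#, name, flag, value, worst, thresh, type, updated, when_failed, raw value), and the helper name smart_temp is made up for illustration.

```shell
#!/bin/sh
# Read "smartctl -A" output on stdin and print the normalized and raw values
# of attribute 194 (Temperature_Celsius).
# Assumption: column 4 is the normalized VALUE and column 10 starts RAW_VALUE.
# smart_temp is a hypothetical helper name.
smart_temp() {
    awk '$1 == 194 { print "normalized=" $4, "raw=" $10 }'
}
```

Typical use would be: sudo smartctl -A /dev/sda | smart_temp. On most drives the raw value is the one that reads as degrees Celsius, but that depends on the vendor.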

Screenshot (not reproduced here) showing that my disk isn't burning.

Uninstalling tlp did not solve this problem. My computer suddenly turned off again. This is the syslog output before suspend:

Aug 26 14:03:45 AdamPC NetworkManager[1012]: <info>  [1503745425.2165] manager: sleep requested (sleeping: no  enabled: yes)
Aug 26 14:03:45 AdamPC NetworkManager[1012]: <info>  [1503745425.2166] manager: sleeping...
Aug 26 14:03:45 AdamPC NetworkManager[1012]: <info>  [1503745425.2166] device (wlp2s0): state change: disconnected -> unmanaged (reason 'sleeping') [30 10 37]
Aug 26 14:03:45 AdamPC NetworkManager[1012]: <info>  [1503745425.2314] manager: NetworkManager state is now ASLEEP
Aug 26 14:03:45 AdamPC wpa_supplicant[1172]: nl80211: deinit ifname=wlp2s0 disabled_11b_rates=0
Aug 26 14:03:46 AdamPC systemd[1]: Reached target Sleep.
Aug 26 14:03:46 AdamPC systemd[1]: Starting Suspend...
Aug 26 14:03:46 AdamPC systemd-sleep[6098]: Failed to connect to non-global ctrl_ifname: (nil)  error: No such file or directory
Aug 26 14:03:46 AdamPC systemd-sleep[6099]: /lib/systemd/system-sleep/wpasupplicant failed with error code 255.
Aug 26 14:03:46 AdamPC systemd-sleep[6098]: Suspending system...
Aug 26 14:28:55 AdamPC rsyslogd: [origin software="rsyslogd" swVersion="8.16.0" x-pid="1036" x-info="http://www.rsyslog.com"] start

The first line of the excerpt is from when I closed the lid (which triggered the suspend). The last line is from when I powered the laptop back on after the sudden power-off (note the timestamps). The odd thing is that I get exactly the same log lines during a normal suspend. Complete syslog: http://pasted.co/7354277e
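
The pattern in the excerpt (a "Suspending system..." line followed directly by a fresh rsyslogd start, with no resume in between) can be searched for mechanically. Below is a rough sketch, assuming systemd-sleep logs "System resumed." after a successful wake-up and that rsyslogd logs a line ending in "start" on boot. The function name find_failed_suspends is hypothetical.

```shell
#!/bin/sh
# Print the timestamp of every suspend in a syslog file that was followed by
# a fresh rsyslogd start (i.e. a reboot) instead of a resume.
# Assumptions: systemd-sleep logs "Suspending system..." and "System resumed.",
# and rsyslogd logs "[origin ...] start" when it starts after boot.
find_failed_suspends() {
    awk '
        /systemd-sleep\[[0-9]+\]: Suspending system/ {
            pending = $1 " " $2 " " $3      # remember month, day, time
            next
        }
        /systemd-sleep\[[0-9]+\]: System resumed/ { pending = "" }
        /rsyslogd: \[origin .*\] start$/ {
            if (pending != "") { print pending; pending = "" }
        }
    ' "$1"
}
```

Running find_failed_suspends /var/log/syslog would list the times at which a suspend ended in a power-off, which makes it easier to count how often it happens and to line the events up with other log entries.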


Important Update

I reinstalled tlp after finding that it is not the cause of the problem. I have now noticed something that might help narrow the problem down: I can reproduce the power-off by suspending the system and, while it is suspending, repeatedly plugging and unplugging my mouse (or any other USB device). As soon as I do that, the laptop powers off immediately. I can only reproduce it this way while tlp is installed, so I guess tlp just makes it happen more consistently.

The Update That Solved The Problem

I resumed debugging today. Launched my eyeballs towards /var/log/syslog until I finally found something worthy:

Sep  5 20:19:20 AdamPC kernel: [  167.044965] pci_pm_suspend(): hcd_pci_suspend+0x0/0x30 returns -16
Sep  5 20:19:20 AdamPC kernel: [  167.044971] dpm_run_callback(): pci_pm_suspend+0x0/0x150 returns -16
Sep  5 20:19:20 AdamPC kernel: [  167.044973] PM: Device 0000:00:14.0 failed to suspend: error -16
Sep  5 20:19:20 AdamPC kernel: [  167.044975] PM: Some devices failed to suspend, or early wake event detected

Device 0000:00:14.0? Let's run lspci:

00:14.0 USB controller: Intel Corporation 8 Series USB xHCI HC (rev 04)
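
Error -16 is the kernel's EBUSY ("device or resource busy"). To avoid scanning the log by eye, the failing devices can be extracted with a short filter. This is a sketch that assumes the kernel message format shown above, and the function name failed_suspend_devices is made up for illustration.

```shell
#!/bin/sh
# Print "<PCI address> <error code>" for every device the kernel reported as
# failing to suspend, without duplicates.
# Assumption: messages look like
#   "PM: Device 0000:00:14.0 failed to suspend: error -16"
# failed_suspend_devices is a hypothetical helper name.
failed_suspend_devices() {
    sed -n 's/.*PM: Device \([0-9a-f:.]*\) failed to suspend: error \(-[0-9]*\).*/\1 \2/p' "$1" | sort -u
}
```

Each address it prints can then be looked up with lspci, for example lspci -s 0000:00:14.0.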

After some Google searching, I found the solution on the ArchWiki. See my answer for the details.

Best Answer

From the comments...

We uninstalled tlp and also installed intel-microcode.

Update #1:

To check the file system on your Ubuntu partition...

  • boot to the GRUB menu
  • choose Advanced Options
  • choose Recovery mode
  • choose Root access
  • at the # prompt, type sudo fsck -f /
  • repeat the fsck command if there were errors
  • type reboot
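
To decide whether the fsck needs repeating, its exit status can be checked right after it finishes. fsck returns a sum of flags: 1 means errors were corrected, 2 that the system should be rebooted, 4 that errors were left uncorrected, and 8 that an operational error occurred. Below is a small sketch that decodes the status; the function name describe_fsck_status is hypothetical.

```shell
#!/bin/sh
# Translate an fsck exit status (a bitmask) into readable lines.
# Only the flags relevant to the steps above are decoded.
# describe_fsck_status is a hypothetical helper name.
describe_fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then echo "no errors"; return 0; fi
    if [ $((status & 1)) -ne 0 ]; then echo "errors corrected"; fi
    if [ $((status & 2)) -ne 0 ]; then echo "reboot required"; fi
    if [ $((status & 4)) -ne 0 ]; then echo "errors left uncorrected"; fi
    if [ $((status & 8)) -ne 0 ]; then echo "operational error"; fi
    return 0
}
```

At the recovery prompt this would be used as: fsck -f /; describe_fsck_status $? and the fsck repeated for as long as it keeps reporting corrected or uncorrected errors.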