Ubuntu – ecryptfs Mounted Home Folder Disappears When Samba Closes Session

Tags: ecryptfs, kde, mount, Ubuntu

On Ubuntu 20.04 with KDE Plasma (no, not Kubuntu!) – and I have encountered this with (vanilla) GNOME before – something strange happens every few hours or so, for which I have no explanation or remedy as of yet.

Somehow the ecryptfs-encrypted home folder, which gets mounted when I log on, "disappears" out of the blue. I mostly notice it through weird symptoms, such as all sorts of programs reporting files from $HOME that they cannot find, that they deem corrupt, or that they simply cannot open.

The first time this happens, I can usually run /usr/bin/ecryptfs-mount-private, enter my passphrase and be done with it. Alas, this still doesn't restore the functionality of certain KDE desktop elements. For example, I am unable to search for installed programs from that point on, so everything that isn't already running becomes unavailable until I log off and back on.

On subsequent occurrences, when I try /usr/bin/ecryptfs-mount-private again, I usually see:

$ /usr/bin/ecryptfs-mount-private
Enter your login passphrase:
Inserted auth tok with sig [2123456789012312] into the user session keyring
mount: No such file or directory

Even logging off in such a situation becomes a minor nightmare, as you can see from the following screenshot. The dialogs pop up merely because I am opting to log off!

[Screenshot: error dialogs when attempting to log off]

So my questions (yeah, plural … since I'm currently at a loss how to even start diagnosing this):

  1. which entity could be causing this automatic removal of my $HOME? … I was reminded of weird behavior like when sessions get purged when you log off and so suddenly your Screen or Tmux sessions also get killed (unless you use loginctl with enable-linger)
  2. what are the steps to troubleshoot such an issue? (keep in mind that the desktop behaves all weird when this happens!). I tried to look at journalctl output and in the logs with ripgrep, but I don't really know what terms to look for …
  3. suppose this is a known bug, what's the workaround if any?

It reminds me a bit of Tmux/Screen getting killed when logging out, something I'd not normally expect and that can be prevented only by starting Tmux/Screen after logging into SSH (i.e. separate login session) or enabling session lingering.


The one thing I found with journalctl which seems odd and correlates to the "lost" home directory is the following:

Sep 01 23:39:11 machine smbd[220424]: pam_unix(samba:session): session closed for user johndoe
Sep 01 23:39:11 machine systemd[1]: home-johndoe.mount: Succeeded.
-- Subject: Unit succeeded
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
--
-- The unit home-johndoe.mount has successfully entered the 'dead' state.
Sep 01 23:39:11 machine systemd[1977]: home-johndoe.mount: Succeeded.
-- Subject: Unit succeeded
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
--

… but that would indicate that something caused by the Samba daemon on behalf of my interactive user account leads to another part of the system assuming that I logged off and unmounting my $HOME … that sounds exceedingly unlikely, no?

The above pattern, pam_unix(samba:session) closing a session for my username followed by the $HOME folder becoming inaccessible, is the smoking gun, but also the only one so far. I am currently reading up on how this whole session business is supposed to work and why that mount unit "thinks" it can "reap" my mounted home folder while I am still interactively logged on.
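To check how often this pattern occurs, the journal can be filtered for a samba:session close that is directly followed by a mount unit going away. The following is only a minimal sketch: correlate_samba_unmount is a hypothetical helper name, and it assumes the two log lines appear back to back (apart from the "--" explanation lines that journalctl -x adds), as in the excerpt above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original post): read journal text on
# stdin and print every "samba:session ... session closed" line that is
# followed by a "<unit>.mount: Succeeded." line, together with that line.
correlate_samba_unmount() {
    awk '
        # Remember the most recent samba session-close line.
        /pam_unix\(samba:session\): session closed/ { pending = $0; next }

        # A mount unit finishing right after it: report both lines.
        pending != "" && /\.mount: Succeeded\./ {
            print pending
            print $0
            pending = ""
            next
        }

        # "--" lines are journalctl -x explanations; ignore them without
        # discarding the remembered session-close line.
        /^--/ { next }

        # Any other line breaks the adjacency, so forget the pending line.
        { pending = "" }
    '
}
```

Assuming the function has been sourced into the current shell, it could be used as `journalctl -b --no-pager | correlate_samba_unmount`.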

Edit #1: since the comment indicates that the configuration of Samba could be relevant, I am adding it here. I replaced my actual username with johndoe in the dump from testparm:

# Global parameters
[global]
debug uid = Yes
dns proxy = No
guest account = johndoe
log file = /var/log/samba/log.%m
map to guest = Bad Password
max log size = 1000
obey pam restrictions = Yes
panic action = /usr/share/samba/panic-action %d
passwd chat = *Enter\snew\s*\spassword:* %n\n *Retype\snew\s*\spassword:* %n\n *password\supdated\ssuccessfully* .
passwd program = /usr/bin/passwd %u
security = USER
server role = standalone server
server string = %h server (Samba, Ubuntu)
syslog = 7
syslog only = Yes
workgroup = NULL
idmap config * : backend = tdb

[sharename]
force create mode = 0660
force directory mode = 0770
guest ok = Yes
guest only = Yes
path = /data/sharedir
read only = No

As you can tell, nothing special, but my guess is that "defaulting" to my own user as the guest account via the global setting somehow causes a login session to appear for my user.

The only entries with the samba:session marker are a handful more like the log line reproduced above.

Edit #2: my /etc/pam.d/samba looks like this:

@include common-auth
@include common-account
@include common-session-noninteractive

… and so I attempted to edit those referenced files and add debug (separated by a blank space) on every line that referenced either pam_unix or pam_ecryptfs. The result – after a reboot – was that I could no longer log into KDE at all. It simply stalled. So I used one of the other terminals to log on as root and revert my changes (which thanks to etckeeper was trivial).

Edit #3: a temporary workaround is to keep my user's processes from being killed at logout, either by setting KillExcludeUsers=root johndoe in /etc/systemd/logind.conf or "locally" by enabling lingering via loginctl. Which makes this seem more and more like a defect.

Edit #4: the workaround turned out not to work.

Best Answer

Well, that's stupid of course, since I "wasted" 200 reputation on a bounty mere hours ago, but I seem to have solved the puzzle. Anyone who provides hints on what to look out for and try that are more straightforward than mine will get the bounty.

Alright, so it turned out that pam_unix from the logs was an important clue. I was able in the end to provoke the situation and thereby reproduce the unmounting reliably.

What I did is also described in the respective ticket on launchpad.net, but I'll reproduce the relevant parts which aren't in the question above here.

My smb.conf before I dug into this issue looked like this as per testparm output:

# Global parameters
[global]
debug uid = Yes
dns proxy = No
guest account = johndoe
log file = /var/log/samba/log.%m
map to guest = Bad Password
max log size = 1000
obey pam restrictions = Yes
panic action = /usr/share/samba/panic-action %d
passwd chat = *Enter\snew\s*\spassword:* %n\n *Retype\snew\s*\spassword:* %n\n *password\supdated\ssuccessfully* .
passwd program = /usr/bin/passwd %u
security = USER
server role = standalone server
server string = %h server (Samba, Ubuntu)
syslog = 7
syslog only = Yes
workgroup = NULL
idmap config * : backend = tdb

[sharename]
force create mode = 0660
force directory mode = 0770
guest ok = Yes
guest only = Yes
path = /data/sharedir
read only = No

I opted for a sort of brute-force trial-and-error method. In Tmux I had several panes open while attempting to produce an MWE for a defect report. This was effectively what I was running, one command per pane:

  1. while mountpoint /home/johndoe; do sudo service smbd restart; date; sleep 2s ; done
  2. watch 'mount|grep ecryptfs'
  3. sudo tail -F /var/log/auth.log|grep samba:session

... in another Tmux window I then edited/saved the /etc/samba/smb.conf.

Bang!

The auth.log showed the log entry (smbd[144802]: pam_unix(samba:session): session closed for user johndoe) and the mount point vanished.

I had found how to reproduce the annoying condition at last.
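For matching the moment of the unmount against the auth.log entry, a small watcher that prints a timestamp once the mount point is gone can replace eyeballing the watch pane. This is just a sketch under assumptions: wait_for_unmount is a hypothetical name, and it relies on the same mountpoint utility used in the loop above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original post).
# wait_for_unmount DIR [INTERVAL]
# Poll DIR every INTERVAL seconds (default: 2) and return as soon as it is
# no longer a mount point, printing a timestamp for comparison with auth.log.
wait_for_unmount() {
    dir=$1
    interval=${2:-2}

    # mountpoint -q exits with 0 while DIR is a mount point.
    while mountpoint -q "$dir"; do
        sleep "$interval"
    done

    printf '%s: %s is no longer a mount point\n' "$(date '+%F %T')" "$dir"
}
```

Running `wait_for_unmount /home/johndoe` in one pane while editing and saving smb.conf in another should yield a timestamp close to the one on the samba:session line, if the two events are related as described.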

Given its name my first pick was indeed the obey pam restrictions setting. So I set it to no (but I could have simply commented it out, because it defaults to no).

I restarted the smbd service, logged off and back in, and attempted to reproduce the error condition again.

This time it could not be reproduced. So evidently the obey pam restrictions setting had influenced this whole pam_unix and samba:session business.
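To double-check which value is actually in effect, the testparm dump can be filtered for the setting. The sketch below assumes the dump format shown earlier ("key = value" lines under a [global] header); global_param is a hypothetical helper, not a Samba tool.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original post).
# global_param NAME
# Read a testparm-style dump on stdin and print the value of parameter NAME
# from the [global] section, if it is listed there.
global_param() {
    awk -v want="$1" '
        # Track whether we are inside the [global] section.
        /^\[/ { in_global = (tolower($0) == "[global]"); next }

        in_global {
            line = $0
            sub(/^[ \t]+/, "", line)          # drop leading indentation
            eq = index(line, "=")
            if (eq == 0) next                 # not a "key = value" line
            key = substr(line, 1, eq - 1)
            val = substr(line, eq + 1)
            sub(/[ \t]+$/, "", key)           # trim around the "=" sign
            sub(/^[ \t]+/, "", val)
            if (tolower(key) == tolower(want)) print val
        }
    '
}
```

A possible invocation is `testparm -s 2>/dev/null | global_param "obey pam restrictions"`. Since testparm without -v lists only non-default settings, empty output here would mean the default (no) applies.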

Edit #1: in the mentioned ticket further information was requested. In particular in pam-auth-update I was asked to deactivate all but the Unix authentication setting. Like this:

[*] Unix authentication
[ ] Register user sessions in the systemd control group hierarchy
[ ] Create home directory on login
[ ] eCryptfs Key/Mount Management
[ ] Inheritable Capabilities Management

And it turned out that the issue was not the second, systemd-related setting but the fourth one: eCryptfs Key/Mount Management.
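Because /etc/pam.d/samba pulls in the common-* files, it can help to see which PAM files reference pam_ecryptfs at all. A minimal sketch follows; pam_files_using is a hypothetical helper, and the pattern simply skips lines where the module name only appears after a "#".

```shell
#!/bin/sh
# Hypothetical helper (not part of the original post).
# pam_files_using MODULE [DIR]
# List the files in DIR (default: /etc/pam.d) that mention MODULE on a line
# where it is not preceded by a "#" comment marker.
pam_files_using() {
    module=$1
    dir=${2:-/etc/pam.d}

    # -l prints only the names of matching files; errors for unreadable
    # entries or subdirectories are discarded.
    grep -l -E "^[^#]*${module}" "$dir"/* 2>/dev/null
}
```

For example, `pam_files_using pam_ecryptfs` lists the candidates. If common-session-noninteractive is among them, the samba PAM stack shown in the question includes that module in its session handling.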

Lessons learned

  1. don't place a bounty if you are going to investigate it yourself
  2. cargo cult garbage can really harm what you're doing ... this particular setting was one I had sort of carried around in my configuration management for smb.conf while evidently it could have been thrown out by now ... oh well
  3. if all else fails, brute force and trial & error seem to be viable methods to hunt down a root cause