Debian – Unable to remount / back to read-only after package upgrade

apt debian files lsof readonly

I am using Debian Stretch. My root partition is mounted read-only. An apt hook remounts / read-write only while I install or upgrade packages, and then remounts it read-only again.

Sometimes after package upgrade I am unable to remount / back to read-only:

mount -o remount,ro /
mount: / is busy

On older Debian versions (Wheezy), I could list open files that have been unlinked with lsof:

 lsof +L1

or, more specifically, files that prevent / from being remounted back to ro:

{ lsof +L1 ; lsof|sed -n '/SYSV/d; /DEL\|(path /p;' ; } | grep -Ev '/(dev|home|tmp|var)'

However, on Debian Stretch, lsof +L1 does not list any files.

I don't see any changes to +|-L in man lsof that would explain why it stopped working.

Why does lsof +L1 no longer list open files that have been unlinked?

How can I list those files that prevent / from being remounted to read-only?

UPDATE

I have stopped all processes that can be stopped, and only have init and getty still running, but I still cannot remount / to ro.

Best Answer

How can I list those files that prevent / from being remounted to read-only?

A) fuser can be found in the psmisc package; this is a use case where I find fuser shines & is more useful than lsof.

# fuser -v -m / 2>&1 | grep '[Ff]r.e'

That will show all processes that have files open on / for reading (f) and writing (F). The files that would prevent / from being remounted to read-only are those that are opened for writing (F).
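To look before killing anything, the ACCESS column can be parsed instead of grepped. This is only a minimal sketch: it assumes the usual psmisc layout, where a five-character ACCESS field directly follows the PID, and writers_from_fuser is a hypothetical helper name, not part of psmisc. Unlike the grep above, it prints every PID whose ACCESS field starts with F, whether or not r and e are also set.

```shell
# Hypothetical helper: read `fuser -v -m <mountpoint> 2>&1` output on stdin
# and print the PIDs whose ACCESS field starts with F (open for writing).
# Assumption: ACCESS is five characters, e.g. "Frce." or "..c..", and it is
# the field immediately after the numeric PID.
writers_from_fuser() {
    awk '{
        for (i = 2; i <= NF; i++)
            if ($(i-1) ~ /^[0-9]+$/ && $i ~ /^[Ff.][r.][c.][e.][m.]$/) {
                if (substr($i, 1, 1) == "F")
                    print $(i-1)
                next    # ACCESS field found, skip the rest of the line
            }
    }'
}
```

It would be used as: fuser -v -m / 2>&1 | writers_from_fuser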

Kill the processes whose executable is being run from / and that have files there open for writing, e.g.

# for fupid in $(fuser -v -m / 2>&1 | grep Fr.e | awk '{print $2}'); do kill $fupid; done

The above comes with a caveat about systemd. If systemd is init, fuser will list it too, and there are other considerations: systemd can (re)start processes behind your back, even ones that fuser has just identified and you have just killed. systemd is much more advanced than the traditional sysvinit.

B) The UPDATE in the description states the system only has ... init and getty still running ...

I see the comment that says the system is not using systemd but init. On stretch, systemd is init by default. The comment didn't explicitly say sysvinit, so I'm assuming the system in question may be using the default stretch systemd as init. If it isn't, other people who stumble on this post and do use stretch's systemd may still find this part useful.

Per the Debian Wiki,

The system initialization process is handled by the init daemon. In squeeze and earlier releases, that daemon is provided by the sysvinit package, and no alternatives are supported. In wheezy, the default init daemon is still sysvinit, but a "technology preview" of systemd is available. In jessie and stretch, the default init system is systemd, but switching to sysvinit is supported.

Since jessie, only systemd is fully supported; sysvinit is mostly supported, but Debian packages are not required to provide sysvinit start scripts. runit is also packaged, but has not received the same level of testing and support as the others, and is not currently supported as PID 1.
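Since the next steps depend on which init is actually PID 1, it may be worth checking first. A minimal sketch, assuming the test systemd itself documents for sd_booted() (the directory /run/systemd/system exists only when systemd is the running init). detect_init and its optional root-prefix argument are hypothetical names used for illustration.

```shell
# Hypothetical helper: print "systemd" if systemd is the running init,
# "other" otherwise (sysvinit, runit, ...).
# The optional argument is a root prefix, only useful for testing or for
# inspecting a chroot; it defaults to the real root.
detect_init() {
    root=${1:-}
    if [ -d "$root/run/systemd/system" ]; then
        echo systemd
    else
        echo other
    fi
}
```

Called without arguments, detect_init reports on the running system; if it prints "other", the systemd steps in this part can be skipped.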

With systemd running, there are a few additional steps that should be taken to free up / so that it can be remounted without issue.

It's likely system.slice is holding open files for systemd-journald.service or systemd-udevd.service (both of which have socket dependencies). Or, if NetworkManager is running, it can respawn dhclient, which writes leases to /var/... (& /var/ isn't always its own device), etc. fuser might find dhclient and you might kill it, but NetworkManager starts it right back up.

The moral is lots of things are automated that could 'want' / (and even more so with systemd).

To be sure, if it's feasible, drop to the systemd equivalent of runlevel 1, which is rescue.target (runlevel1.target is a symbolic link to rescue.target).

1) Start by isolating the system to rescue.target

# systemctl isolate rescue.target

It should prompt you to enter the root password; follow the on-screen instructions.

2) At the rescue shell, find out what wants /.

# systemctl show -p Wants /

Typically, it's system.slice; stop everything that Wants /. e.g.

# systemctl stop system.slice

3) At this point, the remount should not report mount: / is busy and mount -o remount,ro / should work. If not, check again with fuser.

4) FWIW, I've also seen umount fail when another device is mounted on a sub-directory of the mount in question, i.e. nested mounts. For example, umount / would fail if /var/ or /boot/ is on another device (and mounted). mount -o remount,ro / should still work in this case, though.

lsblk can be helpful to visualize nested mounts.
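As a scriptable complement to lsblk, nested mounts can also be read from the kernel's mount table. A minimal sketch, assuming the standard /proc/self/mounts layout (device, mount point, type, options); nested_mounts and its second argument are hypothetical names, and the second argument exists only so the function can be tried against a saved copy of the table.

```shell
# Hypothetical helper: list "device mountpoint" for every file system
# mounted below the given mount point (default /).
# Note: for / this includes virtual file systems such as /proc and /sys.
nested_mounts() {
    mountpoint=${1:-/}
    mounts_file=${2:-/proc/self/mounts}
    awk -v mp="$mountpoint" '
        BEGIN { prefix = (mp == "/") ? "/" : mp "/" }
        # keep entries strictly below mp, not mp itself
        $2 != mp && index($2, prefix) == 1 { print $1, $2 }
    ' "$mounts_file"
}
```

For example, nested_mounts /var would show anything mounted under /var that has to be unmounted before /var itself.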

Why does lsof +L1 no longer list open files that have been unlinked?

Because link counts aren't available for them (sockets or most FIFOs & pipes), they're not open files anymore (the parent process closed the file descriptor), or they (still) have a link count of 1 or more.

man lsof(8) details ...

+|-L [l]

This option enables ('+') or disables ('-') the listing of file link counts, where they are available - e.g., they aren't available for sockets, or most FIFOs and pipes.

When +L is specified without a following number, all link counts will be listed. When -L is specified (the default), no link counts will be listed.

When +L is followed by a number, only files having a link count less than that number will be listed. (No number may follow -L.) A specification of the form "+L1" will select open files that have been unlinked. A specification of the form +aL1 <file_system> will select unlinked open files on the specified file system.
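When lsof +L1 comes up empty, one alternative is to read /proc directly. The sketch below looks only at memory-mapped files marked "(deleted)" in /proc/PID/maps, which roughly corresponds to the DEL entries the sed filter in the question was after; after an upgrade these may be old libraries still mapped by running processes. list_deleted_maps and its optional proc-root argument are hypothetical names, and the function needs root to see other users' processes.

```shell
# Hypothetical helper: print "PID path" for every mapped file that the
# kernel marks as "(deleted)". SysV shared memory segments (/SYSV...) are
# skipped, as in the sed filter from the question.
# The optional argument is the proc directory (default /proc).
list_deleted_maps() {
    proc_root=${1:-/proc}
    for maps in "$proc_root"/[0-9]*/maps; do
        [ -r "$maps" ] || continue
        pid=${maps%/maps}      # strip the trailing /maps
        pid=${pid##*/}         # keep only the PID directory name
        awk -v pid="$pid" '
            $NF == "(deleted)" && $6 ~ /^\// && $6 !~ /^\/SYSV/ { print pid, $6 }
        ' "$maps" 2>/dev/null
    done | sort -u
}
```

The output can be narrowed to the root file system with the same grep -Ev '/(dev|home|tmp|var)' filter used in the question.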
