Linux – Why doesn’t mount respect the read only option for bind mounts

On my Arch Linux system (Linux kernel 3.14.2), bind mounts do not respect the read-only option:

# mkdir test
# mount --bind -o ro test/ /mnt
# touch /mnt/foo

This creates the file /mnt/foo. The relevant entry in /proc/mounts is:

/dev/sda2 /mnt ext4 rw,noatime,data=ordered 0 0

The mount options do not match the options I requested, but they do match both the read/write behaviour of the bind mount and the options originally used to mount /dev/sda2 on /:

/dev/sda2 / ext4 rw,noatime,data=ordered 0 0

If, however, I remount the bind mount, then it respects the read-only option:

# mount --bind -o remount,ro test/ /mnt
# touch /mnt/bar
touch: cannot touch ‘/mnt/bar’: Read-only file system

and the relevant entry in /proc/mounts

/dev/sda2 /mnt ext4 ro,relatime,data=ordered 0 0

looks like what I might expect (although in truth I would expect to see the full path of the test directory). The entry in /proc/mounts for the original mount of /dev/sda2 on / is also unchanged and remains read/write:

/dev/sda2 / ext4 rw,noatime,data=ordered 0 0
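
For checking this without reading /proc/mounts by eye, a small helper along the following lines might do. It is only a sketch: the name is_mounted_ro is made up for illustration, and it assumes the usual /proc/mounts layout in which the second field is the mount point and the fourth is the comma-separated option list.

```shell
# Succeed (exit 0) if the given mount point is listed with the "ro" option,
# exit 1 if it is listed without it, and exit 2 if it is not listed at all.
# $1 = mount point, $2 = mounts-format file (default: /proc/mounts)
is_mounted_ro() {
    awk -v mp="$1" '
        # Later entries shadow earlier ones, so keep the last match.
        $2 == mp { opts = $4; found = 1 }
        END {
            if (!found) exit 2
            # Wrap in commas so that only a whole "ro" option matches.
            exit (("," opts ",") ~ /,ro,/) ? 0 : 1
        }' "${2:-/proc/mounts}"
}
```

For example, `is_mounted_ro /mnt && echo "read-only"` would print nothing after the first mount command above and "read-only" after the remount.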

This behaviour and the workaround have been known since at least 2008 and are documented in the man page of mount:

Note that the filesystem mount options will remain the same as those on the original mount point, and cannot be changed by passing the -o option along with --bind/--rbind. The mount options can be changed by a separate remount command.

Not all distributions behave the same. Arch seems to silently fail to respect the options, while Debian generates a warning when the bind mount does not get mounted read-only:

mount: warning: /mnt seems to be mounted read-write.

There are reports that this behaviour was "fixed" in Debian Lenny and Squeeze, although it does not appear to be a universal fix, nor does it still work in Debian Wheezy. What is the difficulty associated with making a bind mount respect the read-only option on the initial mount?

Best Answer

Bind mount is just... well... a bind mount. I.e. it's not a new mount. It just "links"/"exposes"/"considers" a subdirectory as a new mount point. As such it cannot alter the mount parameters. That's why you're getting complaints:

# mount /mnt/1/lala /mnt/2 -o bind,ro
mount: warning: /mnt/2 seems to be mounted read-write.

But as you said a normal bind mount works:

# mount /mnt/1/lala /mnt/2 -o bind

And then a ro remount also works:

# mount /mnt/1/lala /mnt/2 -o bind,remount,ro

However what happens is that you're changing the whole mount and not just this bind mount. If you take a look at /proc/mounts you'll see that both bind mount and the original mount change to read-only:

/dev/loop0 /mnt/1 ext2 ro,relatime,errors=continue,user_xattr,acl 0 0
/dev/loop0 /mnt/2 ext2 ro,relatime,errors=continue,user_xattr,acl 0 0

So what you're doing is like changing the initial mount to a read-only mount and then doing a bind mount which will of course be read-only.

UPDATE 2016-07-20:

The following are true for 4.5 kernels, but not true for 4.3 kernels (This is wrong. See update #2 below):

The kernel has two flags that control read-only:

MS_RDONLY: indicates whether the filesystem itself (the superblock) is read-only
MNT_READONLY: indicates whether the "user" wants this particular mount point read-only

On a 4.5 kernel, doing a mount -o bind,ro will actually do the trick. For example, this:

# mkdir /tmp/test
# mkdir /tmp/test/a /tmp/test/b
# mount -t tmpfs none /tmp/test/a
# mkdir /tmp/test/a/d
# mount -o bind,ro /tmp/test/a/d /tmp/test/b

will create a read-only bind mount of /tmp/test/a/d to /tmp/test/b, which will be visible in /proc/mounts as:

none /tmp/test/a tmpfs rw,relatime 0 0
none /tmp/test/b tmpfs ro,relatime 0 0

A more detailed view is visible in /proc/self/mountinfo, which takes into consideration the user view (namespace). The relevant lines will be these:

363 74 0:49 / /tmp/test/a rw,relatime shared:273 - tmpfs none rw
368 74 0:49 /d /tmp/test/b ro,relatime shared:273 - tmpfs none rw

On the second line, you can see that it says both ro (MNT_READONLY, the per-mount option) and rw (!MS_RDONLY, the superblock option).

The end result is this:

# echo a > /tmp/test/a/d/f
# echo a > /tmp/test/b/f
-su: /tmp/test/b/f: Read-only file system
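
To pull the two read-only indicators out of /proc/self/mountinfo for every mount at once, something like the following could be used. This is a sketch under the assumption that the file follows the layout documented in proc(5): six fixed fields, then optional fields, then a lone "-", then filesystem type, source and superblock options. The function name mount_ro_state is invented for this example.

```shell
# Print "<mount point> <per-mount options> <superblock options>" for each
# line of a mountinfo-format file.
# $1 = file to read (default: /proc/self/mountinfo)
mount_ro_state() {
    awk '{
        # Skip the optional fields (e.g. shared:273) up to the "-" separator.
        i = 7
        while (i <= NF && $i != "-") i++
        # After "-": filesystem type, source, superblock options.
        print $5, $6, $(i + 3)
    }' "${1:-/proc/self/mountinfo}"
}
```

A read-only bind mount of a writable filesystem then shows up as a line whose second column starts with ro while the third starts with rw.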

UPDATE 2016-07-20 #2:

A bit more digging into this shows that the behavior in fact depends on the version of libmount, which is part of util-linux. Support for this was added with this commit and was released with version 2.27:

commit 9ac77b8a78452eab0612523d27fee52159f5016a
Author: Karel Zak 
Date:   Mon Aug 17 11:54:26 2015 +0200

    libmount: add support for "bind,ro"

    Now it's necessary t use two mount(8) calls to create a read-only
    mount:

      mount /foo /bar -o bind
      mount /bar -o remount,ro,bind

    This patch allows to specify "bind,ro" and the remount is done
    automatically by libmount by additional mount(2) syscall. It's not
    atomic of course.

    Signed-off-by: Karel Zak

which also provides the workaround. The behavior can be seen using strace on an older and a newer mount:

Old:

mount("/tmp/test/a/d", "/tmp/test/b", 0x222e240, MS_MGC_VAL|MS_RDONLY|MS_BIND, NULL) = 0 <0.000681>

New:

mount("/tmp/test/a/d", "/tmp/test/b", 0x1a8ee90, MS_MGC_VAL|MS_RDONLY|MS_BIND, NULL) = 0 <0.011492>
mount("none", "/tmp/test/b", NULL, MS_RDONLY|MS_REMOUNT|MS_BIND, NULL) = 0 <0.006281>
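
Traces like these can be captured with strace -T -e trace=mount in front of the mount command. To tell the two behaviours apart in a saved log, a rough check such as the one below might help. It is a sketch only: has_ro_remount is a made-up name, and it simply looks for a line that mentions the target together with MS_REMOUNT, MS_RDONLY and MS_BIND.

```shell
# Succeed if the strace log contains a read-only bind remount of the target,
# i.e. the second mount(2) call that newer libmount versions issue.
# $1 = strace log file, $2 = mount target as it appears in the log
has_ro_remount() {
    grep -F "\"$2\"" "$1" |
        grep 'MS_REMOUNT' |
        grep 'MS_RDONLY' |
        grep -q 'MS_BIND'
}
```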

Conclusion:

To achieve the desired result one needs to run two commands (as @Thomas already said):

mount SRC DST -o bind
mount DST -o remount,ro,bind

Newer versions of mount (util-linux >=2.27) do this automatically when one runs

mount SRC DST -o bind,ro
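
In a script that has to run on both old and new systems, the two-step form is the safer choice since it works either way. A possible sketch follows; the function names are invented here, the version check assumes that mount --version prints a banner containing "util-linux <version>", and it relies on sort -V, which GNU coreutils and BusyBox provide but POSIX does not.

```shell
# Succeed if the version banner reports util-linux 2.27 or newer, i.e. a
# mount that turns "bind,ro" into the two syscalls by itself.
# $1 = banner text, e.g. the output of: mount --version
mount_handles_bind_ro() {
    ver=$(printf '%s\n' "$1" | sed -n 's/.*util-linux \([0-9][0-9.]*\).*/\1/p')
    [ -n "$ver" ] || return 1
    # With version sort, 2.27 comes first only if $ver is 2.27 or newer.
    [ "$(printf '%s\n' 2.27 "$ver" | sort -V | head -n 1)" = "2.27" ]
}

# Create a read-only bind mount with the explicit two-step sequence.
# Must be run as root. $1 = source directory, $2 = destination mount point
bind_ro() {
    mount "$1" "$2" -o bind &&
        mount "$2" -o remount,ro,bind
}
```

Usage would be along the lines of `bind_ro /tmp/test/a/d /tmp/test/b`; the check can be fed with `mount_handles_bind_ro "$(mount --version)"`. Note that, as the commit message says, neither variant is atomic: the mount is briefly writable between the two steps.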

Related Solutions

Unable to mount raid on the NAS, trying to rescue the data, how should I proceed

This is an attempt to summarize from the chat troubleshooting session.

The setup turns out to be physical disk -> mdraid raid1 -> LVM. So there are several layers to work through. The old setup was (due to unfortunate prior recovery efforts) not available.

However, the NAS GUI had been used to create another volume on a different disk, and thankfully the GUI created the new volume exactly the same way. So it was possible to discover the setup from the new disk:

mdadm -E new-disk provided the offset to the start of the data, under the mdraid layer (2048 sectors).
dmsetup table provided the start block of the logical volume (relative to the start of the physical volume) (1152 sectors)
An ext4 volume has a magic number in its third sector (0xEF53, stored little-endian, so it shows up as the bytes 53 ef in a hex dump). Using dd and xxd, we verified that the magic number is present at that offset on the disk we're trying to recover data from.

Armed with the start sector of the ext4 filesystem, you can use a read-only loop device to recover the data:

# losetup /dev/loop0 -o $((512*(1152+2048))) -r /dev/sda1
# mount -t ext4 -o ro /dev/loop0 /mnt

And then copy it off.
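
The offset arithmetic and the magic-number check can be scripted roughly as follows. This is a sketch, not the exact commands used in the chat session: the function names are made up, od is used in place of xxd because it is more widely installed, and the check relies on the ext2/3/4 superblock starting 1024 bytes into the filesystem with the magic at byte 56 inside it (1080 bytes from the start of the filesystem).

```shell
# Byte offset of the filesystem on the raw partition.
# $1 = start sector of the logical volume, $2 = mdraid data offset in sectors
fs_offset() {
    echo $((512 * ($1 + $2)))
}

# Print, as hex, the two bytes found where the ext magic should be.
# $1 = device or image file, $2 = filesystem byte offset
ext_magic_at() {
    dd if="$1" bs=1 skip=$(($2 + 1080)) count=2 2>/dev/null |
        od -An -tx1 | tr -d ' \n'
}
```

If `ext_magic_at /dev/sda1 "$(fs_offset 1152 2048)"` prints 53ef, the same offset can be handed to losetup -o.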

Linux – Determine What Device a Directory Is Located On

If I understand your question you want to know which device was used for a given mount. For this you can use the df command:

$ df -h 
Filesystem                         Size  Used Avail Use% Mounted on
/dev/mapper/fedora_greeneggs-root   50G   21G   27G  44% /
devtmpfs                           3.8G     0  3.8G   0% /dev
tmpfs                              3.8G   14M  3.8G   1% /dev/shm
tmpfs                              3.8G  984K  3.8G   1% /run
tmpfs                              3.8G     0  3.8G   0% /sys/fs/cgroup
tmpfs                              3.8G  3.4M  3.8G   1% /tmp
/dev/sda1                          477M   99M  349M  23% /boot
/dev/mapper/fedora_greeneggs-home  402G  184G  198G  49% /home

To find which device a particular file/directory is found on, give the file as an argument to df. Using your example:

$ df -h /mnt
Filesystem                         Size  Used Avail Use% Mounted on
/dev/sda1                          477M   99M  349M  23% /

You can also use the mount command:

$ mount | grep '^/dev'
/dev/mapper/fedora_greeneggs-root on / type ext4 (rw,relatime,seclabel,data=ordered)
/dev/sda1 on /boot type ext4 (rw,relatime,seclabel,data=ordered)
/dev/mapper/fedora_greeneggs-home on /home type ext4 (rw,relatime,seclabel,data=ordered)

The directory where each device is mounted is the 3rd field in the output above. So for device /dev/sda1 it would be /boot. The other devices are making use of LVM (Logical Volume Management) and would need to be further queried to know which actual device is being used by LVM.
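
For use in a script, the df lookup can be wrapped up as below. This is just a sketch with invented function names; it uses df -P so that each filesystem is reported on a single line, and it assumes that the device name contains no whitespace.

```shell
# Print the device (first column of df) that holds the given path.
device_of() {
    df -P "$1" | awk 'NR == 2 { print $1 }'
}

# Print the mount point (last column of df) of the filesystem holding the path.
mountpoint_of() {
    df -P "$1" | awk 'NR == 2 { print $NF }'
}
```

For instance, `device_of /home/user/file` prints the same device that `df -h /home/user/file` shows in its first column.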