Mount and Hardlink – Why Can’t Create Hardlink from a Bind Mount Directory?

bind-mountlnmount

Original Problem

I have a file on one filesystem: /data/src/file

and I want to hard link it to: /home/user/proj/src/file

but /home is on one disk, and /data is on another so I get an error:

$ cd /home/user/proj/src
$ ln /data/src/file .
ln: failed to create hard link './file' => '/data/src/file': Invalid cross-device link

Okay, so I learned I can't hard link across devices. Makes sense.

Problem at hand

So I thought I'd get fancy and bind mount a src folder that's on /data's file system:

$ mkdir -p /data/other/src
$ cd /home/user/proj
$ sudo mount --bind /data/other/src src/
$ cd src
$ # (now we're technically on `/data`'s file system, right?)
$ ln /data/src/file .
ln: failed to create hard link './file' => '/data/src/file': Invalid cross-device link

Why does this still not work?

Workaround

I know I have this setup right because I can make the hard link as long as I'm in the "real" /data directory instead of the bound one.

$ cd /data/other/src
$ ln /data/src/file .
$ # OK
$ cd /home/user/proj/src
$ ls -lh
total 35M
-rw------- 2 user user 35M Jul 17 22:22 file

$

Some System Info

$ uname -a
Linux <host> 4.10.0-24-generic #28-Ubuntu SMP Wed Jun 14 08:14:34 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

$ findmnt
.
.
.
├─/home                               /dev/sdb8   ext4       rw,relatime,data=ordered
│ └─/home/usr/proj/src             /dev/sda2[/other/src]
│                                                 ext4       rw,relatime,data=ordered
└─/data                               /dev/sda2   ext4       rw,relatime,data=ordered

$ mountpoint -d /data
8:2

$ mountpoint -d /home/usr/proj/src/
8:2

Note: I manually changed the file and directory names to make the situation more clear, so there may be a typo or two in the command readouts.

Best Answer

There's a disappointing lack of comments in the code. It's as if no-one ever thought it useful, since the time bind mounts were implemented in v2.4. Surely all you'd need to do is substitute .mnt->mnt_sb where it says .mnt...

Because it gives you a security boundary around a subtree.

PS: that had been discussed quite a few times, but to avoid searches: consider e.g. mount --bind /tmp /tmp; now you've got a situation when users can't create links to elsewhere no root fs, even though they have /tmp writable to them. Similar technics works for other isolation needs - basically, you can confine rename/link to given subtree. IOW, it's a deliberate feature. Note that you can bind a bunch of trees into chroot and get predictable restrictions regardless of how the stuff might get rearranged a year later in the main tree, etc.

-- Al Viro

There's a concrete example further down the thread

Whenever we get mount -r --bind working properly (which I use to place copies of necessary shared libraries inside chroot jails while allowing page cache sharing), this feature would break security.

mkdir /usr/lib/libs.jail
for i in $LIST_OF_LIBRARIES; do
ln /usr/lib/$i /usr/lib/libs.jail/$i
done
mount -r /usr/lib/libs.jail /jail/lib
chown prisoner /usr/log/jail
mount /usr/log/jail /jail/usr/log
chrootuid /jail prisoner /bin/untrusted &

Although protections should be enough, but I'd rather avoid having the prisoner link /jail/lib/libfoo.so (write returns EROFS) to /jail/usr/log where it's potentially writeable.