I frequently move directory trees to other locations or copy their tarballs to other machines, and I would like to have a method to check whether any symlinks in a directory tree A point to locations outside of A since these will be broken in the moved / copied directory.
Find all symbolic links in a directory tree pointing outside that tree
symlink
Related Solutions
With mount --bind, a directory tree exists in two (or more) places in the directory hierarchy. This can cause a number of problems. Backups and other file copies will pick all copies. It becomes difficult to specify that you want to copy a filesystem: you'll end up copying the bind-mounted files twice. Searches with find, grep -r, locate, etc., will traverse all the copies, and so on.
You will not gain any “increased functionality and compatibility” with bind mounts. They look like any other directory, which most of the time is not desirable behavior. For example, Samba exposes symbolic links as directories by default; there is nothing to gain with using a bind mount. On the other hand, bind mounts can be useful to expose directory hierarchies over NFS.
You won't have any performance issues with bind mounts. What you'll have is administration headaches. Bind mounts have their uses, such as making a directory tree accessible from a chroot, or exposing a directory hidden by a mount point (this is usually a transient use while a directory structure is being remodeled). Don't use them if you don't have a need.
Only root can manipulate bind mounts. They can't be moved by ordinary means; they lock their location and the ancestor directories.
Generally speaking, if you pass a symbolic link to a command, the command acts on the link itself if it operates on files, and on the target of the link if it operates on file contents. This goes for directories too. This is usually the right thing. Some commands have options to treat symbolic links differently, for example ls -L, cp -d, rsync -l. Whatever you're trying to do, it's far more likely that symlinks are the right tool than bind mounts.
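To make the "link itself versus file contents" distinction concrete, here is a minimal sketch. The function name demo_link_semantics is made up for illustration, and it assumes mktemp -d is available:

```shell
# Hypothetical demo: cat works on contents (follows the link),
# mv works on the file entry (renames the link itself).
demo_link_semantics() {
    dir=$(mktemp -d) || return
    echo data > "$dir/target"
    ln -s target "$dir/link"
    cat "$dir/link"                    # reads the target's contents
    mv "$dir/link" "$dir/renamed"      # renames the link, not the target
    [ -L "$dir/renamed" ] && echo "still a symlink"
    [ -f "$dir/target" ] && echo "target untouched"
    rm -rf "$dir"
}
```

Calling demo_link_semantics prints the file contents first, then the two status lines.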
What you're asking for doesn't make much sense in the general case, so it's not surprising that find has no provision for it.
A symlink with a relative target is relative to the path of the symlink. So for instance, if by traversing a directory by following symlinks, find encounters a/b/c/d and a, a/b, a/b/c are all relative or absolute symlinks (or symlinks to paths with symlink components), what should it do?
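As a rough illustration of "relative to the path of the symlink", the sketch below resolves a single level of a link by hand. resolve_one_level is a hypothetical helper, not a standard tool, and it deliberately does not normalise .. components or follow further links:

```shell
# Hypothetical helper: show where one symlink points, one level only.
# A relative target is interpreted from the directory holding the link.
resolve_one_level() {
    link=$1
    target=$(readlink -- "$link") || return
    case $target in
        /*) printf '%s\n' "$target" ;;                            # absolute target
        *)  printf '%s/%s\n' "$(dirname -- "$link")" "$target" ;; # relative target
    esac
}
```

For a link a/b/rel pointing to ../x, this prints a/b/../x, whatever the current directory is. Each intermediate component may itself be a symlink, which is why a full answer needs something like readlink -f or realpath.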
If you're looking for a find predicate or a GNU -printf % directive that expands to a symlink-free path to the file relative to the current directory or any directory, I'm afraid there's none.
If you're on Linux, you can get the absolute path of those files with:
find -L foo -type f -exec readlink -f {} \;
As you found out, there exists at least one realpath command which accepts more than one path argument. In combination with the standard -exec cmd {} + syntax, that is going to be a lot more efficient, since it runs as few realpath commands as necessary:
find -L foo -type f -exec realpath {} +
find -L foo -type f -print0 | xargs -r0 realpath
might be quicker: if more than one realpath command is needed, find can keep on looking for more files while the first realpath starts working, which even on a single-processor system might make it more efficient.
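The difference between -exec ... \; and -exec ... {} + can be observed by counting how many times the command is started. This is only a sketch: the two function names are invented here, and sh -c 'echo run' stands in for realpath so that each invocation leaves exactly one line of output:

```shell
# Hypothetical helpers: count command invocations for each -exec form.
count_runs_each() {
    # one sh process per file found
    find "$1" -type f -exec sh -c 'echo run' sh {} \; | wc -l
}
count_runs_batched() {
    # as few sh processes as the argument-length limit allows
    find "$1" -type f -exec sh -c 'echo run' sh {} + | wc -l
}
```

On a directory with a handful of files, the first count equals the number of files while the second is normally 1.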
-print0 and xargs -r0 are not standard; they come from GNU but are found in a number of other implementations, like most modern BSDs.
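The reason to bother with the NUL-delimited form is that a newline is a legal character in a file name. A small sketch of the problem, assuming a find that supports -print0 (the function names are made up for this example):

```shell
# Hypothetical helpers: count files by newline-delimited versus
# NUL-delimited output.
count_by_newline() {
    find "$1" -type f -print | wc -l      # miscounts names containing newlines
}
count_by_nul() {
    find "$1" -type f -print0 | tr -cd '\0' | wc -c   # one NUL per file
}
```

With one ordinary file and one file whose name contains a newline, count_by_newline reports 3 while count_by_nul reports 2.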
Zsh has builtin support for it:
print -rl foo/***/*(-.:A)
If you don't care about the sorting order, you can disable sorting and make it a bit more efficient with:
print -rl foo/***/*(-.oN:A)
If you want to convert those to relative paths to the current directory, you could have a look at that SO question.
If you know that all those files have an absolute canonical path (none of whose components are symlinks) inside the current directory, you can simplify it to (still with zsh):
files=(foo/***/*(-.:A))
print -rl -- ${files#$PWD/}
Though this is short and convenient, and works whatever characters the filenames contain, I doubt it would be faster than find + realpath.
With the Debian realpath and GNU tools, you can do:
cd -P .
find -L foo -type f -exec realpath -z {} + |
gawk -v p="$PWD" -v l="${#PWD}" -v RS='\0' -vORS='\0' '
substr($0, 1, l+1) == p "/" {$0 = substr($0, l+2)}; 1' |
xargs -r0 whatever you want to do with them
As I realise now, there is a realpath in recent versions of GNU coreutils which has the exact feature you're looking for, so it's just a matter of
find -L foo -type f -print0 |
xargs -r0 realpath -z --relative-base . |
xargs -r0 whatever you want to do with them
(use --relative-to . instead of --relative-base . if you want relative paths even for files whose symlink-free path doesn't reside below the current working directory).
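The same --relative-base behaviour can be turned towards the question of symlinks that point outside a tree: a resolved path that stays absolute is one that does not lie below the base. The following is a sketch only. outside_links is a hypothetical name, it assumes GNU realpath (for -m and --relative-base), and the read loop assumes file names without newlines:

```shell
# Hypothetical helper: list symlinks under DIR whose resolved target
# is not below DIR.
outside_links() {
    root=$(cd -P -- "$1" && pwd) || return   # symlink-free path of the tree
    find "$root" -type l | while IFS= read -r link; do
        # -m: do not fail on dangling links
        resolved=$(realpath -m --relative-base="$root" -- "$link")
        case $resolved in
            /*) printf '%s -> %s\n' "$link" "$resolved" ;;  # stayed absolute: outside
        esac
    done
}
```

For example, outside_links A prints one "link -> resolved path" line per offending symlink. Note that a link with an absolute target inside the tree is not reported, although it would also break once the tree is moved.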
Best Answer
You want a program called realpath, used in conjunction with find. E.g.: