Deleting Broken Symbolic Links – Pros and Cons

I was running a script that iterated over all the files on my Linux system and created some metadata about them, and it threw an error when it hit a broken symbolic link.

I am newish to *nix, but I get the main idea behind linking files and how broken links come to exist. As far as I know, they are the equivalent of litter in the street: things left behind because a program I removed wasn't smart enough to tell the package manager that they existed and belonged to it, or leftovers from an upgrade. At first I started to tweak my script to skip them, then I thought, 'well, we could always delete them while we're down here…'

I'm running Ubuntu 14.04 (Trusty Tahr). I can't see any reason not to, but before I go ahead and run this over my development system, is there any reason this might actually be a terrible idea? Do broken symlinks serve some purpose I am not aware of?

Best Answer

There are many reasons for broken symbolic links:

A link was created to a target which no longer exists.
Resolution: remove the broken symlink.
A link was created for a target which has since been moved, or it's a relative link that has itself been moved relative to its target. (This is not to imply that relative symlinks are a bad idea. Quite the opposite: absolute symlinks are more prone to going stale because their target moved.)
Resolution: find the intended target and fix the link.
There was a mistake when creating the link.
Resolution: find the intended target and fix the link.
The link is to a file which is on a removable disk, network filesystem or other storage area which is not currently mounted.
Resolution: none, the link isn't broken all the time. The link will work when the storage area is mounted.
The link is to a file which exists only some of the time, by design. For example, the file is the cached output of a process, which is deleted when the information goes stale but only re-created upon explicit request. Or the link is to an inbox which is deleted when empty. Or the link is to a device file which is only present when the corresponding peripheral is attached.
Resolution: none, the link isn't broken all the time.
The link is only valid in a different storage hierarchy. For example, it is valid only in a chroot jail, or it's exported by an NFS server and only valid on the server or on some of its clients.
Resolution: none, the link isn't broken everywhere.
The link is broken for you, because you lack the permission to traverse a directory to reach the target, but it isn't broken for users with appropriate privilege.
Resolution: none, the link isn't broken for everybody.
The link is used to store information, as in the Firefox lock example cited by vinc17. One reason to do it this way is that a symlink is easy to populate atomically: it is created together with its content in a single operation, and there is no other way to create one. Populating a file atomically is more complex: you need to create the file content under a temporary name, then move it into place, and handle stale temporary files left behind by a crash. Another reason is that some filesystems store short symlink targets directly inside the inode, which makes reading them faster than reading the content of a file.
Resolution: none. In this case, removing the link would be detrimental.
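To make the last category concrete, here is a minimal sketch of a lock that stores its owner in a deliberately dangling symlink. The function names and the `host:pid` format are made up for illustration and are not what Firefox actually does; `-n` assumes GNU or BSD `ln`.

```shell
# acquire_lock LOCKPATH INFO  (hypothetical helper)
# Creates LOCKPATH as a symlink whose "target" is just the text INFO.
# Creating a symlink fails if the name already exists, so only one
# caller can succeed; -n stops ln from descending into an existing
# link that happens to point at a directory.
acquire_lock() {
    ln -sn -- "$2" "$1" 2>/dev/null
}

# lock_owner LOCKPATH  (hypothetical helper)
# Prints the text stored in the link without trying to follow it.
lock_owner() {
    readlink -- "$1"
}
```

A caller might run `acquire_lock "$HOME/.myapp.lock" "$(hostname):$$"`. The resulting link looks broken to any scanner, yet deleting it would release a lock that is still held.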

If you can determine that a symlink falls into the first category, then sure, go ahead and delete it. Otherwise, abstain.
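Deciding which category a link falls into is easier with the stored target in front of you. The following is only a sketch, with `list_broken` as a made-up name: it prints every broken link under a directory together with the target it points to, and deletes nothing.

```shell
# list_broken DIR  (hypothetical helper)
# "test -e" follows symlinks, so it fails for a link whose target is
# missing (or unreachable for the current user). Those links are then
# printed as "link -> stored target".
list_broken() {
    find "$1" -type l ! -exec test -e {} \; -exec sh -c '
        for link do
            printf "%s -> %s\n" "$link" "$(readlink "$link")"
        done' sh {} +
}
```

Reviewing the output of something like `list_broken /usr/local` by hand, and removing only the links whose target is known to be gone for good, is safer than piping the list straight into `rm`.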

A program that traverses directories recursively and cares about file contents should usually ignore broken symbolic links.
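As a rough illustration of that advice, the sketch below walks a tree and skips broken links instead of failing on them. The name `process_tree` is hypothetical, and the byte count stands in for whatever metadata the real script collects.

```shell
# process_tree DIR  (hypothetical helper)
# Prints "<size in bytes> <path>" for every regular file, including
# files reached through a working symlink. Broken links are reported
# on stderr and skipped.
process_tree() {
    find "$1" \( -type f -o -type l \) -exec sh -c '
        for path do
            if [ -L "$path" ] && [ ! -e "$path" ]; then
                printf "skipped broken link: %s\n" "$path" >&2
                continue
            fi
            [ -f "$path" ] || continue   # e.g. a link to a directory
            printf "%s %s\n" "$(($(wc -c < "$path")))" "$path"
        done' sh {} +
}
```

The check `[ -L "$path" ] && [ ! -e "$path" ]` works because `-L` examines the link itself while `-e` follows it.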

Related Solutions

Shell – automate the process of fixing all broken links

The first problem is that your find command will only find links that used full paths, not relative ones. To illustrate:

$ ln -s /home/terdon/foo/NonExistantFile foo
$ ln -s NonExistantFile bar

$ tree
.
|-- bar -> NonExistantFile
`-- foo -> /home/terdon/foo/NonExistantFile

In the example above, I created two broken links. The first used an absolute path and the second, a relative one. If I now try your find command (having it echo the relinking command instead of running it so we can see what it's doing), only one of the two will be found:

$ find . -lname '/home/terdon/*' -exec \
    sh -c 'echo ln -snf "/home$(readlink "$0")" "$0"' {} \; 
ln -snf /home/home/terdon/foo/NonExistantFile ./foo

The second issue is that your path is wrong. You are recreating links as "/home$(readlink "$0")" "$0". For these links, readlink already prints a full path, so prepending /home to it results in /home/home/..., which is not what you want.

More importantly, what you are attempting is not possible. If a link is broken, that means that its target does not exist. Since the target doesn't exist, you can't simply relink the file: there's nowhere to link it to. The only thing you could do is recreate the link's target. This, however, is unlikely to be very useful, since it would simply make your broken links point to new, empty files. If that is indeed what you want to do, you could try

 find . -xtype l -execdir sh -c 'touch -- "$(readlink "$1")"' sh {} \;

Finally, you might want to create a more complex script that i) finds all broken links ii) searches your machine for files with the same name as the target of the link iii) presents you with a list of them and iv) asks you which one it should now link to.
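Steps i) to iii) could be sketched roughly as follows. The function name and its two arguments are made up for illustration, and step iv) is left to the user. The sketch assumes paths without newlines and target names without glob characters, since `-name` treats its argument as a pattern.

```shell
# suggest_targets LINKS_DIR SEARCH_DIR  (hypothetical helper)
# For every broken link under LINKS_DIR, lists files under SEARCH_DIR
# whose name matches the base name of the link's stored target.
suggest_targets() {
    find "$1" -type l ! -exec test -e {} \; -print |
    while IFS= read -r link; do
        target=$(readlink "$link")
        name=$(basename -- "$target")
        printf '%s (was -> %s)\n' "$link" "$target"
        find "$2" -name "$name" ! -type l -print | sed 's/^/    candidate: /'
    done
}
```

Once a candidate has been chosen by hand, `ln -sfn /path/to/candidate /path/to/link` replaces the stale link.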
