Shell – Triple dot wildcards

directory file-copy recursive shell wildcards

I have a very large directory structure of source code that is very difficult to work with. I would like to run a tool that transforms it into a Maven-like structure that works better for me. When I have finished my work, I would run the tool again to transform the Maven structure back into the original awful structure. I am very familiar with various shells and, yes, I could write a script containing hundreds of cp commands, but that would be hard to maintain.

I want to be able to move or copy files using something like Perforce's triple-dot wildcard (...), which:

Matches all files under the current working directory and all subdirectories.
(matches anything, including slashes, and does so across subdirectories)

My script would then contain commands like:

cp src/.../foobar/.../*.java trusted/src/main/java/.../foobar/.../*.java

The idea is to move a subtree of the directory while maintaining the structure of that subtree.

Any ideas?

I am having issues with Gilles' rsync solution. Here is a test script:

#!/bin/bash

rm -rf source

mkdir -p source/server/src/com/bodhi/foobar/this
mkdir -p source/server/src/com/bodhi/foobar/that
mkdir -p source/server/src/com/bodhi/other

echo "Hello World" > source/server/src/com/bodhi/foobar/this/A.java
echo "Hello World" > source/server/src/com/bodhi/foobar/that/B.java
echo "Hello World" > source/server/src/com/bodhi/other/C.java

rm -rf target
mkdir -p target/foobar/src/main/java

rsync \
  --include='**/foobar/**/*.java' \
  --include='**/foobar/**/' \
  --exclude='*' \
  --prune-empty-dirs \
  source/server/src/ target/foobar/src/main/java/

Best Answer

This wildcard exists in ksh93, bash ≥ 4.3 (≥ 4.0 if there are no symbolic links to directories in your tree) and zsh. It's spelled **. In ksh93, it needs to be activated first with set -o globstar. In bash, it needs to be activated first with shopt -s globstar.

ls -l src/**/foobar/**/*.java

This isn't enough to make the copy, though. The target of cp is a single directory, and cp doesn't do any wildcard matching itself: the shell expands the patterns before cp ever runs. You can't use a single cp command to drop files in different places.
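One cp per file does work, though. The following is a minimal sketch of that idea, not part of the original answer: it assumes bash ≥ 4.0 (for globstar), and the function name copy_foobar_java and its two arguments (source root, destination root) are hypothetical names chosen for illustration. It loops over the ** matches and recreates each file's relative path under the destination.

```shell
#!/bin/bash
# Hypothetical helper (an illustration, not from the original answer):
# copy every *.java file located below a "foobar" directory under $1
# into $2, keeping each file's path relative to $1.
copy_foobar_java() {
    local src=${1%/} dest=${2%/} f rel
    # Run in a subshell so the caller's shell options are left untouched.
    (
        shopt -s globstar nullglob   # enable **; expand to nothing if no match
        for f in "$src"/**/foobar/**/*.java; do
            rel=${f#"$src"/}                     # path relative to the source root
            mkdir -p "$dest/$(dirname "$rel")"   # recreate the leading directories
            cp -p "$f" "$dest/$rel"
        done
    )
}
```

Under those assumptions, copy_foobar_java src trusted/src/main/java would do the forward copy, and swapping the two arguments would copy the files back, since the relative paths are the same on both sides.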

You can use rsync instead. Pass it the root of the source tree and the root of the destination tree, and define include and exclude rules to copy only the files you want and the directories leading to them. Rsync also copies directories that match the rules even when nothing inside them is copied; add --prune-empty-dirs to leave such empty directories out of the transfer.

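# -a makes rsync recurse (and preserve attributes);
# without -r or -a, rsync skips directories entirely.
rsync -a --include='**/foobar/**/*.java' --include='**/' \
      --exclude='*' --prune-empty-dirs \
      src/ trusted/src/main/java/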

Another tool you can use is pax. Pax is standard in the sense that it is defined by POSIX (unlike rsync), but some Linux distributions omit it from the default installation (it is generally available as a package, however). The approach is similar to rsync's: include .java files and exclude the rest. The syntax is a bit stranger: each -s option specifies a pattern replacement, which can be the original name (&) to include a file without renaming it, or an empty replacement to exclude a file. The expressions are tried in order and the first one that matches wins, so the .java rule must come before the catch-all. Leading directories are automatically created in the destination as necessary.

pax -rw -s '/\.java$/&/' -s '/.*//' src/* trusted/src/main/java/