Ubuntu – Copy files of same name into same directory

bash, command line, copy, files

I have a large number of folders, each containing a file named genome.txt, that I need to cp into a single folder. I am trying to work out how to do this so that the copies are named genome1.txt, genome2.txt, etc. I am looking for a simple solution.

Best Answer

Shell script and its usage

This can be done with a simple shell script built on the find ... | while read var ; do ... done structure, which is very common when dealing with filenames. The bash script below searches from the current (top-most) directory and takes a single argument, the destination directory.

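#!/bin/bash
# Usage: ./move_enumerated.sh DESTINATION
# Prints a "source destination" pair for every genome.txt below the current directory.
find -name "genome.txt" -print0 | while IFS= read -r -d '' path
do
    base=$(basename --suffix=".txt" "$path")
    # counter is unset on the first pass, so the first file keeps its plain name
    new_path="${1%/}/${base}${counter}.txt"
    echo "$path" "$new_path"
    counter=$(( counter + 1 ))
done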

Note: the script uses echo for testing purposes only. When you are satisfied with the resulting paths, replace echo with mv to move the files or with cp to copy them.

Example:

bash-4.3$ tree
.
├── destination
├── dir1
│   └── genome.txt
├── dir2
│   └── genome.txt
├── dir3
│   └── genome.txt
└── move_enumerated.sh

4 directories, 4 files
bash-4.3$ ./move_enumerated.sh  ./destination
./dir2/genome.txt ./destination/genome.txt
./dir3/genome.txt ./destination/genome1.txt
./dir1/genome.txt ./destination/genome2.txt
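
The question asked for names starting at genome1.txt, whereas the run above leaves the first file as plain genome.txt. A minimal sketch of one way to get that numbering, assuming the counter is simply initialised to 1 and the files are copied rather than echoed. The function name copy_enumerated and its argument order are hypothetical, and the numbering follows whatever order find happens to return:

```bash
#!/bin/bash
# copy_enumerated NAME TOP DEST   (hypothetical helper, not from the answer above)
# Copies every file called NAME found below TOP into DEST as
# <stem>1.<ext>, <stem>2.<ext>, ... Assumes NAME has exactly one extension.
copy_enumerated() {
    local name=$1 top=$2 dest=${3%/}
    local stem=${name%.*} ext=${name##*.}
    local counter=1 path            # start at 1 instead of leaving counter unset
    mkdir -p -- "$dest"
    find "$top" -type f -name "$name" -print0 | while IFS= read -r -d '' path
    do
        cp -- "$path" "$dest/$stem$counter.$ext"
        counter=$(( counter + 1 ))
    done
}

# Example call: copy_enumerated genome.txt . ./destination
```

Because the copies are named genome1.txt and so on, they no longer match the -name pattern, so the destination can safely live below the directory being searched.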

Improving the script for more flexibility

This script can be made more general by letting the user specify the filename, the top directory to traverse, and the destination as command-line arguments:

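#!/bin/bash
# Usage: ./move_enumerated.sh FILENAME TOP_DIRECTORY DESTINATION
find "$2" -name "$1" -print0 | while IFS= read -r -d '' path
do
    base=$(basename --suffix=".txt" "$path")
    # counter is unset on the first pass, so the first file keeps its plain name
    new_path="${3%/}/${base}${counter}.txt"
    echo "$path" "$new_path"
    counter=$(( counter + 1 ))
done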

Test run:

bash-4.3$ ./move_enumerated.sh "genome.txt"  "./testdir" "./testdir/destination"
./testdir/dir2/genome.txt ./testdir/destination/genome.txt
./testdir/dir3/genome.txt ./testdir/destination/genome1.txt
./testdir/dir1/genome.txt ./testdir/destination/genome2.txt

Syntax and theory of operation

Overall, the script makes use of the command | while read variable ; do ... done structure. This is a very common approach, frequently used to avoid parsing ls output and to cope with difficult filenames that can break scripts.

On the left side of the pipe we have the find command, which takes a directory as its argument (if none is given, GNU find, the version used on Linux, assumes ., the current working directory). The -name option gives the specific filename we are searching for, and -print0 makes find terminate each result with the non-printable \0 character. This is frequently used to avoid splitting on newlines or other characters, because those characters can appear within a filename and break the script.
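
To see why the delimiter matters, here is a small sketch that creates one file with a newline in its name and counts the records find emits in each mode. The scratch directory and the filename are made up for the demonstration:

```bash
#!/bin/bash
# Scratch directory holding a single file whose name contains a newline.
scratch=$(mktemp -d)
touch "$scratch/genome"$'\n'"copy.txt"

# Newline-delimited output: the one file is spread over two lines.
lines=$(find "$scratch" -type f | wc -l)

# NUL-delimited output: keep only the \0 terminators and count them, one per file.
nuls=$(find "$scratch" -type f -print0 | tr -cd '\0' | wc -c)

echo "newline-delimited records: $lines"
echo "NUL-delimited records: $nuls"
rm -rf "$scratch"
```

A loop that reads line by line would therefore see two bogus paths, while a loop that reads up to each \0 sees the one real path.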

On the right side of the pipe we have the while IFS= read -r -d '' ; do ... done structure. A while loop with the read shell built-in is frequently used to read stdin, which in this case comes from the pipe. IFS=, -r, and -d '' are all needed to receive filenames safely: IFS= keeps leading and trailing whitespace intact, -r stops backslashes from being treated as escape characters, and -d '' tells read that each item is delimited by \0.
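
The effect of those three settings can be checked in isolation. The sketch below feeds the same awkward, \0-terminated string to read twice, once with the safe options and once without. The sample string is invented for illustration, and process substitution is used only so that the variables survive for inspection afterwards:

```bash
#!/bin/bash
# A record with leading spaces, a backslash and an embedded newline.
record='  odd\name'$'\n''.txt'

# Safe form: the record arrives byte for byte.
IFS= read -r -d '' safe < <(printf '%s\0' "$record")

# Plain read: leading spaces are trimmed, the backslash is consumed,
# and reading stops at the first newline.
read plain < <(printf '%s\0' "$record")

printf 'safe:  %q\n' "$safe"
printf 'plain: %q\n' "$plain"
```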

The rest of the script is fairly easy. We extract the base name of the file using the basename command. Since in this case we are dealing with a known extension and expect a single dot in the filename, we can use --suffix=".txt" to strip that part, leaving genome. We then build the new path by joining the destination, the base name, and the counter variable.

Notice that we use parameter expansion on the destination argument ("${1%/}" in the first script, "${3%/}" in the improved one). This ensures that, regardless of whether the user added a trailing / on the command line (./destination or ./destination/), we keep only the bare directory name and join it to the base name with a single /. Notice also that the counter variable is not set initially, so the first filename we receive will be plain genome.txt. After that, the increment creates the counter variable, and its value shows up in every subsequent filename.
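
The path-building step can be tried on its own. In this sketch, build_path is a hypothetical helper that takes the destination, the source path, and the counter value as arguments; it relies on GNU basename for --suffix, as the scripts above do:

```bash
#!/bin/bash
# build_path DEST PATH COUNTER   (hypothetical helper mirroring the script's logic)
build_path() {
    local dest=$1 path=$2 counter=$3
    local base
    base=$(basename --suffix=".txt" "$path")   # drop the directory part and .txt
    printf '%s\n' "${dest%/}/$base$counter.txt"
}

build_path ./destination  ./dir2/genome.txt ""   # empty counter: ./destination/genome.txt
build_path ./destination/ ./dir3/genome.txt 1    # trailing slash dropped: ./destination/genome1.txt
```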

For more information, please read Filenames and Pathnames in Shell: How to do it Correctly.

Related Solutions

Ubuntu – Copy files from a directory to a sub-directory (excluding the sub-directory itself)

Assuming you are using bash as your interactive shell, you can enable extglob which allows you to specify "all files except these ones".

shopt -s extglob
cd Parent
cp !(Child1) Child1/
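
A throwaway run of the same idea, assuming a scratch Parent directory that contains only regular files besides Child1 (the file names a.txt and b.txt are made up). Note that in a script extglob has to be switched on before the line containing !(...) is parsed, and that cp without -r skips any other subdirectories it is given:

```bash
#!/bin/bash
shopt -s extglob                  # enable before any !(...) pattern is parsed

parent=$(mktemp -d)               # stands in for Parent
mkdir "$parent/Child1"
touch "$parent/a.txt" "$parent/b.txt"

# Copy everything in the parent except Child1 itself into Child1.
( cd "$parent" && cp -- !(Child1) Child1/ )

copied=( "$parent"/Child1/* )     # what ended up inside Child1
printf '%s\n' "${copied[@]##*/}"
rm -rf "$parent"
```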

Ubuntu – How to copy files with common names and paste them into another folder

I assume you mean you have a structure something like:

├── f1
│   ├── a1
│   ├── a2
│   ├── b1
│   ├── b2
│   ├── c1
│   ├── c2
├── f2
│   ├── a3
│   ├── a4
│   ├── b3
│   ├── b4
│   ├── c3
│   ├── c4

and you want to end up with a directory like this:

a-files
├── a1
├── a2
├── a3
└── a4

Assuming:

the current working directory is the parent directory of all the directories f1 f2 f3

You could do:

mkdir a-files
for files in f*/a* ; do cp "$files" a-files ; done

to copy all files starting with a to a new directory a-files from all directories starting with f. You can repeat for files starting with b...

mkdir b-files
for files in f*/b* ; do cp "$files" b-files ; done

Note: if there are any duplicate filenames, each file written to the new directory will overwrite another with the same name, so at the end of the loop, the new directory would only have a copy of the last file to be written with that name. You could use the -n flag to cp to prevent overwriting, and then you would get the first file with that name instead of the last one:

for files in f*/a* ; do cp -n "$files" a-files ; done
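
The difference between the two loops can be checked with a small scratch setup. This is only a sketch: the directory names out-last and out-first are invented, and some recent versions of GNU cp print a warning or return a non-zero status when -n skips a file, which the loop ignores:

```bash
#!/bin/bash
work=$(mktemp -d)
mkdir "$work/f1" "$work/f2" "$work/out-last" "$work/out-first"
echo "from f1" > "$work/f1/a1"
echo "from f2" > "$work/f2/a1"

# Plain cp: the later copy overwrites the earlier one.
for file in "$work"/f*/a* ; do cp "$file" "$work/out-last" ; done

# cp -n: an existing destination file is left alone.
for file in "$work"/f*/a* ; do cp -n "$file" "$work/out-first" ; done

last=$(cat "$work/out-last/a1")
first=$(cat "$work/out-first/a1")
echo "without -n: $last"     # without -n: from f2
echo "with -n:    $first"    # with -n:    from f1
rm -rf "$work"
```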