Bash – copy files two directories up and rename them

bash, file-copy, macintosh, rename

I am trying to copy multiple files named "F3.bam" two directory levels up, and then rename each copy by prefixing it with the name of the sub-directory it was copied into.

For example:

/samples/mydata1/RUN1/ID_date/PCR2/TIME1/F3.bam
/samples/mydata2/RUN1/ID2_date4/PCR2/TIME7/F3.bam
/samples/mydataxxx/RUN1/IDxxx_datexxx/PCR2/TIMExxx/F3.bam

Expected results:

1. The files are first copied two directory levels up:

/samples/mydata1/RUN1/ID_date/F3.bam
/samples/mydata2/RUN1/ID2_date4/F3.bam
/samples/mydataxxx/RUN1/IDxxx_datexxx/F3.bam

2. The files are renamed according to the name of the current sub-directory:

/samples/mydata1/RUN1/ID_date/ID_date_F3.bam
/samples/mydata2/RUN1/ID2_date4/ID2_date4_F3.bam
/samples/mydataxxx/RUN1/IDxxx_datexxx/IDxxx_datexxx_F3.bam

Ideally a bash loop would be great (working on a Mac).

Best Answer

Here's the TLDR version of my solution: you can use the dirname and basename commands along with command substitution to construct the target path for your copy command.
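As a quick illustration of those two commands, here is a minimal sketch that takes one of the sample paths from the question apart. The variable names are only for illustration.

```shell
#!/bin/bash

# Sample path taken from the question.
path="/samples/mydata1/RUN1/ID_date/PCR2/TIME1/F3.bam"

# basename keeps only the last path component; dirname strips it off.
file_name="$(basename "$path")"
parent_dir="$(dirname "$path")"
grandparent_dir="$(dirname "$parent_dir")"

echo "$file_name"        # F3.bam
echo "$parent_dir"       # /samples/mydata1/RUN1/ID_date/PCR2/TIME1
echo "$grandparent_dir"  # /samples/mydata1/RUN1/ID_date/PCR2
```

Each $(...) runs the command inside it and substitutes its output, so the calls can be nested to walk up as many levels as needed.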

A longer explanation follows.


Here is a (super verbose) script that does roughly what you want using a Bash loop:

#!/bin/bash

# copy_and_rename.bash
#
#   Copy multiple files 2 folders up and rename these files
#   to contain their parent directory as a prefix.
#

# Iterate over the file paths given as arguments. Quoting "$@" and every
# expansion below keeps paths that contain spaces intact.
for _file_path in "$@"; do

    # Get the file name
    _file_name="$(basename "${_file_path}")"

    echo "${_file_name}"

    # Get the path to the target directory (two levels above the file)
    _target_directory_path="$(dirname "$(dirname "${_file_path}")")"

    echo "${_target_directory_path}"

    # Get the parent directory of the target directory
    _parent_directory_path="$(dirname "${_target_directory_path}")"

    echo "${_parent_directory_path}"

    # Get the name of the parent directory
    _parent_directory_name="$(basename "${_parent_directory_path}")"

    echo "${_parent_directory_name}"

    # Construct the new file path
    _new_file_path="${_target_directory_path}/${_parent_directory_name}_${_file_name}"

    echo "${_new_file_path}"

    # Copy and rename the file (-i asks before overwriting an existing file)
    echo "cp -i \"${_file_path}\" \"${_new_file_path}\""
    cp -i "${_file_path}" "${_new_file_path}"
    echo
done

You can obviously compress this a lot, but I kept it this way for explanatory value.

Here is what the preceding script looks like without any comments or superfluous variables or echo statements:

for _file_path in "$@"; do
    cp -i "${_file_path}" \
    "$(dirname "$(dirname "${_file_path}")")/$(basename "$(dirname "$(dirname "$(dirname "${_file_path}")")")")_$(basename "${_file_path}")"
done

Both versions are fragile and do very little error handling. The verbose version keeps its echo statements for debugging, so you can see what it is doing and sanity-check it when you run it for the first time.
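Note that, with the sample paths from the question, the script above writes each copy into the PCR2 directory. If the copy should instead land in the ID_date directory, as in the question's expected results, one more dirname call is needed. Here is a minimal sketch of that variant; new_path_for is a hypothetical helper name, not part of the script above.

```shell
#!/bin/bash

# Hypothetical helper: print the destination path for one source file.
# The destination directory is two levels above the file's own directory,
# and the new file name is prefixed with that directory's name.
new_path_for() {
    local file_path="$1"
    local file_name target_dir
    file_name="$(basename "$file_path")"
    # dirname once: .../PCR2/TIME1, twice: .../PCR2, three times: .../ID_date
    target_dir="$(dirname "$(dirname "$(dirname "$file_path")")")"
    printf '%s/%s_%s\n' "$target_dir" "$(basename "$target_dir")" "$file_name"
}

# Copy each path given on the command line; -i asks before overwriting.
for file_path in "$@"; do
    cp -i "$file_path" "$(new_path_for "$file_path")"
done
```

For /samples/mydata1/RUN1/ID_date/PCR2/TIME1/F3.bam the helper prints /samples/mydata1/RUN1/ID_date/ID_date_F3.bam.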

To test it I created your files by using the following script, which I include here in case you find it useful for further testing:

#!/bin/bash

# create_test_files.bash

# Set internal field separator to handle spaces in file names
IFS=$'\n'

# Choose a prefix for the file paths
_prefix="/tmp"

# Create array of sample files
_sample_files=(
    "/samples/mydata1/RUN1/ID_date/PCR2/TIME1/F3.bam"
    "/samples/mydata2/RUN1/ID2_date4/PCR2/TIME7/F3.bam"
    "/samples/mydataxxx/RUN1/IDxxx_datexxx/PCR2/TIMExxx/F3.bam"
)

# Create directories and files
for _file in "${_sample_files[@]}"; do

    # Add the prefix to the path
    _path="${_prefix}${_file}"

    # Create parent directory
    mkdir -p "$(dirname "${_path}")"

    # Create file
    touch "${_path}"
done

I check that the files were created properly by using the find command:

$ find /tmp/samples -type f

/tmp/samples/mydata1/RUN1/ID_date/PCR2/TIME1/F3.bam
/tmp/samples/mydata2/RUN1/ID2_date4/PCR2/TIME7/F3.bam
/tmp/samples/mydataxxx/RUN1/IDxxx_datexxx/PCR2/TIMExxx/F3.bam

Then I invoke the script like this:

bash copy_and_rename.bash \
/tmp/samples/mydata1/RUN1/ID_date/PCR2/TIME1/F3.bam \
/tmp/samples/mydata2/RUN1/ID2_date4/PCR2/TIME7/F3.bam \
/tmp/samples/mydataxxx/RUN1/IDxxx_datexxx/PCR2/TIMExxx/F3.bam
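Listing every path by hand gets tedious with many samples. As an alternative, find could collect the files and hand them to an inline loop. This is only a sketch: copy_all_f3 is a hypothetical function name, and it assumes every file to copy is named exactly F3.bam. It applies the same naming rule as copy_and_rename.bash.

```shell
#!/bin/bash

# Hypothetical helper: copy every F3.bam found under the directory given as
# the first argument, using the same naming rule as copy_and_rename.bash.
copy_all_f3() {
    # find passes the matching paths to the inline script as "$@".
    # Because find runs the script directly, cp -i can still prompt on
    # the terminal when a destination already exists.
    find "$1" -type f -name 'F3.bam' -exec bash -c '
        for file_path in "$@"; do
            target_dir=$(dirname "$(dirname "$file_path")")
            prefix=$(basename "$(dirname "$target_dir")")
            cp -i "$file_path" "${target_dir}/${prefix}_$(basename "$file_path")"
        done
    ' bash {} +
}
```

With the test files from above, it would be called as copy_all_f3 /tmp/samples.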

And then I check that the script worked by using find again:

$ find /tmp/samples -type f

/tmp/samples/mydata1/RUN1/ID_date/PCR2/ID_date_F3.bam
/tmp/samples/mydata1/RUN1/ID_date/PCR2/TIME1/F3.bam
/tmp/samples/mydata2/RUN1/ID2_date4/PCR2/ID2_date4_F3.bam
/tmp/samples/mydata2/RUN1/ID2_date4/PCR2/TIME7/F3.bam
/tmp/samples/mydataxxx/RUN1/IDxxx_datexxx/PCR2/IDxxx_datexxx_F3.bam
/tmp/samples/mydataxxx/RUN1/IDxxx_datexxx/PCR2/TIMExxx/F3.bam

Finally, I delete all of the test files, also using find (this removes only the files and leaves the now-empty directories in place):

find /tmp/samples -type f -exec rm {} \;