Bash script to rename files from a text file source

bash rename scripting shell

I'm fairly new to bash; I can just about perform simple administrative tasks with simple commands, one at a time. However, I've been tasked with renaming some files in a directory using a text file as the source for the new names, and I would really appreciate a few pointers, as I am well out of my depth.

Let me explain:

New File Name.xlsx 0.1  000011F4.dat 
New File Name.xlsx 0.2  000011F5.dat 
New File Name.xlsx 0.3  000011F6.dat 
New File Name.xlsx 0.4  000011F7.dat 
New File Name.xlsx 0.5  000011F8.dat 
New File Name.xlsx 0.6  000011F9.dat 

My source text file looks roughly like the above. The first 'column' is the new name for the file, the middle is the version, and the third is the current filename.

I need to rename the .dat files in the directory to the names given in the first column. I also need to prepend the version number (0.1, 0.2, etc.) to each new name.

I have a few questions:

Is it a massive problem that the files have whitespace in them? Would it be better adding " " around each file string?

Basically I have no idea where to start, and any help would be massively appreciated. As you can see, it's slightly more complex than a usual rename, given the need to add the version column to the beginning of the filename and the whitespace in the list.

Best Answer

This ought to work:

sh <(sed -r 's/^\s*(.*)\s+([0-9\.]+)\s+([0-9A-Z]{8}\.dat)\s*$/mv -iv \3 "\2 \1"/' files)

... where files is the name of your source file.

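This passes the output of the sed command to a new instance of sh (the shell) using process substitution, so sh runs the generated lines as a script. Process substitution, the <(...) syntax, is a bash feature, so run the command from bash. The output of the sed command is: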

mv -iv 000011F4.dat "0.1 New File Name.xlsx"
mv -iv 000011F5.dat "0.2 New File Name.xlsx"
mv -iv 000011F6.dat "0.3 New File Name.xlsx"
mv -iv 000011F7.dat "0.4 New File Name.xlsx"
mv -iv 000011F8.dat "0.5 New File Name.xlsx"
mv -iv 000011F9.dat "0.6 New File Name.xlsx"
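Before handing anything to sh, it may be worth looking at the generated commands on their own. The sketch below wraps the same substitution in a function; preview_renames is a made-up helper name, and I have assumed a sed that accepts -E and the POSIX [[:space:]] class in place of -r and \s. Adding -n together with the trailing p flag means only lines that matched the pattern are printed, so a blank or malformed line in the list is dropped rather than passed to sh as a command.

```shell
#!/usr/bin/env bash
# preview_renames is a hypothetical helper, not part of the original answer.
# It prints the mv commands for a list file without running them.
preview_renames() {
    # -n plus the trailing p: print only the lines the pattern matched.
    sed -n -E 's/^[[:space:]]*(.*)[[:space:]]+([0-9.]+)[[:space:]]+([0-9A-Z]{8}\.dat)[[:space:]]*$/mv -iv \3 "\2 \1"/p' "$1"
}
```

Run preview_renames files to check the commands, then sh <(preview_renames files) to execute them. Note that a new name containing ", $, a backtick or a backslash would still be interpreted by sh inside the double quotes, so this is only safe for plain names like the ones in the question.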

Taking the sed command apart, it searches for a pattern:

  • ^ - the beginning of the line
  • \s* - any whitespace at the start
  • (.*) - any characters, as many as possible, so spaces inside the new name are included (the parentheses store the result to \1)
  • \s+ - at least one whitespace character
  • ([0-9\.]+) - at least one of 0-9 and . (stored to \2); inside brackets the . does not need a backslash, so [0-9.]+ would also work
  • \s+ - at least one whitespace character
  • ([0-9A-Z]{8}\.dat) - 8 characters in 0-9 or A-Z, followed by .dat (stored to \3)
  • \s* - any whitespace at the end
  • $ - the end of the line

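... and replaces it with mv -iv \3 "\2 \1", where \1 to \3 are the previously stored values. The double quotes around the new name are what make the whitespace harmless: the shell treats the quoted string as a single argument. The -i option makes mv ask before overwriting an existing file, and -v prints each rename as it happens. You can use something other than a space between the version number and the rest of the filename, if you like.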
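If you would rather not generate shell commands at all, the same pattern can be applied line by line in bash itself. This is only a sketch of an alternative, not the method above: rename_from_list is a made-up name, the regular expression is the sed one rewritten with [[:space:]], and I have swapped mv -i for an explicit "target exists" check, because the loop's standard input is the list file and an interactive prompt would read its answer from there.

```shell
#!/usr/bin/env bash
# rename_from_list is a hypothetical helper, not part of the original answer.
# Usage: rename_from_list files
rename_from_list() {
    local list=$1 line new ver old
    # Same pattern as the sed expression, with POSIX character classes.
    local re='^[[:space:]]*(.*)[[:space:]]+([0-9.]+)[[:space:]]+([0-9A-Z]{8}\.dat)[[:space:]]*$'
    # "|| [[ -n $line ]]" also handles a last line without a trailing newline.
    while IFS= read -r line || [[ -n $line ]]; do
        [[ $line =~ $re ]] || continue      # skip lines that do not fit the layout
        new=${BASH_REMATCH[1]}              # first column: new name
        ver=${BASH_REMATCH[2]}              # second column: version
        old=${BASH_REMATCH[3]}              # third column: current file name
        if [[ -e "$ver $new" ]]; then
            printf 'skipping %s: "%s" already exists\n' "$old" "$ver $new" >&2
            continue
        fi
        # Quoting keeps the whitespace intact; -- ends option parsing.
        mv -v -- "$old" "$ver $new"
    done < "$list"
}
```

Because the names are passed to mv as quoted variables and never re-parsed by a shell, characters such as " or $ in a new name are taken literally here.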

Here's the result:

$ ls -l
total 60
-rw-rw-r-- 1 z z   0 Aug  8 14:15 000011F4.dat
-rw-rw-r-- 1 z z   0 Aug  8 14:15 000011F5.dat
-rw-rw-r-- 1 z z   0 Aug  8 14:15 000011F6.dat
-rw-rw-r-- 1 z z   0 Aug  8 14:15 000011F7.dat
-rw-rw-r-- 1 z z   0 Aug  8 14:15 000011F8.dat
-rw-rw-r-- 1 z z   0 Aug  8 14:15 000011F9.dat
-rw-rw-r-- 1 z z 222 Aug  8 13:47 files
$ sh <(sed -r 's/^\s*(.*)\s+([0-9\.]+)\s+([0-9A-Z]{8}\.dat)\s*$/mv -iv \3 "\2 \1"/' files)
`000011F4.dat' -> `0.1 New File Name.xlsx'
`000011F5.dat' -> `0.2 New File Name.xlsx'
`000011F6.dat' -> `0.3 New File Name.xlsx'
`000011F7.dat' -> `0.4 New File Name.xlsx'
`000011F8.dat' -> `0.5 New File Name.xlsx'
`000011F9.dat' -> `0.6 New File Name.xlsx'
$ ls -l
total 60
-rw-rw-r-- 1 z z   0 Aug  8 14:15 0.1 New File Name.xlsx
-rw-rw-r-- 1 z z   0 Aug  8 14:15 0.2 New File Name.xlsx
-rw-rw-r-- 1 z z   0 Aug  8 14:15 0.3 New File Name.xlsx
-rw-rw-r-- 1 z z   0 Aug  8 14:15 0.4 New File Name.xlsx
-rw-rw-r-- 1 z z   0 Aug  8 14:15 0.5 New File Name.xlsx
-rw-rw-r-- 1 z z   0 Aug  8 14:15 0.6 New File Name.xlsx
-rw-rw-r-- 1 z z 222 Aug  8 13:47 files