Bash Command for Each File in a Folder – Script Guide

bashcommand line

I have a set of files on which I would like to apply the same command and the output should contain the same name as the processed file but with a different extension.

Currently I am doing rename /my/data/Andrew.doc to /my/data/Andrew.txt I would like to do this for all the .doc files from the /my/data/ folder and to preserve the name.

I tried several versions but I guess I have something wrong in the syntax as I an new to linux.

Best Answer

There are at least a hundred thousand million different ways of approaching this but here are the top contenders:

The Bash for loop

for f in ./*.doc; do
    # do some stuff here with "$f"
    # remember to quote it or spaces may misbehave
done

Using `find`

The find command has a lovely little exec command that's great for running things (with some caveats). Find is better than basic globbing because you can really filter down on the files you're selecting. Be careful of the odd syntax.

find . -iname '*.doc' -exec echo "File is {}" \;

Note that find is recursive so you might want to use -maxdepth 1 to keep find in current working directory. -type f can be used to filter out regular files.

If we're just renaming doc to txt...

The rename command is sed-like in searching. Obviously this won't do anything to convert the format.

rename 's/doc$/txt/' *.doc

Related Solutions

Ubuntu – Search for duplicate file names within folder hierarchy

FSlint is a versatile duplicate finder that includes a function for finding duplicate names:

FSlint

The FSlint package for Ubuntu emphasizes the graphical interface, but as is explained in the FSlint FAQ a command-line interface is available via the programs in /usr/share/fslint/fslint/. Use the --help option for documentation, e.g.:

$ /usr/share/fslint/fslint/fslint --help
File system lint.
A collection of utilities to find lint on a filesystem.
To get more info on each utility run 'util --help'.

findup -- find DUPlicate files
findnl -- find Name Lint (problems with filenames)
findu8 -- find filenames with invalid utf8 encoding
findbl -- find Bad Links (various problems with symlinks)
findsn -- find Same Name (problems with clashing names)
finded -- find Empty Directories
findid -- find files with dead user IDs
findns -- find Non Stripped executables
findrs -- find Redundant Whitespace in files
findtf -- find Temporary Files
findul -- find possibly Unused Libraries
zipdir -- Reclaim wasted space in ext2 directory entries
$ /usr/share/fslint/fslint/findsn --help
find (files) with duplicate or conflicting names.
Usage: findsn [-A -c -C] [[-r] [-f] paths(s) ...]

If no arguments are supplied the $PATH is searched for any redundant
or conflicting files.

-A reports all aliases (soft and hard links) to files.
If no path(s) specified then the $PATH is searched.

If only path(s) specified then they are checked for duplicate named
files. You can qualify this with -C to ignore case in this search.
Qualifying with -c is more restictive as only files (or directories)
in the same directory whose names differ only in case are reported.
I.E. -c will flag files & directories that will conflict if transfered
to a case insensitive file system. Note if -c or -C specified and
no path(s) specifed the current directory is assumed.

Example usage:

$ /usr/share/fslint/fslint/findsn /usr/share/icons/ > icons-with-duplicate-names.txt
$ head icons-with-duplicate-names.txt 
-rw-r--r-- 1 root root    683 2011-04-15 10:31 Humanity-Dark/AUTHORS
-rw-r--r-- 1 root root    683 2011-04-15 10:31 Humanity/AUTHORS
-rw-r--r-- 1 root root  17992 2011-04-15 10:31 Humanity-Dark/COPYING
-rw-r--r-- 1 root root  17992 2011-04-15 10:31 Humanity/COPYING
-rw-r--r-- 1 root root   4776 2011-03-29 08:57 Faenza/apps/16/DC++.xpm
-rw-r--r-- 1 root root   3816 2011-03-29 08:57 Faenza/apps/22/DC++.xpm
-rw-r--r-- 1 root root   4008 2011-03-29 08:57 Faenza/apps/24/DC++.xpm
-rw-r--r-- 1 root root   4456 2011-03-29 08:57 Faenza/apps/32/DC++.xpm
-rw-r--r-- 1 root root   7336 2011-03-29 08:57 Faenza/apps/48/DC++.xpm
-rw-r--r-- 1 root root    918 2011-03-29 09:03 Faenza/apps/16/Thunar.png

Ubuntu – For loop with file names

If the files are all in the same dir, you can:

ls |
awk -F_ '{ i=$1; m=$2; s=$3; f[i"_"s] = f[i"_"s] " " $0 }
         END{ for(insc in f)
                printf "paste%s >out_%s.txt\n",f[insc],insc
         }'

which splits the filename on "_" (-F_), sets the variables i,m,s to the first 3 parts of the filename (institute,model,scenario), and accumulates in array f the filename. The array is indexed only by the institute and scenario, so all the models are concatenated (m isn't used). The final END prints the f array, and uses the index (institute_scenario) as the name for the output file. With your examples this produces

paste wbm_gfdl_rcp8p5_mississippi.txt wbm_hadgem_rcp8p5_mississippi.txt >out_wbm_rcp8p5.txt
paste matsiro_hadgem_rcp4p5_mississippi.txt matsiro_ipsl_rcp4p5_mississippi.txt >out_matsiro_rcp4p5.txt
paste matsiro_gfdl_rcp8p5_mississippi.txt matsiro_miroc_rcp8p5_mississippi.txt >out_matsiro_rcp8p5.txt

You then need to pipe this into the shell to have it executed. Add | sh to the last line above to do this.

To remove some columns from the input files, you need to alter the awk line that is collecting all the input filenames. In the 1st awk line:

{ i=$1; m=$2; s=$3; f[i"_"s] = f[i"_"s] " " $0 }

the filename is the "$0". For example, if you change this line into:

{ i=$1; m=$2; s=$3; f[i"_"s] = f[i"_"s] sprintf(" <(cut -f4 %s)",$0) }

then you will get the example output:

paste <(cut -f4 wbm_gfdl_rcp8p5_mississippi.txt) <(cut -f4 wbm_hadgem_rcp8p5_mississippi.txt) >out_wbm_rcp8p5.txt

but if you want to cut only the 2nd filename, it is a bit more complicated and you need this instead:

{ i=$1; m=$2; s=$3; 
  if(f[i"_"s]=="")add = $0; else add = sprintf("<(cut -f4 %s)",$0);
  f[i"_"s] = f[i"_"s] " " add }

so you will get

paste wbm_gfdl_rcp8p5_mississippi.txt <(cut -f4 wbm_hadgem_rcp8p5_mississippi.txt) >out_wbm_rcp8p5.txt

If sh does not understand the syntax <(cut ...) then replace it by bash.