Ubuntu – Recursive bash script to collect information about each file in a directory structure

bash scripts

How do I work recursively through a directory tree, execute a specific command on each file, and write the path, filename, extension, file size and some other specific text to a single file in bash?

Best Answer

While find solutions are simple and powerful, I decided to create a more complicated solution based on this interesting function, which I saw a few days ago.

  • More explanations and two other scripts, based on the current one, are provided here.
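For comparison, the find approach mentioned above could look roughly like this. It is only a minimal sketch and not part of the walk script: it assumes GNU find (for -printf), and the function name list_files and the tab-separated layout are illustrative choices. File names that contain tabs or newlines are not handled.

```shell
#!/bin/bash
# list_files is a hypothetical helper name; it relies on GNU find's -printf.
# Prints one tab-separated line per regular file:
# directory, file name, extension, size in bytes.
list_files() {
        local dir="${1:-.}"
        find "$dir" -type f -printf '%h\t%f\t%s\n' |
        while IFS=$'\t' read -r dir_name file_name size; do
                ext=""
                # Only names that contain a dot have an extension
                [[ "$file_name" == *.* ]] && ext="${file_name##*.}"
                printf '%s\t%s\t%s\t%s\n' "$dir_name" "$file_name" "$ext" "$size"
        done
}
```

Calling it as list_files /some/dir > output.file would collect everything in a single file.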

1. Create an executable script file called walk, located in /usr/local/bin so that it is accessible as a shell command:

sudo touch /usr/local/bin/walk
sudo chmod +x /usr/local/bin/walk
sudo nano /usr/local/bin/walk
  • Copy the script content below and paste it in nano: Shift+Insert to paste; Ctrl+O and Enter to save; Ctrl+X to exit.

2. The content of the script walk is:

#!/bin/bash

# Colourise the output
RED='\033[0;31m'        # Red
GRE='\033[0;32m'        # Green
YEL='\033[1;33m'        # Yellow
NCL='\033[0m'           # No Color

file_specification() {
        FILE_NAME="$(basename "${entry}")"
        DIR="$(dirname "${entry}")"
        NAME="${FILE_NAME%.*}"
        EXT="${FILE_NAME##*.}"
        # A name without a dot has no extension
        [[ "$FILE_NAME" == *.* ]] || EXT=""
        SIZE="$(du -sh "${entry}" | cut -f1)"

        printf "%*s${GRE}%s${NCL}\n"                    $((indent+4)) '' "${entry}"
        printf "%*s\tFile name:\t${YEL}%s${NCL}\n"      $((indent+4)) '' "$FILE_NAME"
        printf "%*s\tDirectory:\t${YEL}%s${NCL}\n"      $((indent+4)) '' "$DIR"
        printf "%*s\tName only:\t${YEL}%s${NCL}\n"      $((indent+4)) '' "$NAME"
        printf "%*s\tExtension:\t${YEL}%s${NCL}\n"      $((indent+4)) '' "$EXT"
        printf "%*s\tFile size:\t${YEL}%s${NCL}\n"      $((indent+4)) '' "$SIZE"
}

walk() {
        local indent="${2:-0}"
        printf "\n%*s${RED}%s${NCL}\n\n" "$indent" '' "$1"
        # If the entry is a file do some operations
        for entry in "$1"/*; do [[ -f "$entry" ]] && file_specification; done
        # If the entry is a directory call walk() == create recursion
        for entry in "$1"/*; do [[ -d "$entry" ]] && walk "$entry" $((indent+4)); done
}

# If no path is given use the current directory, otherwise convert relative to absolute; abort if the path is invalid
if [[ -n "${1}" ]]; then cd -- "${1}" || exit 1; fi
ABS_PATH="${PWD}"
walk "${ABS_PATH}"
echo

3. Explanation:

  • The main mechanism of the walk() function is well described by Zanna in her answer, so I will describe only the new part.

  • Within the walk() function I've added this loop:

    for entry in "$1"/*; do [[ -f "$entry" ]] && file_specification; done
    

    That means the function file_specification() is executed for each $entry that is a file.

  • The function file_specification() has two parts. The first part gets the data related to the file: name, path, size, etc. The second part outputs the data in a well-formatted form, using the printf command. If you want to tweak the script you should read about this command - for example this article.

  • The function file_specification() is a good place to put the specific command that should be executed for each file. Use this format:

    command "${entry}"

    Or you can save the output of the command in a variable and then printf this variable:

    MY_VAR="$(command "${entry}")"
    printf "%*s\tFile size:\t${YEL}%s${NCL}\n" $((indent+4)) '' "$MY_VAR"

    Or printf the output of the command directly:

    printf "%*s\tFile size:\t${YEL}%s${NCL}\n" $((indent+4)) '' "$(command "${entry}")"

  • The section at the beginning, called Colourise the output, initialises a few variables that are used within the printf command to colourise the output. You can find more about this here.

  • At the bottom of the script there is an additional condition that deals with absolute and relative paths.
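The two techniques that file_specification() relies on, parameter expansion and the printf "%*s" indentation, can be tried in isolation. The snippet below is only an illustration: the sample file name and the helper show_indented are made up for this example and do not appear in the walk script.

```shell
#!/bin/bash
# Standalone illustration; the sample name and show_indented are hypothetical.
FILE_NAME="archive.tar.gz"
NAME="${FILE_NAME%.*}"      # remove the shortest suffix matching ".*"  -> archive.tar
EXT="${FILE_NAME##*.}"      # remove the longest prefix matching "*."   -> gz

# "%*s" takes its field width from the argument list: the empty string ''
# is padded to that many spaces, which produces the indentation.
show_indented() {
        local indent="$1" text="$2"
        printf "%*s%s\n" "$indent" '' "$text"
}

show_indented 0 "$FILE_NAME"
show_indented 4 "$NAME"
show_indented 8 "$EXT"
```

This prints the three values at indentation levels 0, 4 and 8, the same way walk() passes $((indent+4)) down to each nested directory.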

4. Examples of usage:

  • To run walk for the current directory:

    walk      # No argument is needed,
    walk ./   # but this format also works
    
  • To run walk for any child directory:

    walk <directory name>
    walk ./<directory name>
    walk <directory name>/<sub directory>
    
  • To run walk for any other directory:

    walk /full/path/to/<directory name>
    
  • To create a text file, based on the walk output:

    walk > output.file
    
  • To create an output file without colour codes (source):

    walk | sed -r "s/\x1B\[([0-9]{1,2}(;[0-9]{1,2})?)?[mGK]//g" > output.file
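As an alternative to filtering with sed, the colour variables could be left empty whenever the output does not go to a terminal. This is a possible tweak, not part of the original script: the function name set_colours is an assumption, and it would replace the Colourise the output section at the top of walk.

```shell
#!/bin/bash
# Possible replacement for the "Colourise the output" section (an assumption,
# not the original code). set_colours is a hypothetical name.
# [[ -t 1 ]] is true only when file descriptor 1 (stdout) is a terminal.
set_colours() {
        if [[ -t 1 ]]; then
                RED='\033[0;31m'; GRE='\033[0;32m'; YEL='\033[1;33m'; NCL='\033[0m'
        else
                RED=''; GRE=''; YEL=''; NCL=''
        fi
}

set_colours
printf "${GRE}%s${NCL}\n" "sample entry"
```

With this in place, walk > output.file should produce plain text, while walk run directly in a terminal would stay coloured.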
    
