How to Grab File Extension in Bash

Tags: bash, filenames, shell

How do I get a file's extension in bash? Here's what I tried:

filename=`basename $filepath`
fileext=${filename##*.}

With that I get the extension bz2 from the path /dir/subdir/file.bz2, but I have a problem with a path such as /dir/subdir/file-1.0.tar.bz2, where I want tar.bz2 rather than bz2.

I would prefer a solution that uses only bash, without external programs, if possible.

To clarify: I am writing a bash script that extracts any given archive with a single command, extract path_to_file. The script decides how to extract the file from its compression or archive type, which could be .tar.gz, .gz, .bz2, etc. I think this requires string manipulation: for example, if the extension is .gz, I should check whether .tar comes right before it, and if so treat the extension as .tar.gz.

Best Answer

If the file name is file-1.0.tar.bz2, the extension is bz2. The method you're using to extract the extension (fileext=${filename##*.}) is perfectly valid¹.

How do you decide that you want the extension to be tar.bz2 and not bz2 or 0.tar.bz2? You need to answer this question first. Then you can figure out what shell command matches your specification.

  • One possible specification is that extensions must begin with a letter. This heuristic fails for a few common extensions like 7z, which might be best treated as a special case. Here's a bash/ksh/zsh implementation:

    basename=$filename; fileext=
    while [[ $basename = ?*.* &&
             ( ${basename##*.} = [A-Za-z]* || ${basename##*.} = 7z ) ]]
    do
      fileext=${basename##*.}.$fileext
      basename=${basename%.*}
    done
    fileext=${fileext%.}
    

    For POSIX portability, you need to use a case statement for pattern matching.

    while case $basename in
            ?*.*) case ${basename##*.} in [A-Za-z]*|7z) true;; *) false;; esac;;
            *) false;;
          esac
    do …
    
  • Another possible specification is that some extensions denote encodings and indicate that further stripping is needed. Here's a bash/ksh/zsh implementation (requiring shopt -s extglob under bash and setopt ksh_glob under zsh):

    basename=$filename
    fileext=
    while [[ $basename = ?*.@(bz2|gz|lzma) ]]; do
      fileext=${basename##*.}.$fileext
      basename=${basename%.*}
    done
    if [[ $basename = ?*.* ]]; then
      fileext=${basename##*.}.$fileext
      basename=${basename%.*}
    fi
    fileext=${fileext%.}
    

    Note that this considers 0 to be an extension in file-1.0.gz.
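To compare the two specifications above side by side, here is a minimal sketch that wraps each loop in a function and runs both on a few sample names. The function names ext_by_letter and ext_by_encoding and the sample file names are made up for illustration; the loop bodies are the ones shown above, with local variables so they do not clobber the caller's.

```bash
#!/usr/bin/env bash
shopt -s extglob   # for the @(...) pattern, as noted above

# Specification 1: each extension component must start with a letter
# (7z is allowed as a special case).
ext_by_letter() {
  local base=$1 ext=
  while [[ $base = ?*.* &&
           ( ${base##*.} = [A-Za-z]* || ${base##*.} = 7z ) ]]; do
    ext=${base##*.}.$ext
    base=${base%.*}
  done
  printf '%s\n' "${ext%.}"
}

# Specification 2: strip known encoding suffixes, then one more component.
ext_by_encoding() {
  local base=$1 ext=
  while [[ $base = ?*.@(bz2|gz|lzma) ]]; do
    ext=${base##*.}.$ext
    base=${base%.*}
  done
  if [[ $base = ?*.* ]]; then
    ext=${base##*.}.$ext
    base=${base%.*}
  fi
  printf '%s\n' "${ext%.}"
}

# Hypothetical sample names; prints "name: spec1 | spec2" for each.
for f in file.bz2 file-1.0.tar.bz2 file-1.0.gz; do
  printf '%s: %s | %s\n' "$f" "$(ext_by_letter "$f")" "$(ext_by_encoding "$f")"
done
```

Both functions give bz2 for file.bz2 and tar.bz2 for file-1.0.tar.bz2. They differ on file-1.0.gz: the letter rule yields gz, while the encoding rule yields 0.gz, which is the caveat mentioned above.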

¹ ${VARIABLE##PATTERN} (which removes the longest matching prefix) and related constructs are in POSIX, so they work in any non-antique Bourne-style shell such as ash, bash, ksh or zsh.
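As a quick illustration of the "related constructs" in the footnote, the sketch below applies the four POSIX prefix and suffix removal forms to the path from the question. The variable names are arbitrary. Note that ${path##*/} stands in for basename here only under the assumption that the path has no trailing slash.

```bash
path=/dir/subdir/file-1.0.tar.bz2

name=${path##*/}   # drop longest prefix matching */   -> file-1.0.tar.bz2
last=${name##*.}   # drop longest prefix matching *.   -> bz2
rest=${name#*.}    # drop shortest prefix matching *.  -> 0.tar.bz2
stem=${name%.*}    # drop shortest suffix matching .*  -> file-1.0.tar
root=${name%%.*}   # drop longest suffix matching .*   -> file-1

printf '%s\n' "$name" "$last" "$rest" "$stem" "$root"
```

None of these forms alone yields tar.bz2 for this name, which is why the answer asks for a specification first.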
