Shell – List file + directory path recursively sorted by access time

files, shell-script, sort, timestamps

I'm writing an LRU script, but after 20 hours of work on it, I have a problem with recursive mode that I can't fix.

I just need a command that shows me files sorted by access time (--time=atime). I'd also like to control the depth, but it's fine if that isn't possible.

Main directory :
- File 1
- Directory 1 :
  - File 1
  - File2
  - Subdir 1 :
    - file1
    - file 2
- Directory 2 :
  - File 1
  - File 2
  - Subdir 2 :
    - file 1
    - file 2
    - subdir 3 :
      - file 1

I want to exclude directories and list only the files, sorted by access time:

/Main Directory/Directory 1/file 1

/Main Directory/File 1

/Main Directory/Directory 1/Subdir 1/file 2

/Main Directory/Directory 2/Subdir 2/subdir 3/file 1

etc.

Best Answer

There are six very commonly used tools to solve similar problems:

- find, to look for files or directories matching specific criteria.

  The -mindepth and -maxdepth options control how deep in the filesystem tree (relative to the specified names, which are always at depth 0) the command works.

  The -type option restricts consideration to files, directories, symbolic links, or devices.

  The -printf action (GNU find) prints information about each matching name (directory entry) in the desired format. I particularly like %TY%Tm%Td %TT %p\n, which prints the date and time of the last modification, followed by the full path of each match, one per line, in the format YYYYMMDD HH:MM:SS.sss PATH. This format sorts correctly.

  For last access, use %AY%Am%Ad %AT %p\n. Note, however, that access timestamps are not updated at all if the noatime mount option is used, and with the relatime mount option they are updated only for the first access after a modification (or, on current Linux kernels, when the previous access time is more than a day old); least-recently-used checking is thus not reliable. (A least-recently-modified list, however, is pretty reliable; users can modify the timestamps by hand, but otherwise they are maintained automatically.)

- sort, to sort the output.

  The -d, -g, -h, -M, and -n options define how items are compared, and the -R option makes the order random.

  The -r option reverses the sort order (used in addition to one of the above options).

  The -t option redefines the field (column) separator; by default, blanks (spaces and tabs) separate columns.

  The -k option defines which part of each line is the sort key; by default, the entire line is considered.

- uniq, often used after sorting to combine multiple consecutive identical lines into one, so that only the unique lines are output.

- cut, the simplest way to pick only specific columns from each line in the output.

  The -f option chooses the fields to be printed. (By default, lines with at most one field (no separators) are printed; option -s suppresses printing such lines.)

  The -d option redefines the field delimiter; by default, a tab character separates fields.

- sed, a powerful stream editor, which applies regular expressions to the input, filtering and modifying it as needed.

- awk, an interpreter for the awk language. Awk scripts are basically collections of actions, snippets of code, that are executed for each line (or before or after all processing, or if the line (or record) matches some rule).
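
To make the field options above concrete, here is a small sketch that chains sort, uniq, and cut. The sample data and the function name names_by_size are made up for illustration; only the options described above are used.

```shell
# Hypothetical sample data in the form name:group:size
sample='beta:users:300
alpha:admin:20
gamma:users:1000
alpha:admin:20'

# Sort numerically on field 3 (fields split at ':'), drop the
# consecutive duplicate lines, then keep only the name column.
names_by_size() {
    printf '%s\n' "$sample" |
        sort -t ':' -k 3,3n |
        uniq |
        cut -d ':' -f 1
}

names_by_size
```

Note that cut needs an explicit -d here, because its default delimiter is a tab, while sort needs -t because its default is to split at blanks.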

This particular problem can be solved using three of the above commands in a simple pipeline: use find to find files at the desired depths of the tree, printing a sortable date and time for each file, plus the relative path to the file; sort the output; remove the date and time part of each line, leaving just the relative path to each file on each line.
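
The pipeline described above could look roughly like the following sketch. It assumes GNU find (for -printf) and file names without newlines; the function name list_by_atime and its optional depth argument are illustrative, not part of any standard tool.

```shell
# List regular files under DIR, least recently accessed first.
# Usage: list_by_atime DIR [MAXDEPTH]
# Assumptions: GNU find is available; no newlines in file names.
list_by_atime() {
    dir=${1:-.}
    maxdepth=${2:-}
    if [ -n "$maxdepth" ]; then
        # Limit how deep below DIR (depth 0) find descends.
        find "$dir" -maxdepth "$maxdepth" -type f -printf '%AY%Am%Ad %AT %p\n'
    else
        find "$dir" -type f -printf '%AY%Am%Ad %AT %p\n'
    fi |
        LC_ALL=C sort |       # "YYYYMMDD HH:MM:SS.sss PATH" sorts as plain text
        cut -d ' ' -f 3-      # drop the date and time, keep the whole path
}
```

For example, list_by_atime . 2 lists files in the current directory and its immediate subdirectories. Using -f 3- rather than -f 3 keeps paths that contain spaces intact. Add -r to the sort command to put the most recently accessed files first.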

Related Solutions

Files – How to List Files Sorted by Modification Date Recursively Without stat Command

My shortest method uses zsh:

print -rl -- **/*(.Om)

(add the D glob qualifier if you also want to list the hidden files or the files in hidden directories).

If you have GNU find, make it print the file modification times and sort by that. I assume there are no newlines in file names.

find . -type f -printf '%T@ %p\n' | sort -k 1 -n | sed 's/^[^ ]* //'
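
If file names may contain newlines, one possible variant is to separate records with NUL bytes instead. This is only a sketch: it assumes GNU find, GNU sort, and a GNU cut recent enough to support -z, and the function name list_by_mtime_nul is made up for illustration.

```shell
# Print regular files under DIR, oldest modification time first,
# as NUL-terminated records so that any file name is safe.
# Assumptions: GNU find (-printf), GNU sort (-z), GNU cut (-z).
list_by_mtime_nul() {
    find "${1:-.}" -type f -printf '%T@ %p\0' |
        LC_ALL=C sort -z -n |     # numeric sort on the leading timestamp
        cut -z -d ' ' -f 2-       # strip the timestamp, keep the full path
}
```

The output is meant for tools that read NUL-terminated input, such as xargs -0. To look at it on a terminal, pipe it through tr '\0' '\n'.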

If you have Perl (again, assuming no newlines in file names):

find . -type f -print |
perl -l -ne '
    $_{$_} = -M;  # store file age in days (now - mtime)
    END {
        $,="\n";
        print sort {$_{$b} <=> $_{$a}} keys %_;  # print by decreasing age
    }'

If you have Python (again, assuming no newlines in file names):

find . -type f -print |
python3 -c 'import os, sys; times = {}
for f in sys.stdin: f = f.rstrip("\n"); times[f] = os.stat(f).st_mtime
for f in sorted(times, key=times.get): print(f)'

If you have SSH access to that server, mount the directory over sshfs on a better-equipped machine:

mkdir mnt
sshfs server:/path/to/directory mnt
zsh -c 'cd mnt && print -rl **/*(.Om)'
fusermount -u mnt

With only POSIX tools, it's a lot more complicated, because there's no good way to find the modification time of a file. The only standard way to retrieve a file's times is ls, and the output format is locale-dependent and hard to parse.

If you can write to the files, and you only care about regular files, and there are no newlines in file names, here's a horrible kludge: create hard links to all the files in a single directory, and sort them by modification time.

set -ef                       # disable globbing
IFS='
'                             # split $(foo) only at newlines
set -- $(find . -type f)      # set positional arguments to the file names
mkdir links.tmp
cd links.tmp
i=1 list=
for f; do                     # hard link file number N to links.tmp/N (N = 1, 2, …)
  ln "../$f" $i
  i=$(($i+1))
done
set +f
for f in $(ls -t [0-9]*); do  # for each link, most recently modified first:
  # prepend the matching file name (positional parameter number $f) to $list
  eval 'list="${'$f'}
$list"'
done
printf %s "$list"             # print the output
rm -f [0-9]*                  # clean up
cd ..
rmdir links.tmp

Shell – Access the most recent file in (alphabetically sorted) directory

This is really a job for zsh.

vim mydir/*(om[1])

The bits in parentheses are glob qualifiers. The * before them is a regular glob pattern; for example, if you wanted to consider only log files, you could use *.log. The o glob qualifier changes the order in which the matches are sorted; om means sort by modification time, most recent first. The [ glob qualifier returns only some of the matches: [1] returns the first match, [2,4] returns the next three, [-2,-1] returns the last two, and so on. If the files have names that begin with their timestamp, so that they sort oldest first by name, *([-1]) will suffice.

In other shells, there's no good way to pick the most recent file. If your file names don't contain unprintable characters or newlines, you can use

vim "mydir/$(ls -t mydir | head -n 1)"

If you want to pick the first or last file based on the name, there is a fully reliable and portable method, which is a bit verbose for the command line but perfectly serviceable in scripts.

set -- mydir/*
first_file_name=$1
eval "last_file_name=\${$#}"
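
In a script, the same idea can be wrapped in a function so that the caller's own positional parameters are left untouched. The following is a minimal sketch; the name first_and_last is hypothetical, and it assumes the directory contains no hidden-only entries that you care about (a plain * does not match dot files).

```shell
# Print the first and last entries of a directory in glob (name) order.
# first_and_last is a hypothetical helper name.
first_and_last() {
    set -- "$1"/*              # positional parameters are local to the function
    [ -e "$1" ] || return 1    # unmatched glob: the pattern is left as-is
    eval "last=\${$#}"         # expands to ${N}, where N is the match count
    printf '%s\n%s\n' "$1" "$last"
}
```

Calling first_and_last mydir prints two lines, and the function returns a non-zero status when the directory is empty. The order is the shell's glob order, which depends on the locale's collation rules.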
