Shell – Find command returns wrong files

find, shell

I'm trying to find files in a folder, using the find command, that are a day older than the day on which the command is run. I use the following command:

FILES_dcn=($(find  $dir_dcn -maxdepth 1 -type f -name "*.pcap" -mtime +1 -print0 | xargs -0 ls -lt | tail -15 | awk '{print $9}'))

But the output looks like this:

-rw-rw-rw- 1 nethawk nethawk  2097664 Mar 16 01:58 /mnt/md0/capture/dcn/dcn_2014_03_16_01_58_00_438.pcap
-rw-r--r-- 1 root    root    27935978 Mar 17 10:00 /mnt/md0/capture/dcn/dcn_2014_03_16_18_29_18_983.pcap
-rw-rw-rw- 1 nethawk nethawk  2097296 Mar 17 10:02 /mnt/md0/capture/dcn/dcn_2014_03_17_10_02_00_335.pcap
-rw-rw-rw- 1 nethawk nethawk  2097192 Mar 17 10:02 /mnt/md0/capture/dcn/dcn_2014_03_17_10_02_49_476.pcap
-rw-rw-rw- 1 nethawk nethawk  2097936 Mar 17 10:07 /mnt/md0/capture/dcn/dcn_2014_03_17_10_06_59_326.pcap
-rw-rw-rw- 1 nethawk nethawk  2097464 Mar 17 10:10 /mnt/md0/capture/dcn/dcn_2014_03_17_10_10_00_407.pcap
-rw-rw-rw- 1 nethawk nethawk  2097232 Mar 17 10:13 /mnt/md0/capture/dcn/dcn_2014_03_17_10_13_48_603.pcap
-rw-rw-rw- 1 nethawk nethawk   426800 Mar 17 10:14 /mnt/md0/capture/dcn/dcn_2014_03_17_10_13_58_428.pcap
-rw-rw-rw- 1 nethawk nethawk  2097544 Mar 17 10:14 /mnt/md0/capture/dcn/dcn_2014_03_17_10_14_10_259.pcap
-rw-rw-rw- 1 nethawk nethawk  2097600 Mar 17 10:14 /mnt/md0/capture/dcn/dcn_2014_03_17_10_14_49_609.pcap
-rw-rw-rw- 1 nethawk nethawk  2097472 Mar 17 10:17 /mnt/md0/capture/dcn/dcn_2014_03_17_10_16_59_503.pcap
-rw-rw-rw- 1 nethawk nethawk  2097696 Mar 17 10:17 /mnt/md0/capture/dcn/dcn_2014_03_17_10_17_48_698.pcap
-rw-rw-rw- 1 nethawk nethawk  2098048 Mar 17 10:18 /mnt/md0/capture/dcn/dcn_2014_03_17_10_18_29_981.pcap
-rw-rw-rw- 1 nethawk nethawk  2097352 Mar 17 10:20 /mnt/md0/capture/dcn/dcn_2014_03_17_10_20_10_320.pcap
-rw-rw-rw- 1 nethawk nethawk  2097416 Mar 17 10:20 /mnt/md0/capture/dcn/dcn_2014_03_17_10_20_49_703.pcap

It should have been:

-rw-rw-rw- 1 nethawk nethawk  2097296 2014-03-17 10:02 dcn_2014_03_17_10_02_00_335.pcap
-rw-rw-rw- 1 nethawk nethawk   443736 2014-03-17 10:02 dcn_2014_03_17_10_01_58_254.pcap
-rw-rw-rw- 1 nethawk nethawk  2098136 2014-03-17 10:01 dcn_2014_03_17_10_01_48_427.pcap
-rw-rw-rw- 1 nethawk nethawk  2097456 2014-03-17 10:01 dcn_2014_03_17_10_01_38_622.pcap
-rw-rw-rw- 1 nethawk nethawk  2097480 2014-03-17 10:01 dcn_2014_03_17_10_01_28_773.pcap
-rw-rw-rw- 1 nethawk nethawk  2097184 2014-03-17 10:01 dcn_2014_03_17_10_01_18_966.pcap
-rw-rw-rw- 1 nethawk nethawk  2097184 2014-03-17 10:01 dcn_2014_03_17_10_01_09_127.pcap
-rw-rw-rw- 1 nethawk nethawk  2097272 2014-03-17 10:01 dcn_2014_03_17_10_00_59_280.pcap
-rw-rw-rw- 1 nethawk nethawk  2097896 2014-03-17 10:00 dcn_2014_03_17_10_00_49_462.pcap
-rw-rw-rw- 1 nethawk nethawk  2097376 2014-03-17 10:00 dcn_2014_03_17_10_00_39_653.pcap
-rw-rw-rw- 1 nethawk nethawk  2097344 2014-03-17 10:00 dcn_2014_03_17_10_00_29_816.pcap
-rw-rw-rw- 1 nethawk nethawk  2097656 2014-03-17 10:00 dcn_2014_03_17_10_00_19_977.pcap
-rw-rw-rw- 1 nethawk nethawk  2097232 2014-03-17 10:00 dcn_2014_03_17_10_00_10_172.pcap
-rw-rw-rw- 1 nethawk nethawk  2097656 2014-03-17 10:00 dcn_2014_03_17_10_00_00_323.pcap
-rw-rw-rw- 1 nethawk nethawk   435544 2014-03-17 10:00 dcn_2014_03_17_09_59_58_280.pcap

The current time is Fri Mar 21 16:10:42 UTC 2014. Why is this happening? The files are stored on a Samba share.

Best Answer

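The syntax you are using selects files by age counted in whole 24-hour periods from the time find starts, not by calendar day. find discards the fractional part of the age, so -mtime +1 matches files that are at least 48 hours old. For your current time of Fri Mar 21 16:10:42 UTC 2014, this would be files modified at or before Wed Mar 19 16:10:42 UTC 2014. However, from your question it seems that you want files modified before Fri Mar 21 00:00:00 UTC 2014.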
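To see how -mtime counts age, here is a minimal sketch that assumes GNU touch and GNU find. The scratch directory and the two file names are made up for illustration; find treats a file's age as a number of whole days with the remainder thrown away, and +1 means "greater than 1".

```shell
# Hypothetical scratch directory; GNU touch and GNU find are assumed.
demo_dir=$(mktemp -d)
touch -d '30 hours ago' "$demo_dir/age_30h.pcap"
touch -d '50 hours ago' "$demo_dir/age_50h.pcap"

# Age in whole days: 30 h -> 1, 50 h -> 2. Only an age greater than 1 passes "+1".
mtime_matches=$(find "$demo_dir" -type f -name '*.pcap' -mtime +1 -exec basename {} \;)
echo "$mtime_matches"

rm -r "$demo_dir"
```

Only the 50-hour-old file should be listed. The 30-hour-old file is more than a day old, yet its age rounds down to 1 and so it does not pass the +1 test.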

The way to do this is to create a temporary file and change the modification time to midnight that day (simply specify the date only). The find command can then compare to this file. This will work on Linux:

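time_file=$(mktemp)
# Set the reference file's mtime to 00:00 today (a date-only -d argument is a GNU touch feature)
touch -d "$(date +%F)" "$time_file"
# Keep only files that are not newer than the reference file
find "$dir_dcn" -maxdepth 1 -type f -name "*.pcap" ! -newer "$time_file" \
  -exec ls -lt {} + |
  tail -15 |
  awk '{print $NF}'  # the last field is the filename
rm "$time_file"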

Note that +%F is not POSIX; for portability you would have to use +%Y-%m-%d instead (mktemp is not POSIX either, but it can be found on most Unix-like systems). Note also the difference in the time format between the two listings you have posted: in the first the filename is field 9, in the second it is field 8. This varies depending on how the locale-related environment variables are set. I have got around this by making awk print the last field rather than a specific field number. This works as long as there are no spaces etc. in the filenames.
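The field-number issue can be checked against the two listings in the question. The sketch below takes one line from each listing (the variable names are only for illustration) and compares a fixed field 9 with the last field.

```shell
# One listing line from each output in the question, one per date format.
line_short='-rw-rw-rw- 1 nethawk nethawk  2097664 Mar 16 01:58 /mnt/md0/capture/dcn/dcn_2014_03_16_01_58_00_438.pcap'
line_iso='-rw-rw-rw- 1 nethawk nethawk  2097296 2014-03-17 10:02 dcn_2014_03_17_10_02_00_335.pcap'

# A fixed field 9 only holds the filename for the "Mar 16 01:58" style.
fixed_short=$(printf '%s\n' "$line_short" | awk '{print $9}')
fixed_iso=$(printf '%s\n' "$line_iso" | awk '{print $9}')

# $NF (the last field) holds the filename in both styles.
last_short=$(printf '%s\n' "$line_short" | awk '{print $NF}')
last_iso=$(printf '%s\n' "$line_iso" | awk '{print $NF}')

printf 'field 9: [%s] [%s]\n' "$fixed_short" "$fixed_iso"
printf 'last:    [%s] [%s]\n' "$last_short" "$last_iso"
```

With the ISO-style date the line has only eight fields, so field 9 comes back empty, while the last field is the filename in both cases.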

Update

Actually, looking more carefully at your actual vs expected output, it looks more like there were so many files that xargs ran ls more than once. This would prevent the files from being sorted properly, because each run of ls sorts only the batch of files it was given. Since the filenames are date stamped, the simplest thing to do is to pipe to sort instead of using ls.

time_file=$(mktemp)
touch -d "$(date +%F)" "$time_file"
find "$dir_dcn" -maxdepth 1 -type f -name "*.pcap" ! -newer "$time_file" |
  sort |
  tail -15
rm "$time_file"
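The batching effect can be reproduced on a small scale. This is only a sketch: -n 2 is used here to force xargs into several runs, standing in for the command-line length limit that would trigger the same split with a large number of files, and sort stands in for the sorting that ls does.

```shell
# Force batches of two arguments; each batch is sorted on its own,
# just as each separate run of ls would sort only its own files.
batched=$(printf '%s\n' d a c b | xargs -n 2 sh -c 'printf "%s\n" "$@" | sort' sh | tr '\n' ' ')

# One sort over the whole stream gives the global order.
global=$(printf '%s\n' d a c b | sort | tr '\n' ' ')

echo "batched: $batched"
echo "global:  $global"
```

The batched result comes out as a d b c, sorted within each pair but not overall, while the single sort gives a b c d. That is the same pattern as a listing whose timestamps jump around between chunks.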