You can use the `cat` command (see `man cat` for more information) to concatenate text files. If you want to create a new file:

```shell
cat [FILE1] [FILE2]... > new_file
```

or, if you want to append to an existing file, use it like this:

```shell
cat [FILE1] [FILE2]... >> file
```
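As a quick, hedged illustration (the file names here are made up for the demo):

```shell
# create two small sample files
printf 'line from a\n' > a.txt
printf 'line from b\n' > b.txt

# '>' creates (or overwrites) the target file
cat a.txt b.txt > combined.txt

# '>>' appends to it instead
cat a.txt >> combined.txt

cat combined.txt
# prints:
# line from a
# line from b
# line from a
```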
As already noted, the short answer is "yes".

The long answer: you can do it with a bash script that uses `awk` to extract the filename elements you want to base your directory structure on. It could look something like this (where more emphasis is placed on readability than on "one-liner" compactness):
```shell
#!/bin/bash

for FILE in p-*
do
    if [[ ! -f $FILE ]]; then continue; fi
    LVL1="$(awk '{match($1,"^p-([[:digit:]]+)_[[:print:]]*",fields); print fields[1]}' <<< "$FILE")"
    LVL2="$(awk '{match($1,"^p-([[:digit:]]+)_n-([[:digit:]]+)_[[:print:]]*",fields); print fields[2]}' <<< "$FILE")"
    echo "move $FILE to p-$LVL1/n-$LVL2"
    if [[ ! -d "p-$LVL1" ]]
    then
        mkdir "p-$LVL1"
    fi
    if [[ ! -d "p-$LVL1/n-$LVL2" ]]
    then
        mkdir "p-$LVL1/n-$LVL2"
    fi
    mv "$FILE" "p-$LVL1/n-$LVL2"
done
```
To explain:

- We loop over all files starting with `p-` in the current directory.
- The first instruction in the loop checks that the file actually exists; it is a workaround for the case where nothing matches the glob (the loop would then run once with the literal string `p-*`). The reason this is necessary is that, on this forum, you will always be told not to parse the output of `ls`, so something like `FILES=$(ls p-*); for FILE in $FILES; do ...` would be considered a no-go.
- Then, we extract the numerals between `p-` and `_n` needed to generate the first level of your directory structure using `awk` (as you suspected, with regular expressions), and likewise the numerals between `n-` and `_a` for the second level. The idea is to use the `match` function, which not only finds where the specified regular expression occurs in your input, but also stores the matched value of every group enclosed in round brackets `( ... )` in the array `fields`.
- Third, we check if the directories for the first and second level of your intended directory structure already exist. If not, we create them.
- Last, we move the file to the target directory.
For more information, have a look at the Advanced Bash-Scripting Guide and the GNU Awk User's Guide.

Once you are more at home with scripting and regular expressions, you can make this much more compact; in the above script, for example, the generation of the directory/subdirectory path could easily be contracted to a single `awk` call.
For one, since the directory names are actually `p-<number>` and `n-<number>`, the same as in your filenames, we could have let `awk` extract those characters for us, too, by writing

```shell
match($1,"(^p-[[:digit:]]+)_(n-[[:digit:]]+)_[[:print:]]*",fields)
```

We can further offload work to `awk` by having it generate the directory/subdirectory path at the same time with a suitable `print` argument:

```shell
awk '{match($1,"(^p-[[:digit:]]+)_(n-[[:digit:]]+)_[[:print:]]*",fields); print fields[1] "/" fields[2]}'
```

would readily yield (e.g.) `p-12345/n-384` for the file `p-12345_n-384_a-583.pdf`. If we combine that with the usage of `mkdir -p` as indicated by @wurtel, the script could look like
```shell
for FILE in p-*
do
    if [[ ! -f $FILE ]]; then continue; fi
    TARGET="$(awk '{match($1,"(^p-[[:digit:]]+)_(n-[[:digit:]]+)_[[:print:]]*",fields); print fields[1] "/" fields[2]}' <<< "$FILE")"
    echo "move $FILE to $TARGET"
    mkdir -p "$TARGET"
    mv "$FILE" "$TARGET"
done
```
Best Answer

No, `cat` will not buffer all the files before it starts writing out.

However, if you have a large number of files, you can run into an issue with the number of arguments passed to `cat`. By default, the Linux kernel only allows a fixed amount of argument data to be passed to any program (I can't remember how to get the value, but it's a few thousand in most cases). To solve this issue you can do something like this instead:
This will basically call `cat` separately for each and every file found by `find`.