I downloaded many YouTube videos and want to process them using bash scripts. However, the filenames contain all kinds of special and non-ASCII characters. How do I handle this in a bash script?
Let's say I want to create a symbolic link to each such file in a folder:
# Write filenames to filelist.txt in parent folder
ls ./* > ../filelist.txt
# Create sym links for all files in filelist.txt
counter=0
while read video_name;
do
counter=$((counter+1));
ln -s $video_name link_name_${counter}.mp4
done < ../filelist.txt
The above script does not work because of the special characters in the filenames.
Here are some example filenames:
पेट (Stomach) कम करने के लिए 5 योग आसन-3G4pEY5njYE.mp4
मन शांत करने के लिए करे वृक्षासन योग _ स्वामी रामदेव-sPytQlaxoIg.mp4
वृक्षासन करने का तरीका और फायदे _ Swami Ramdev-A-2d04ON9hA.mp4
Bonus:
I would also like to have leading zeros when printing the counter variable, but that's not crucial.
Best Answer
Variables in the shell can contain any character except the NUL character, just like filenames in the filesystem. You should therefore not have any problem storing the filenames in variables, unless you read the mangled output of ls, which may be modified for display purposes (ls output is strictly for looking at).

In the edited question, you additionally read the filenames from a text file with read and the default value of $IFS (which determines aspects of how read works). This would strip flanking whitespace from the lines read from the file, and may interpret the \ character specially if it occurs in the input. Also note that technically, filenames may contain newline characters, so storing them as a newline-delimited list (lines in a text file) limits the types of names that can be used.

You also need to quote the expansion of variables. You have filenames with spaces in them, and without quoting the $video_name value, the shell would split these up into multiple words and give these words (after additionally performing filename globbing with these as patterns) as separate arguments to ln -s.

Don't use ls to generate the list of filenames, and quote the expansions of all variables. Instead, loop directly over a filename globbing pattern such as ./* and pass "$video_name" in double quotes to ln -s.

Note that such a loop, run as is, would generate the symbolic links in the current directory. If you run it a second time, it would pick up these links and create further links to those symbolic links. It would be better to create the links in a separate directory, to be more careful with the filename globbing pattern used with the loop so that the links are avoided, or to explicitly test for links in the loop and skip these.
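A loop along these lines might look like the following sketch. The function name make_links, the default directory name links, the *.mp4 pattern and the link_name_ prefix are assumptions made for illustration. It iterates over a glob instead of parsing ls, quotes every expansion, writes the links into a separate directory, and skips anything that is already a symbolic link.

```shell
#!/bin/bash

# make_links is a hypothetical helper name; "links" is an assumed default
# directory for the symbolic links.
make_links () {
    local link_dir=${1:-links}
    local counter=0
    local video_name

    mkdir -p -- "$link_dir" || return

    # Iterate over a glob instead of parsing the output of ls.
    for video_name in ./*.mp4; do
        # Skip symbolic links, so that a second run does not link to links.
        [ -L "$video_name" ] && continue
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$video_name" ] || continue

        counter=$(( counter + 1 ))
        # Quote every expansion. The target is made absolute so that the
        # link still resolves from inside "$link_dir".
        ln -s -- "$PWD/${video_name#./}" "$link_dir/link_name_$counter.mp4"
    done
}
```

Calling make_links from the directory that holds the videos would create links/link_name_1.mp4, links/link_name_2.mp4 and so on. Running it again would make ln complain that the links already exist, so an existing links directory would have to be emptied first.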
To get a zero-filled counter with four digits, you may use the bash builtin printf as printf -v zcounter with the format %04d. This prints the re-formatted counter directly to the zcounter variable. You would then use that variable in generating the filename. Or you could just generate the name of the symbolic link in one go, by putting the whole link name into the printf format string.
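The zero-filling described above can be sketched with the bash builtin printf and its -v option, which stores the formatted result in a variable instead of printing it. The starting value of counter, the variable name link_name and the link_name_ prefix are assumptions for illustration.

```shell
#!/bin/bash

counter=7

# Store the counter, zero-filled to four digits, in zcounter.
printf -v zcounter '%04d' "$counter"

# Or build the complete link name in one go.
printf -v link_name 'link_name_%04d.mp4' "$counter"

# Show both results, one per line.
printf '%s\n' "$zcounter" "$link_name"
```

With counter set to 7, this prints 0007 and link_name_0007.mp4. Inside the loop, the second form would let you pass "$link_name" straight to ln -s as the name of the link.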