Dealing with file names with special first characters (e.g. ♫)

command line, filenames, fish, special characters, wildcards

I have recently come across a file whose name begins with the character '♫'. I wanted to copy this file, feed it into ffmpeg, and reference it in various other ways in the terminal. I usually auto-complete weird filenames but this fails as I cannot even type the first letter.

I don't want to switch to the mouse to perform a copy-paste maneuver. I don't want to memorize a bunch of codes for possible scenarios. My ad hoc solution was to switch into vim, paste !ls and copy the character in question, then quit and paste it into the terminal. This worked but is quite horrific.

Is there an easier way to deal with such scenarios?

NOTE: I am using the fish shell, in case that changes things.

Best Answer

If the first character of the file name is printable but neither alphanumeric nor whitespace, you can use the [[:punct:]] character class in a glob pattern:

$ ls *.txt
f1.txt  f2.txt  ♫abc.txt
$ ls [[:punct:]]*.txt
♫abc.txt
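
If the same match is needed in a script, the glob can be wrapped in a small function. The following is only a minimal sketch for POSIX-style shells (sh, bash). The function name list_odd_names is made up for illustration, and it uses [![:alnum:]] (anything that is not a letter or digit) instead of [[:punct:]], on the assumption that this depends less on how the current locale classifies a character such as ♫. As far as I know, fish does not use this bracket syntax, so there a plain wildcard such as *abc.txt is probably the simplest equivalent.

```shell
# Print every file in the current directory whose name starts with a
# character that is neither a letter nor a digit (e.g. "♫abc.txt").
# list_odd_names is a hypothetical helper name, not a standard command.
list_odd_names() {
    for f in [![:alnum:]]*; do
        # When nothing matches, the pattern is left unexpanded; skip it.
        [ -e "$f" ] || continue
        printf '%s\n' "$f"
    done
}
```

When exactly one file matches, the result can be passed straight to another command, for example cp "$(list_odd_names)" song.txt.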

`iconv` (internationalization conversion)

Here is a solution using iconv:

iconv -c -f utf-8 -t ascii input_file.csv

The -f flag (from) specifies an input format, the -t flag (to) specifies an output format, and the -c flag tells iconv to discard characters that cannot be converted to the target. This writes the results to standard output (i.e. to your console). If you want to write the results to a new file you would do something like this instead:

iconv -c -f utf-8 -t ascii input_file.csv -o output_file.csv

Then, if you want, you can replace the original file with the new file:

mv -i output_file.csv input_file.csv

Here is how iconv handles your first example string:

$ echo "'÷ÞW' , 'ŸŸŸŸŸŸŸ', '³ŸŸÙ÷'" | iconv -c -f utf8 -t ascii
'W' , '', ''

`tr` (translate)

Here is a solution using the tr (translate) command:

tr -cd '\000-\177' < input_file.csv

The \000-\177 pattern specifies the numerical range 0-127 using octal notation. This is the range of values for ASCII characters. The -c flag tells tr to match values in the complement of this range (i.e. to match non-ASCII characters) and the -d flag tells tr to perform deletion (instead of translation).

To write the results to a file you would use output redirection:

tr -cd '\000-\177' < input_file.csv > output_file.csv

Here is how tr handles your first example string:

$ echo "'÷ÞW' , 'ŸŸŸŸŸŸŸ', '³ŸŸÙ÷'" | tr -cd '\000-\177'
'W' , '', ''
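
tr has no in-place mode, and redirecting its output onto the input file would truncate that file before it is read. One way around this, sketched here under the assumption that mktemp is available, is to filter into a temporary file and copy the result back. The helper name strip_non_ascii is hypothetical.

```shell
# Remove every non-ASCII byte from the file named by $1, in place.
# strip_non_ascii is an illustrative name, not an existing tool.
strip_non_ascii() {
    tmp=$(mktemp) || return 1
    # Filter into a temporary file first; only overwrite the original
    # if tr succeeded. Copying with cat (rather than mv) keeps the
    # original file's permissions.
    tr -cd '\000-\177' < "$1" > "$tmp" && cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

Usage would then be strip_non_ascii input_file.csv.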

`sed` (stream editor)

Here is a solution using sed:

sed 's/[\d128-\d255]//g' input_file.csv

The s prefix tells sed to perform substitution, and the g suffix tells it to replace every match on a line (by default only the first occurrence is replaced). The pattern [\d128-\d255] matches characters with decimal values in the range 128-255 (i.e. non-ASCII characters); the \dNNN notation is a GNU sed extension. The empty string between the second and third forward slashes is the replacement, so matched characters are removed.

Unlike many other programs, GNU sed has an -i option to update the file in place (instead of manually writing to a different file and then replacing the original):

sed -i 's/[\d128-\d255]//g' input_file.csv

Here is how sed handles your first example string:

$ echo "'÷ÞW' , 'ŸŸŸŸŸŸŸ', '³ŸŸÙ÷'" | sed 's/[\d128-\d255]//g'
'W' , '', ''
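
One caveat: the bracket range is expressed in byte values, and in a UTF-8 locale some sed builds may reject it or fail to match. A commonly used workaround, sketched here under the assumption that GNU sed is installed, is to run sed in the C locale so that the input is treated as plain bytes. The function name strip_high_bytes is hypothetical.

```shell
# Filter standard input, deleting every byte in the range 128-255.
# LC_ALL=C makes sed operate on single bytes rather than multibyte
# characters. Assumes GNU sed, since \dNNN is a GNU extension.
strip_high_bytes() {
    LC_ALL=C sed 's/[\d128-\d255]//g'
}
```

It reads from standard input, so a file would be cleaned with strip_high_bytes < input_file.csv > output_file.csv.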
