Ubuntu – Replacing dots (.) in sed

sed, text-processing

So the actual question is: does anyone know how to remove the M-BM- special character without risking the loss of other characters?

I have a string of text:

" . . ."

that is

space dot space dot space dot

I am trying to replace all occurrences of this string in a text file with

"..."

that is

dot dot dot

I was trying to do this with sed:

sed -r 's:\s\.\s\.\s\.:...:g' -i sed-dots

Unfortunately, it does not change the input file at all.
File: https://www.dropbox.com/s/46zmiruy3ln85a1/sed-dots

When I replace the same string in a text editor (I use geany), it is found and replaced properly.

The only reason I can think of is that some (or all) of those spaces are not really spaces, but some special character.

Does anyone have an idea how to find and replace that string with sed (or any other command-line tool)? Please test your idea on my file, as the problem is not as obvious as it might seem, which is why I asked about it.

After running cat -A myfile, it seems the problem is that those spaces are not spaces, but the M-BM- special character. Using the any-character symbol . in the search, as suggested, is not a good idea, as there is a risk that other characters will be removed.
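
One way to target only that character: in cat -A notation, M-BM- followed by a space stands for the bytes 0xC2 0xA0, which is the UTF-8 encoding of the no-break space (U+00A0). The sketch below builds that byte pair with printf and matches it literally in sed, so no other character can be caught. The helper name fix_nbsp_dots is hypothetical, and the sketch assumes every "space" in the sequence is a no-break space, as the cat -A output suggests.

```shell
# Build the two-byte UTF-8 encoding of U+00A0 (0xC2 0xA0) from octal escapes.
nbsp=$(printf '\302\240')

# Hypothetical helper: replace "NBSP . NBSP . NBSP ." on stdin with "...".
# Only the literal no-break space is matched, never an arbitrary character.
fix_nbsp_dots() {
    sed "s/${nbsp}\.${nbsp}\.${nbsp}\./.../g"
}

# Demo input: no-break spaces before each dot in the first line,
# ordinary spaces in the second line (which is left untouched).
printf 'Lorem ipsum\302\240.\302\240.\302\240.\n' | fix_nbsp_dots
printf 'a . b\n' | fix_nbsp_dots
```

To edit the file itself, the same expression could be passed to sed with -i (as in the original command), assuming GNU sed.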

Best Answer

First, I would start by testing with echo and piping that into sed, rather than using a real file. Secondly, you can use a {n} quantifier in extended regular expressions to specify how many times a group must repeat.

You were pretty much there, but your regex expected a leading space.

$ echo 'cheese . . . muffins' | sed -r 's/(\s?\.){3}/ dot dot dot/g'
cheese dot dot dot muffins

Note that \s? still consumes the space before the first dot, which would run the output into the preceding word, so I've added a leading space to the replacement. You might not want that. I've also made the space optional, so it'll match all of the following:

...
. ..
.. .
. . .
 . . . 

If you want the space to be required, just remove the ? quantifier.
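
As a rough illustration of the difference, the sketch below runs both variants over a few sample strings. The function names are hypothetical, and it uses -E with [[:space:]] as a portable stand-in for the -r and \s used above; with GNU sed the two spellings should behave the same for ordinary spaces.

```shell
# Optional space before each dot: matches "...", ". ..", ". . ." and so on.
collapse_optional() {
    sed -E 's/([[:space:]]?\.){3}/.../g'
}

# Required space before each dot: matches only " . . ." style sequences.
collapse_required() {
    sed -E 's/([[:space:]]\.){3}/.../g'
}

# The leading space is part of the match, so it disappears from the output.
echo 'cheese . . . muffins' | collapse_optional   # cheese... muffins
echo 'cheese . . . muffins' | collapse_required   # cheese... muffins

# Without a space before the first dot, only the optional form matches.
echo '. . .' | collapse_optional                  # ...
echo '. . .' | collapse_required                  # . . . (unchanged)
```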


Given your problem with Unicode (in the comments), you can force the data to its ASCII equivalent with iconv and then run sed on it:

$ iconv -f utf-8 -t ascii//translit sed-dots | sed -r 's/(\s?\.){3}/ dot dot dot/g'
Lorem ipsum dot dot dot
Some dot dot dot more text