How to remove an extra invisible character in a text editor


I have two "identical" 5-character strings in my text editors (Sublime Text 2 and Notepad++).

The first string was copied from Gmail and the second one was typed by hand.

When I select the first string, I see 6 characters selected.
When I select the second string, I see 5 characters selected.


When I select both strings in Sublime Text 2 at the same time, I can see that an extra space is selected after the first string.


I enabled "Display all characters" in Notepad++ but don't see anything obviously different between the first and the second string.

The file uses UTF-8 encoding, and the issue is consistent in both text editors.

Can anyone please advise how to remove the invisible extra character and where it came from?

Best Answer

Based on the ANSI string that you got, gffk9, it appears that the additional character in the text is a zero-width space (U+200B). Zero-width spaces indicate where a program displaying text may "safely" break a line when the text does not visibly contain spaces. Since you copied it from Gmail, it seems likely that this came from an email that used HTML to format the text.
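To confirm which character is hiding in a pasted string, a small script can list its code points. The following is a minimal sketch, not something from the original post: the function name find_invisible is made up for illustration, and it assumes the stray character falls in the Unicode "format" category (Cf), which is where the zero-width space lives.

```python
import unicodedata

def find_invisible(text):
    """Return (index, code point, Unicode name) for each format-category (Cf) character."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Cf"
    ]

pasted = "gffk9\u200b"  # stands in for the string copied from Gmail
typed = "gffk9"         # the string typed by hand

print(len(pasted), len(typed))  # 6 5
print(find_invisible(pasted))   # [(5, 'U+200B', 'ZERO WIDTH SPACE')]
print(find_invisible(typed))    # []
```

This matches the 6-versus-5 character counts reported in the question: the sixth character sits at index 5, right after the visible text.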

How you can go about removing the extra character may depend on your system. This hex viewer plugin for Sublime Text looks promising since it offers some search capabilities, but it does not explicitly mention searching by hex string or replacement. Since you are using Notepad++, I assume you are on Windows. XVI32 will let you search and replace hex strings in a file.

For reference, if you are in a Unix-like environment, sed would allow you to replace occurrences of a hex string in a file using the process described in this post.

In any case, the hex string to find is E2 80 8B, which is the UTF-8 encoding of the zero-width space (U+200B). Replace it with nothing to delete the character.
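If no hex editor is at hand, the same find-and-replace can be done with a short script. This is a rough sketch under a few assumptions: the file is UTF-8 encoded (as stated in the question), it is small enough to read into memory, and the function name strip_zero_width_spaces is hypothetical. It works on raw bytes, so it mirrors the hex search above.

```python
from pathlib import Path

# UTF-8 byte sequence for the zero-width space (U+200B): E2 80 8B
ZWSP_UTF8 = b"\xe2\x80\x8b"

def strip_zero_width_spaces(path):
    """Delete every UTF-8 zero-width space from the file in place; return how many were removed."""
    p = Path(path)
    data = p.read_bytes()
    count = data.count(ZWSP_UTF8)
    if count:
        # Only rewrite the file when there is something to remove.
        p.write_bytes(data.replace(ZWSP_UTF8, b""))
    return count
```

Calling strip_zero_width_spaces("strings.txt") (the file name is only an example) returns the number of characters removed. Since the file is rewritten in place, it is worth keeping a backup copy first.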
