Outlook – Character sequence “ï¿½” being inserted into messages by Outlook 2007

Tags: email, encoding, microsoft-outlook-2007, thunderbird

Email messages between my wife and me have recently started being corrupted by the insertion of the character sequence "ï¿½" into messages. This appears to be an encoding issue.

This question on SO identifies the string as the UTF-8 encoding of the character "�" (U+FFFD, the Unicode replacement character). Various discussions found by Googling indicate that people have seen a variety of other characters (e.g., apostrophes and ellipses) replaced with this string in email and when browsing web sites. It seems that this character is used as a generic substitute for characters that cannot be decoded or rendered.
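
As a rough illustration of that identification, the Python sketch below encodes the replacement character U+FFFD as UTF-8 and then decodes the resulting bytes as Windows-1252. Windows-1252 is chosen as the misapplied encoding purely as an assumption; the actual decoding path inside Outlook is not known, and the helper name `misread_utf8_as_cp1252` is made up for this example.

```python
# Sketch: how one replacement character can turn into three visible characters.
# Assumption: the receiving side reads UTF-8 bytes as Windows-1252 (cp1252).

REPLACEMENT_CHAR = "\ufffd"  # U+FFFD, usually drawn as a question mark in a diamond


def misread_utf8_as_cp1252(text: str) -> str:
    """Encode text as UTF-8, then decode the bytes as if they were cp1252."""
    return text.encode("utf-8").decode("cp1252")


utf8_bytes = REPLACEMENT_CHAR.encode("utf-8")
garbled = misread_utf8_as_cp1252(REPLACEMENT_CHAR)

print(utf8_bytes.hex())                      # the three UTF-8 bytes, in hex
print([f"U+{ord(c):04X}" for c in garbled])  # code points of the garbled text
```

The three resulting code points are U+00EF, U+00BF and U+00BD, which display as "ï", "¿" and "½".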

In this case, I'm using Thunderbird (V52.5.0) in Linux and my wife is using Outlook (2007) in Windows 7 (both formatting messages as HTML).

Thunderbird inserts some invisible formatting markers, which are being replaced by this string in Outlook. The string appears in two types of locations:

  • replacing the first of two blank spaces between the period at the end of a sentence and the next sentence
  • replacing the carriage returns used to create blank lines between paragraphs
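
One plausible mechanism, offered as an assumption rather than a diagnosis: the invisible markers are non-breaking spaces (U+00A0), which HTML editors commonly use to preserve a second consecutive space or an otherwise empty line. If such text is encoded as ISO-8859-1 but later decoded as UTF-8, the lone byte 0xA0 is invalid, and a lenient decoder substitutes U+FFFD for it. A minimal Python sketch of that single step:

```python
# Sketch under assumptions: the "invisible marker" is a non-breaking space,
# the sender encodes as ISO-8859-1, and the reader decodes as UTF-8.

NBSP = "\u00a0"
sentence = "First sentence." + NBSP + " Second sentence."

# In ISO-8859-1 the NBSP becomes the single byte 0xA0.
latin1_bytes = sentence.encode("iso-8859-1")

# A lone 0xA0 is not valid UTF-8, so errors="replace" substitutes U+FFFD.
decoded = latin1_bytes.decode("utf-8", errors="replace")

print(decoded.count("\ufffd"))  # how many characters were substituted
print(NBSP in decoded)          # whether the original NBSP survived
```

U+FFFD carries no record of the byte it replaced, so a substitution like this cannot be reversed afterwards. That would be consistent with the corruption persisting once it has appeared in a thread.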

The substitution happens in Outlook and applies to everything in the message thread that has been rendered in Thunderbird. This includes older messages that originated in Outlook but were part of the thread rendered in Thunderbird before replying back to Outlook.

This is a character substitution rather than a rendering issue. Once the string has been substituted, the corruption remains when viewing the message later on either system.

This is a very recent problem, so I assume a software or OS update on one of the systems is responsible.

I looked at the encoding settings in Thunderbird and Outlook and, indeed, they were a motley mix: some UTF-8, some Western European (no idea where that could have come from), and something else (Western ISO 8859, I think). I set everything to UTF-8. These are the settings I found and changed:

  • Thunderbird: Preferences | Display | Formatting | Fonts Advanced | Text Encoding
  • Outlook: Tools | Options | Mail Format | International Options
    also: Tools | Options | Other | General | Advanced Options | Use Unicode message format when saving messages

I have also verified that the locale settings are still correct on both systems (US, American English, etc.).

Unfortunately, the symptom is unchanged. Are there encoding settings hidden in various and sundry places in Thunderbird or Outlook that I could have missed (or some other issue)?

Best Answer

It looks like the encoding discrepancy actually was the issue. I had been retesting with old messages that I thought hadn't been corrupted. In fact, it appears that they were already corrupted. Saeed Sepehr's suggestion to test with web mail led to the creation of fresh messages and replies in both directions, which were symptom-free.

So setting all of the encoding to UTF-8 on both systems was what solved the problem. And now it's verified.

I discovered that another Thunderbird user posted a similar question on the Mozilla support forum in early December. People experiencing the issue had encoding set to Western (ISO-8859-1). Setting it to UTF-8 was at least part of the solution.
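
To check which encoding a message actually declares, rather than relying on the settings dialogs, a saved copy can be inspected with Python's standard `email` package. The sketch below is illustrative only: the sample message is made up, and the helper name `declared_charsets` is hypothetical.

```python
from email import message_from_bytes
from email.policy import default
from typing import List, Optional, Tuple


def declared_charsets(raw_message: bytes) -> List[Tuple[str, Optional[str]]]:
    """Return (content type, declared charset) for every text part."""
    msg = message_from_bytes(raw_message, policy=default)
    return [
        (part.get_content_type(), part.get_content_charset())
        for part in msg.walk()
        if part.get_content_maintype() == "text"
    ]


# Made-up sample; in practice, read the bytes from a saved .eml file.
sample = (
    b"From: a@example.com\r\n"
    b"To: b@example.com\r\n"
    b"Subject: test\r\n"
    b"MIME-Version: 1.0\r\n"
    b"Content-Type: text/html; charset=ISO-8859-1\r\n"
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    b"<p>First sentence.\xa0 Second sentence.</p>\r\n"
)

for content_type, charset in declared_charsets(sample):
    print(content_type, charset)
```

A part that still reports `iso-8859-1` after the settings change would suggest that some encoding preference has not taken effect for new messages.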

An additional recommendation that seemed to help some users (or at least didn't hurt):

Edit | Preferences | Display | Formatting | Fonts Advanced | Text Encoding

Check the box labeled "When possible, use the default text encoding in replies". Warning: only do this in combination with setting the encoding to UTF-8. Enabling it while the encoding is still set to Western (ISO-8859-1) will exacerbate the problem.

Note that the menu paths have moved around a bit over the different versions, so you might need to find your way to the settings mentioned here and in the question depending on your version of Thunderbird.
