When I open a binary (in this case C:\Windows\System32\notepad.exe), different hex editors show different results for the same file. I compared them at the start of the section headers, so note the starting address of 2E 74 65 78 74 00 00 00 (".text...").
Windows – different hex editors show different binary for a file
file format, hex-editor, hexdump, windows
Related Solutions
Well, you understand that every file that has content is a binary file, every single one without exception, including a file with a .txt extension.
The one and only difference between a binary file with a .txt extension and one with a .jpg extension is really a meta difference: convention and historical practice tell us that we can make assumptions about the first file:
- it is to be interpreted as a collection of contiguous 8-bit fields;
- each such field represents an ASCII character; and
- most important, there are no control fields -- no counts, no state-change indicators, none of that.
Otherwise, there's no difference between what we -- only by convention -- call a text file and any other file.
Furthermore, there is no way to know how a file should be interpreted just by looking at its contents. We have to depend upon something external to the file -- like its extension, say -- to give us a hint at what the thing is.
Binary and text data aren't separate things: they are simply data. It is the interpretation that makes them one or the other. If you open binary data (such as an image file) in a text editor, much of it won't make sense, because it does not fit your chosen interpretation (as text).
What you call text is a subset of the possible file contents: Data that in a given character set translates to readable characters.
For example, in ASCII, you can see that, of 128 "allowed" values, only about half (62) are letters and digits, 32 are punctuation, and the rest are the space and 33 control characters. The latter group just isn't used a lot in text files, and they have no really good textual representation. Some of them are Tab and Newline characters, where text editors already need to get creative in displaying them.
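The breakdown of the 128 ASCII values can be checked with a few lines of Python. This is only a counting sketch; the function name `ascii_breakdown` is made up for illustration, and the class boundaries follow Python's `string` module constants.

```python
import string


def ascii_breakdown():
    """Count how the 128 ASCII code points split into broad classes."""
    counts = {"letters": 0, "digits": 0, "punctuation": 0, "space": 0, "control": 0}
    for code in range(128):
        ch = chr(code)
        if ch in string.ascii_letters:
            counts["letters"] += 1
        elif ch in string.digits:
            counts["digits"] += 1
        elif ch in string.punctuation:
            counts["punctuation"] += 1
        elif ch == " ":
            counts["space"] += 1
        else:
            # Everything left is a control code: 0x00-0x1F plus 0x7F (DEL).
            counts["control"] += 1
    return counts


print(ascii_breakdown())
```

Tab and Newline fall into the "control" bucket here, which matches the point that they are control characters even though text files use them routinely.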
Some text editors have options to explicitly display whitespace. Then they'll actually be drawn as characters, in addition to their regular formatting behavior (which is also just the interpretation of these characters).
Pure ASCII only interprets 128 values. The bytes used to store this information have 256 possible values each, so half of the possible values aren't allowed in ASCII. Those are e.g. used in region-specific character sets, such as Latin 1, but in ASCII, they're undefined. They have no useful representation in a text viewer that can only handle ASCII.
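A small sketch of what "undefined in ASCII" means in practice: the same three bytes fail to decode as ASCII but decode fine as Latin 1. The helper `try_decode` and the sample bytes are illustrative choices, not something from the original answer.

```python
def try_decode(raw: bytes, encoding: str):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


# "Hi" followed by the byte 0xE9, which is outside the 0-127 ASCII range.
data = bytes([0x48, 0x69, 0xE9])

print(try_decode(data, "ascii"))    # ASCII has no meaning for 0xE9
print(try_decode(data, "latin-1"))  # Latin 1 maps 0xE9 to an accented e
```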
Binary data is not usually interpreted as text. So in these files, all possible byte values are commonly found. Everything else would be wasteful (and that's a reason you can compress text very well). Image file formats are complicated, and you don't usually view them as text, so they don't need to be readable.
As there is no common data interpretation (character set) that maps all possible values to readable characters, and since that wouldn't make a lot of sense anyway (as it's not readable text), major parts are displayed as gibberish.
A hex editor chooses a different representation for the data: It displays each byte as two hexadecimal digits. It's just a different representation, and one with an easily human-readable character set: All 256 possible byte values can be represented as two hex digits.
Since there's an easy mapping of binary data to hex and vice versa (4 binary digits to/from one hexadecimal digit), and binary contains very little information per digit, hexadecimal is generally the preferred way for humans to read binary, unless there are specific reasons to prefer a different representation.
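To make the representation concrete, here is a minimal hex dump sketch: each byte becomes two hex digits, with a printable-ASCII column next to it, roughly the layout most hex editors use. The function name `hexdump` and the exact column layout are assumptions for illustration; real editors differ in detail.

```python
def hexdump(data: bytes, width: int = 16):
    """Return one formatted line per `width` bytes: offset, hex digits, ASCII view."""
    pad = width * 3 - 1  # two digits plus a space per byte, minus the trailing space
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        # Bytes outside printable ASCII are shown as "." in the text column.
        text_part = "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in chunk)
        lines.append(f"{offset:08X}  {hex_part:<{pad}}  {text_part}")
    return lines


for line in hexdump(b".text\x00\x00\x00"):
    print(line)
```

The sample bytes are the section name from the question: `2E 74 65 78 74` is ".text" and the three zero bytes show up as dots in the text column.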
Some text editors might have a hex editor mode and some heuristic that tries to determine whether a file is text or binary, and automatically select one mode or the other. But this can be difficult to get right, and there is no specific property of the file that says whether it's one kind or the other.
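One such heuristic might look like the sketch below. It is not taken from any particular editor: the NUL-byte check and the 30% threshold are arbitrary assumptions, chosen only to show why the guess can go wrong.

```python
def looks_like_text(sample: bytes) -> bool:
    """Guess whether a byte sample is text. A heuristic, not a property of the file."""
    if not sample:
        return True
    if b"\x00" in sample:
        # NUL bytes are rare in 8-bit text, so treat them as a sign of binary data.
        return False
    allowed = set(range(0x20, 0x7F)) | {0x09, 0x0A, 0x0D}  # printable + tab, LF, CR
    odd = sum(1 for b in sample if b not in allowed)
    return odd / len(sample) < 0.30  # arbitrary cut-off


print(looks_like_text(b"hello world\r\n"))        # plain ASCII text
print(looks_like_text(b"MZ\x90\x00\x03\x00"))     # start of a typical .exe
print(looks_like_text("hello".encode("utf-16")))  # real text, but full of NUL bytes
```

The last call shows the weakness: UTF-16 text is perfectly valid text, yet this heuristic classifies it as binary because every ASCII character is stored with a zero byte next to it.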
Some FTP clients ask you to specify which file extensions are used for text data. These programs will then convert the line endings in those files to match the OS of the machine you're connected to, as Windows uses a different line ending character sequence (CR/LF) than Linux and Unix (including Mac OS X; LF).
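A rough sketch of what such a text-mode transfer does to the bytes, and why it damages files that are not text. The helper names `to_lf` and `to_crlf` are hypothetical; real FTP clients implement this inside the transfer itself.

```python
def to_lf(data: bytes) -> bytes:
    """Convert Windows line endings (CR LF) to Unix line endings (LF)."""
    return data.replace(b"\r\n", b"\n")


def to_crlf(data: bytes) -> bytes:
    """Convert to Windows line endings; normalise first so CR LF is not doubled."""
    return to_lf(data).replace(b"\n", b"\r\n")


text = b"first line\r\nsecond line\r\n"
print(to_lf(text))

# The 8-byte PNG signature deliberately contains CR LF and a lone LF, so a
# text-mode conversion changes it and the file is no longer a valid PNG.
png_signature = b"\x89PNG\r\n\x1a\n"
print(to_lf(png_signature) == png_signature)
```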
Best Answer
These are different files.
From When is System32 not System32? [emphasis mine]:
My guess is that some hex editors are 32-bit and get redirected to SysWOW64, while some are 64-bit and see the "real" System32. Different editors perceive a different System32, hence a different notepad.exe.
If you copy notepad.exe to a folder that is not affected and analyze the copy, then all editors will show the same content. Which file will you see? This depends on whether the copying tool is 32-bit or 64-bit.
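One way to check whether two editors are really looking at the same file is to compare a hash and the offset of the section name, independently of any editor. The sketch below is an illustration only: the names `fingerprint_bytes` and `fingerprint` are made up, and searching for the first occurrence of the raw bytes is a naive stand-in for parsing the PE headers properly.

```python
import hashlib
from pathlib import Path

# The 8-byte section name from the question: ".text" padded with NUL bytes.
SECTION_NAME = b".text\x00\x00\x00"


def fingerprint_bytes(data: bytes) -> dict:
    """Summarise file content: size, SHA-256, and first offset of the section name."""
    return {
        "size": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
        "text_name_offset": data.find(SECTION_NAME),  # -1 if not found
    }


def fingerprint(path) -> dict:
    """Same summary, read from a file on disk."""
    return fingerprint_bytes(Path(path).read_bytes())


# Stand-in data; two different builds would place the name at different offsets.
sample_a = b"\x00" * 0x1F0 + SECTION_NAME
sample_b = b"\x00" * 0x1D8 + SECTION_NAME
print(hex(fingerprint_bytes(sample_a)["text_name_offset"]))
print(hex(fingerprint_bytes(sample_b)["text_name_offset"]))
```

Assuming a 64-bit Windows with a 64-bit Python, calling `fingerprint` on C:\Windows\System32\notepad.exe and on C:\Windows\SysWOW64\notepad.exe should give two different hashes if the explanation above applies; a 32-bit Python would be redirected in the same way as a 32-bit hex editor.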