In Emacs (or another editor), how to display the byte offset of the cursor

editors emacs

The question indicates my preference to use emacs, but the overriding issue is that I want to be able to do a normal text search and somehow see/copy-paste the byte-offset of the matched text.

To be clear, by byte offset I do not mean Emacs's point value, which counts the number of characters from the start of the buffer. For example, in UTF-16LE, point treats \x0d\x00\x0a\x00 as 1 character, whereas I'm interested in it as 4 bytes.

Any other editor (or viewer) which presents this basic information while displaying the text in a "normally" readable and searchable fashion is worthwhile.

Even a hex view with a synchronized normal-text view would be okay. A typical hex-dump viewer/editor is not what I'm after, though, as they (typically) only display ASCII chars, and I haven't found a FOSS hex-dump viewer/editor which can perform a simple text-mode search for non-ASCII UTF-8 strings or for any UTF-16 strings.

I'm primarily concerned with legibility and search-ability of the text, so a "normal" Hex dump program is only a fallback (which I'm already using).

Best Answer

First of all, in case you don't know about it, Emacs has hexl-find-file which opens up a file in hex editing mode. I know that it's not what you asked for, but if you're already using one, and you're comfortable with Emacs, then it's good to know about it for future needs.

Second, for this kind of "raw" editing of a file (which I tend to do often), find-file-literally is really great. It does what you'd expect it to do: it pretends to be a pre-Unicode version of itself and opens the file with escapes showing up for non-ASCII characters (and control chars etc). This is likely to do what you want, though it does have the obvious disadvantage of not being able to actually read the text if you have a lot of non-ASCII content.
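
In a buffer visited with find-file-literally every character is exactly one byte, so the byte offset is just point minus one. A minimal sketch, assuming the buffer is unibyte (the command name my-literal-byte-offset is made up for this example):

```elisp
(defun my-literal-byte-offset ()
  "Return the 0-based byte offset of point in a unibyte buffer.
When called interactively, also show it in the echo area."
  (interactive)
  ;; In a multibyte buffer point counts characters, not bytes.
  (when enable-multibyte-characters
    (user-error "Buffer is multibyte; visit the file with find-file-literally"))
  ;; Point is 1-based, byte offsets are 0-based.
  (let ((offset (1- (point))))
    (when (called-interactively-p 'interactive)
      (message "Byte offset: %d (0x%x)" offset offset))
    offset))
```

After a search in such a buffer, M-x my-literal-byte-offset reports where the cursor landed. The search string itself has to be typed as raw bytes, which is the readability drawback mentioned above.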

So going down further into primitive support, there's the enable-multibyte-characters variable and the set-buffer-multibyte function that is used to toggle it. The nice thing about this is that it changes the buffer presentation dynamically -- for example, try this:

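(defun my-multi-toggle ()
  "Toggle the current buffer between multibyte and unibyte (raw) display."
  (interactive)
  (set-buffer-multibyte (not enable-multibyte-characters)))
;; Bind the toggle to C-~.
(global-set-key (kbd "C-~") 'my-multi-toggle)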

and you now have a key that toggles the raw mode dynamically. It also has the nice property of leaving the cursor in the same place. But this raw mode shows you the internal representation (which looks like UTF-8) and not whatever the file happens to be using as its encoding. It should be possible to do what you're talking about with some hack (for example, using find-file-literally on an open file will ask you about revisiting it, but that resets the location and reloads the file too) -- but it sounds like the above is already fine. (That is, my guess is that you're trying to edit some text field in an otherwise binary file...)
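
To get the offset in the file's own encoding rather than in the internal representation, one option is to encode the text before point with the buffer's coding system and count the resulting bytes. This is only a sketch: it assumes buffer-file-coding-system matches how the file is stored on disk, it re-encodes everything before point (slow on big files), and stateful encodings may add trailing escape bytes. The names my-byte-offset-at and my-show-byte-offset are made up for this example:

```elisp
(defun my-byte-offset-at (pos)
  "Return the 0-based byte offset of POS in the buffer's file encoding.
Encode the text from the start of the buffer up to POS with
`buffer-file-coding-system' and count the resulting bytes."
  (save-restriction
    ;; Count from the real start of the buffer, ignoring narrowing.
    (widen)
    (let ((coding (or buffer-file-coding-system 'utf-8)))
      ;; `encode-coding-string' returns a unibyte string,
      ;; so its length is a byte count.
      (length (encode-coding-string
               (buffer-substring-no-properties (point-min) pos)
               coding)))))

(defun my-show-byte-offset ()
  "Show the byte offset of point and copy it to the kill ring."
  (interactive)
  (let ((offset (my-byte-offset-at (point))))
    (kill-new (number-to-string offset))
    (message "Byte offset: %d (0x%x)" offset offset)))
```

With this, a normal text search followed by M-x my-show-byte-offset shows the offset and makes it available for pasting. Newer Emacs releases also provide bufferpos-to-filepos for this purpose, which may be preferable where it is available.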

Related Solutions

Window display table and buffer display table conflict in Emacs

Sorry for your trouble. I don't see the problem you report, following your recipe. Perhaps the description is not complete? I can turn on both pretty-control-l-mode and whitespace-mode, and the behavior I see for each seems normal. Perhaps there is some custom setting you use for whitespace-style or something?

Anyway, maybe it would help if you make a change like this to pretty-control-l-mode. If so, let me know and I will apply it to pp-c-l.el. (To test, set the new option to nil.)

 (defcustom pp^L-use-window-display-table-flag t
   "Non-nil: use `window-display-table'; nil: use `buffer-display-table'."
   :type 'boolean :group 'Pretty-Control-L)

 (define-minor-mode pretty-control-l-mode
     "Toggle pretty display of Control-l (`^L') characters.
 With ARG, turn pretty display of `^L' on if and only if ARG is positive."
   :init-value nil :global t :group 'Pretty-Control-L
   (if pretty-control-l-mode
       (add-hook 'window-configuration-change-hook 'refresh-pretty-control-l)
     (remove-hook 'window-configuration-change-hook 'refresh-pretty-control-l))
   (walk-windows
    (lambda (window)
      (let ((display-table  (if pp^L-use-window-display-table-flag ; <=========
                                (or (window-display-table window)
                                    (make-display-table))
                              (if buffer-display-table
                                  (copy-sequence buffer-display-table)
                                (make-display-table)))))
        (aset display-table ?\014 (and pretty-control-l-mode
                                       (pp^L-^L-display-table-entry window)))
        (if pp^L-use-window-display-table-flag                     ; <=========
            (set-window-display-table window display-table)
          (setq buffer-display-table display-table))))
    'no-minibuf
    'visible))

UPDATED to add comment thread, in case comments get deleted at some point:

BTW, I wonder if the hierarchy of display tables described in the doc shouldn't perhaps be applied using inheritance of some kind. Seems a bit primitive for one level (e.g. window) to completely shadow a lower level (e.g. buffer). You might consider sending a question about this to M-x report-emacs-bug. – Drew Sep 24 '14 at 16:36

Ping? Could you please let me know if the change above helps? Thx. – Drew Oct 14 '14 at 18:12

I just read this answer (I have not been around this part of the Internet for a while...). I will check this when I get round to it, perhaps in a few days or so. I'll get back with an ‘Answer approved’ (if it works), or comments (otherwise), as appropriate, later. – Johan E Oct 25 '14 at 22:32

I edited the question to add a more fleshed-out recipe for showing the problem. I'd be interested whether you get the same results. --- Also, is there a way to shadow a system installed .el-file with a user-supplied one (I'm really just a “user”, not a lisp-programmer...)? I don't really feel like messing with the files installed by deb-packages. (That's why I did the problem-recipe before testing your answer...) – Johan E Oct 27 '14 at 1:02

Five seconds after I wrote the last comment I realized that I could just paste the code into *scratch* and C-j-run it to test. (No need to edit any files.) The results: It works a charm! Thank you! (=> Answer accepted) However, I'd still like to know if you get the same results as I from my problem-recipe (before patching the code). – Johan E Oct 27 '14 at 1:09

I just followed your new recipe, and I saw everything you described (so clearly). And then I read the new comment that you just added. Glad to know that things work OK. Thx for your feedback. – Drew Oct 27 '14 at 1:12
