MacOS – OS X Terminal – Open Tab in current directory, troubles with umlauts

bash macos terminal

I run OS X 10.7.5 and am having an issue with Terminal. I have enabled the option to open new tabs in the current working directory. However, this does not work as expected when the path of the current working directory contains one or more umlauts. For instance, if I am in the directory Uni/Semester\ 7/C++/Übung\ 2 and hit Cmd-T to open a new tab, I end up in the directory I most recently cd'ed to, e.g. Uni/Semester\ 7/C++. The same thing happens if I am in a subdirectory of Übung\ 2.

Another symptom (at least they appear to be related) is that when quitting Terminal while in a directory containing umlauts, on reopening it will start in my home directory, not even in the closest parent without umlauts as in the new tab case.

I've read that some people have troubles with Tab-autocompletion and umlauts. I do not, it works just fine, and I do not know if that is related.

Configuration-wise, I set the option Startup in Preferences > Settings > Shell to /opt/local/bin/bash -l, because the preinstalled bash version is outdated. Removing this setting made no difference in behaviour. The option Shells open with in the preferences is set to default; I do not know if that is relevant.

Now, the question: Does anyone know how to make Terminal work with umlauts such that I don't always have to renavigate to my working directory upon opening a new tab? It seems weird to me that I should be the first one to have that problem, I did not manage to google anything up.

EDIT: I now upgraded to Yosemite. The problem persists. I cannot believe no one else has this problem. I also logged in as guest user to obtain default settings and the same thing happens.

Best Answer

Prior to OS X El Capitan 10.11, the code in /etc/bashrc sends an escape sequence at each prompt to tell Terminal what the current working directory is. However, this code only percent-encodes spaces, so it doesn't work for paths containing other characters that are not valid URL characters, which includes any non-ASCII characters like “Ü”:

update_terminal_cwd() {
    # Identify the directory using a "file:" scheme URL,
    # including the host name to disambiguate local vs.
    # remote connections. Percent-escape spaces.
    local SEARCH=' '
    local REPLACE='%20'
    local PWD_URL="file://$HOSTNAME${PWD//$SEARCH/$REPLACE}"
    printf '\e]7;%s\a' "$PWD_URL"
}
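To make the limitation concrete, here is a minimal bash sketch that builds the URL the same way as the function above, but for an arbitrary host and path. The name old_style_url and its two arguments are hypothetical and exist only for this illustration; the real function uses $HOSTNAME and $PWD and wraps the result in the escape sequence.

```shell
# Hypothetical helper (not part of /etc/bashrc): reproduces the
# space-only escaping of the pre-10.11 update_terminal_cwd.
old_style_url() {
    local host=$1 path=$2
    local SEARCH=' '
    local REPLACE='%20'
    # Only spaces are replaced; every other byte is passed through as-is.
    printf '%s\n' "file://$host${path//$SEARCH/$REPLACE}"
}

old_style_url myhost '/Users/me/Uni/Semester 7/C++/Übung 2'
```

This prints file://myhost/Users/me/Uni/Semester%207/C++/Übung%202: the spaces are escaped, but the “Ü” is sent to Terminal unencoded, which is the case the answer describes as not working.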

On 10.11 and later, the code has been moved to /etc/bashrc_Apple_Terminal and has been updated to percent-encode all characters that require it, so it can now work with characters like “Ü” (your example case works for me on 10.11.1):

update_terminal_cwd() {
    # Identify the directory using a "file:" scheme URL, including
    # the host name to disambiguate local vs. remote paths.

    # Percent-encode the pathname.
    local url_path=''
    {
        # Use LC_CTYPE=C to process text byte-by-byte. Ensure that
        # LC_ALL isn't set, so it doesn't interfere.
        local i ch hexch LC_CTYPE=C LC_ALL=
        for ((i = 0; i < ${#PWD}; ++i)); do
            ch="${PWD:i:1}"
            if [[ "$ch" =~ [/._~A-Za-z0-9-] ]]; then
                url_path+="$ch"
            else
                printf -v hexch "%02X" "'$ch"
                # printf treats values greater than 127 as
                # negative and pads with "FF", so truncate.
                url_path+="%${hexch: -2:2}"
            fi
        done
    }

    printf '\e]7;%s\a' "file://$HOSTNAME$url_path"
}
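As a rough illustration of what the newer loop produces, the following bash sketch applies the same byte-by-byte encoding to a path passed as an argument and prints the result instead of sending it to Terminal. The function name percent_encode_path is hypothetical and not part of /etc/bashrc_Apple_Terminal.

```shell
# Hypothetical helper: the same encoding loop as above, applied to an
# argument instead of $PWD, and printed rather than sent to Terminal.
percent_encode_path() {
    local path=$1 encoded=''
    # LC_CTYPE=C makes ${#path} and ${path:i:1} work on single bytes.
    local i ch hexch LC_CTYPE=C LC_ALL=
    for ((i = 0; i < ${#path}; ++i)); do
        ch="${path:i:1}"
        if [[ "$ch" =~ [/._~A-Za-z0-9-] ]]; then
            encoded+="$ch"
        else
            printf -v hexch "%02X" "'$ch"
            # Keep only the last two hex digits, as in the original.
            encoded+="%${hexch: -2:2}"
        fi
    done
    printf '%s\n' "$encoded"
}

# $'\xc3\x9c' is the UTF-8 byte sequence of a precomposed "Ü".
percent_encode_path $'/Uni/Semester 7/C++/\xc3\x9cbung 2'
```

With that input it should print /Uni/Semester%207/C%2B%2B/%C3%9Cbung%202, so every byte of the “Ü” ends up as a valid URL escape. On a system older than 10.11, one possible workaround is to copy the newer update_terminal_cwd definition into ~/.bash_profile so that it replaces the one from /etc/bashrc. That is an untested assumption, not something the answer confirms.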

[iTerm 2 apparently reads the working directory from the shell process state. This has the advantage that it works without any shell setup. However, it isn't guaranteed to be correct: there's no reason a shell's current working directory has to match, at any given moment, the cwd it uses when executing a command. It also doesn't work through indirect connections like ssh or shells running within editors or screen multiplexers, and it can't read the directory from processes owned by other users. For example, if you use sudo -s to create a root shell, it can't read the working directory from the root shell process. Furthermore, the program state only includes a file descriptor for the open directory, not the path that the shell is using for $PWD, so in some cases you won't get the path you used to navigate to the current directory, e.g., if you traversed through a symbolic link.]