Shell – Why can’t I define a readonly variable named path in zsh


In zsh, path is a special array variable whose contents are tied to the well-known PATH variable.

So special, in fact, that defining and calling the function

f() { local -r path=42 }

causes the error f: read-only variable: path. If the local variable is declared as mutable (i.e. without -r), everything works as expected. I haven't been able to reproduce this error with other variable names.

Why does this error occur and is it intentional? Do similar rules exist for other names?

I'm using zsh 5.2 (x86_64-apple-darwin16.0) on macOS 10.12.6.

Best Answer

TL;DR: don't reuse a "special builtin parameter" such as path, because, uh, they're special. Or, according to the mailing list, one can use the -h flag:

% () { local -hr path=42; echo $path }
42
%

(However, overriding path with a scalar such as 42 might break subsequent code that forgets about the override and assumes path is still the usual array tied to PATH...)
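The "don't reuse the name" advice can be sketched as follows. The function and variable names here (show_config_path, config_path) are hypothetical and chosen only for illustration; the only point is that an ordinary, non-special name accepts -r without complaint.

```shell
# Hypothetical names, used only to illustrate avoiding the special "path".
show_config_path() {
    # config_path is an ordinary parameter, so a read-only local is fine.
    local -r config_path=42
    printf '%s\n' "$config_path"
}

show_config_path
```

This sidesteps the question entirely, at the cost of not being able to call the variable path.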

Longer digging around follows (but I totally missed the -h hide thing...)

% print ${(t)path}
array-special

This is a property (feature? bug?) of special variables, but not of similar variables tied together by the user:

% () { typeset -r PATH=/blah; echo $PATH }
(anon): read-only variable: PATH
% typeset -Tar FOO=bar foo
% print $foo
bar
% print ${(t)foo}
array-readonly-tag_local
% () { local -r foo=blah; echo $foo }
blah

There are various other parameters that exhibit this behavior:

% for p in $parameters[(I)*]; do print $p $parameters[$p]; done | grep array-
cdpath array-special
...
% () { local -r cdpath=42 }
(anon): read-only variable: cdpath

So some variables are, as in Animal Farm, more special than others. This error message is emitted from several places in Src/params.c; if each of those is modified to identify itself and zsh is recompiled, we find:

% () { local -r path }
% () { local -r path=foo }
(anon): read-only variable (setarrvalue): path

That message comes from the rather generic code

/**/
mod_export void
setarrvalue(Value v, char **val)
{
    if (unset(EXECOPT))
        return;
    if (v->pm->node.flags & PM_READONLY) {
        zerr("read-only variable (setarrvalue): %s", v->pm->node.nam);
        freearray(val);
        return;
    }

This shows that the problem happens elsewhere; non-special variables doubtless do not have PM_READONLY set while the special variables that fail do. The next obvious place to look is the code for local which goes by a variety of names (typeset export ...). These are all builtins so can be found lurking in the depths of Src/builtin.c

% grep BUILTIN Src/builtin.c | grep local
    BUILTIN("local", BINF_PLUSOPTS | BINF_MAGICEQUALS | BINF_PSPECIAL | BINF_ASSIGN, (HandlerFunc)bin_typeset, 0, -1, 0, "AE:%F:%HL:%R:%TUZ:%ahi:%lp:%rtux", NULL),

These all call bin_typeset with various flags set so let's study the source for that function...swearing in the comments, check. Notes that things are complicated, check. Nothing really jumps out, though the rabbit hole (for when the "treat arguments as patterns" -m option is not set, which is the case here) appears to lead to the typeset_single function...

There is some code for POSIXBUILTINS related to readonly, but that's turned off in my test shells:

% print $options[POSIXBUILTINS]
off

so I'm going to ignore that code (I hope. Could this be a shoggoth lair and no mere rabbit hole?). Meanwhile! Some debugging points to the PM_READONLY flag being toggled on for path by the following line

    /*
     * The remaining on/off flags should be harmless to use,
     * because we've checked for unpleasant surprises above.
     */
    pm->node.flags = (PM_TYPE(pm->node.flags) | on | PM_SPECIAL) & ~off;

This in turn comes from the on variable, which already has the flag set when the typeset_single function is entered, sigh, so back to bin_typeset we go... Okay, basically there's a TYPESET_OPTSTR that somehow, via some macros, enables PM_READONLY by default; when a user-supplied variable runs through this code path instead, PM_READONLY gets turned off and all is well.

Whether this can be changed so that special variables such as path can be made readonly is a question for a zsh developer (try the zsh-workers mailing list?); in the meantime, don't mess around with the special variables.

Related Solutions

Bash – Serialize shell variable in bash or zsh

Simple inline implementation for serializing one or more variables

Yes, in both bash and zsh you can serialize the contents of a variable in a way that is easy to retrieve using the typeset builtin and the -p argument. The output format is such that you can simply source the output to get your stuff back.

 # You have variable(s) $FOO and $BAR already with your stuff
 typeset -p FOO BAR > ./serialized_data.sh

You can get your stuff back like this either later in your script or in another script altogether:

# Load up the serialized data back into the current shell
source serialized_data.sh

This works in bash, zsh and ksh, including passing data between different shells. Bash translates this to its builtin declare, while zsh implements it with typeset; since bash also accepts typeset as a synonym, we use typeset here for ksh compatibility.
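As a self-contained round trip, the idea can be sketched like this. The temporary file from mktemp and the sample values are assumptions made for the example; it is meant to run at the top level of a bash or zsh script, not inside a function.

```shell
# Sketch only: state_file and the sample values are made up for illustration.
state_file=$(mktemp)

FOO='some value with spaces'
BAR=$(printf 'two\nlines')

# Write shell code that recreates both variables.
typeset -p FOO BAR > "$state_file"

# Forget the variables, then load them back from the file.
unset FOO BAR
. "$state_file"

printf 'FOO=%s\n' "$FOO"
printf 'BAR=%s\n' "$BAR"
rm -f "$state_file"
```

Because the file contains ordinary shell code, it should only be sourced when its contents are trusted.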

More complex generalized implementation using functions

The above implementation is really simple, but if you call it frequently you might want to give yourself a utility function to make it easier. Additionally if you ever try to include the above inside custom functions you will run into issues with variable scoping. This version should eliminate those issues.

Note that for all of these, in order to maintain bash/zsh cross-compatibility, we handle both typeset and declare so the code should work in either or both shells. This adds some bulk and mess that could be eliminated if you were only doing this for one shell or the other.

The main problem with using functions for this (or including the code in other functions) is that the typeset function generates code that, when sourced back into a script from inside a function, defaults to creating a local variable rather than a global one.
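The scoping problem can be reproduced with a small sketch. The names scope_file, demo_value and load_inside_function are hypothetical; the script is assumed to run in bash or zsh.

```shell
# Sketch only: all names here are made up for the demonstration.
scope_file=$(mktemp)

demo_value='saved'
typeset -p demo_value > "$scope_file"
unset demo_value

load_inside_function() {
    # The sourced typeset/declare statement runs in function scope,
    # so the variable it creates is local to this function.
    . "$scope_file"
    printf 'inside:  %s\n' "${demo_value-<unset>}"
}

load_inside_function
# Back at the top level the variable is gone again.
printf 'outside: %s\n' "${demo_value-<unset>}"
rm -f "$scope_file"
```

The value is visible inside the function but not afterwards, which is why the approaches that follow force the definition to be global.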

This can be fixed with one of several hacks. My initial attempt to fix this was to pipe the output of the serialize process through sed to add the -g flag, so that the generated code defines a global variable when sourced back in.

serialize() {
    typeset -p "$1" | sed -E '0,/^(typeset|declare)/{s/ / -g /}' > "./serialized_$1.sh"
}
deserialize() {
    source "./serialized_$1.sh"
}

Note that the funky sed expression is to only match the first occurrence of either 'typeset' or 'declare' and add -g as a first argument. It is necessary to only match the first occurrence because, as Stéphane Chazelas rightly pointed out in comments, otherwise it will also match cases where the serialized string contains literal newlines followed by the word declare or typeset.
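The 0,/regex/ address form is a GNU sed extension. Where GNU sed is not available, a variant that restricts the substitution to line 1 might do, on the assumption that typeset -p is called for a single variable so its output starts with typeset or declare on the first line. The helper names and the explicit file argument below are hypothetical, and the sketch assumes a shell whose typeset/declare accepts -g (bash 4.2+ or zsh).

```shell
# Sketch only: helper names and the file argument are made up for illustration.
serialize_line1() {
    # $1: variable name, $2: output file.
    # Only line 1 is edited, so later lines of a multi-line value are untouched.
    typeset -p "$1" | sed '1s/ / -g /' > "$2"
}

deserialize_file() {
    # Sourced inside a function; -g makes the definition global anyway.
    . "$1"
}

# A value containing a newline followed by the word "declare".
greeting='hello
declare world'

tmpfile=$(mktemp)
serialize_line1 greeting "$tmpfile"
unset greeting
deserialize_file "$tmpfile"
printf '%s\n' "$greeting"
rm -f "$tmpfile"
```

This is no less brittle than the original expression in other respects; it only avoids the GNU-specific address.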

In addition to correcting my initial parsing faux pas, Stéphane also suggested a less brittle way to hack this: use a wrapper function to redefine the actions taken when sourcing the data back in. This not only sidesteps the issues with parsing the strings but could also be a useful hook for adding functionality. It assumes you are not playing any other games with the declare or typeset commands, but the technique would be easier to implement in a situation where you were including this functionality as part of another function of your own, or where you were not in control of the data being written and whether or not it had the -g flag added. Something similar could also be done with aliases; see Gilles's answer for an implementation.

To make the result even more useful, we can iterate over multiple variables passed to our functions by assuming that each word in the argument array is a variable name. The result becomes something like this:

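serialize() {
    # Quote "$@" so each argument is taken as exactly one variable name
    for var in "$@"; do
        typeset -p "$var" > "./serialized_$var.sh"
    done
}

deserialize() {
    # Temporarily wrap declare/typeset so sourced definitions become global
    declare() { builtin declare -g "$@"; }
    typeset() { builtin typeset -g "$@"; }
    for var in "$@"; do
        source "./serialized_$var.sh"
    done
    unset -f declare typeset
}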

With either solution, usage would look like this:

# Load some test data into variables
FOO=(an array or something)
BAR=$(uptime)

# Save it out to our serialized data files
serialize FOO BAR

# For testing purposes unset the variables so we know if it worked
unset FOO BAR

# Load the data back in from our data files
deserialize FOO BAR

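# printf handles \n consistently; ${FOO[*]} prints the whole array in bash and zsh
printf 'FOO: %s\nBAR: %s\n' "${FOO[*]}" "$BAR"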

How to increment a dynamically named variable in `zsh`

$ name=hello
$ hello=42
$ (($name++))
$ echo $hello
43

Just like in any Korn-like shell. Or POSIXly:

$ name=hello
$ hello=42
$ : "$(($name += 1))"
$ echo "$hello"
43

The point is that all parameter expansions, command substitutions and arithmetic expansions inside an arithmetic expression are performed before the expression itself is evaluated.

((something))

is similar to

let "something"

So in (($name++)) (like let "$name++"), that's first expanded to hello++ and that's evaluated as the ++ operator applied to the hello variable.
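Because the name is expanded into the expression before evaluation, it can be worth validating it first. The following is a minimal sketch of a helper built on the POSIX form; increment_var is a hypothetical name, and the optional step argument is assumed to come from a trusted caller.

```shell
# increment_var NAME [STEP]: hypothetical helper, not part of any shell.
increment_var() {
    # Reject anything that is not a plain identifier, since the name is
    # expanded into the arithmetic expression before it is evaluated.
    case $1 in
        '' | [!A-Za-z_]* | *[!A-Za-z0-9_]*) return 1 ;;
    esac
    # STEP defaults to 1 and is not validated here.
    : "$(($1 += ${2:-1}))"
}

counter=42
increment_var counter      # adds 1
increment_var counter 5    # adds 5
echo "$counter"
```

The identifier check is the design choice that matters here: without it, whatever text is stored in the name would be evaluated as part of the expression.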

POSIX sh has no ((...)) operator but it has the $((...)) arithmetic expansion operator. It doesn't have ++ (though it allows implementations to have one as an extension instead of requiring it to be a combination of unary and/or binary + operators), but it has +=.

By using : "$((...))" where : is the null command, we get something similar to ksh's ((...)). Though a strict equivalent would be [ "$((...))" -ne 0 ], as ((expression)) returns false when the expression resolves to 0.
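The difference in exit status between the two forms can be seen in a short sketch; the variable names are made up for the example.

```shell
n=0

# Null command form: the expression is evaluated, but ":" always succeeds.
if : "$((n += 0))"; then null_result=success; else null_result=failure; fi

# Test form: succeeds only when the expression is non-zero, as ((...)) does.
if [ "$((n += 0))" -ne 0 ]; then test_result=success; else test_result=failure; fi

printf 'null command: %s\n' "$null_result"
printf 'test form:    %s\n' "$test_result"
```

With n equal to 0, the null command form reports success while the test form reports failure, matching what ((n += 0)) would return in a Korn-like shell.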
