Bash vs zsh: scoping and `typeset -g`

From https://unix.stackexchange.com/a/381782/674

For instance:

integer() { typeset -gi "$1"; }

This way of making a variable an integer works in mksh/yash/zsh. It works
in bash only on variables that have not been declared local by
the caller:

$ bash -c 'f() { declare a; integer a; a=1+1; echo "$a"; }; integer() { typeset -gi "$1"; }; f'
1+1
$ bash -c 'f() { integer a; a=1+1; echo "$a"; }; integer() { typeset -gi "$1"; }; f'
2

Note that export var is neither typeset -x var nor typeset -gx var. It adds
the export attribute without declaring a new variable if the variable
already existed. Same for readonly vs typeset -r.
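To make the export vs typeset -x distinction concrete, here is a minimal sketch for bash. The names use_typeset and use_export are made up for illustration; other shells may differ in the details.

```bash
#!/usr/bin/env bash
# Sketch (bash): typeset -x inside a function declares a new local variable,
# whereas export only adds the export attribute to the variable that exists.
var=original

use_typeset() { typeset -x var=set_by_typeset; }  # new local var, gone on return
use_export()  { export var=set_by_export; }       # reuses the existing var

use_typeset
echo "after use_typeset: $var"   # after use_typeset: original
use_export
echo "after use_export: $var"    # after use_export: set_by_export
```
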

For bash,

  • inside f of the first example, what does integer a do: declare a different a from the a declared inside f, or make the a declared inside f have the global scope? Why does it output 1+1?

  • inside f of the second example, does integer a declare a with the global scope? Why does it output 2?

For zsh, the same questions, plus: why does the first example output 2 instead of 1+1 as in bash?

Am I correct that

  • both bash and zsh use dynamic scoping, at least in the examples?
  • the option -g of typeset in both bash and zsh means to declare a nonexisting variable with the global scope, or to change an existing variable to have the global scope?

Thanks.

Best Answer

  • In a programming language with static scoping, which is what most programming languages (like C) use,

    there is a global scope and a local scope for each function. Variables appearing in a function are either private to the function or global.

    A function can only access either its local variables or the global variables. It cannot access the variables of another function (even its caller) other than via passing by reference.

  • In a programming language with dynamic scoping,

    a function sees the variables of its caller, and there's a scope for each function in the call tree. Scopes nest like Russian dolls, with variables stacked one on top of the other.

    In that stack of scopes, the global scope is only special in that it is the bottom-most one, but functions don't necessarily see variables in that scope if they have been masked by a local variable of any of the functions in their call tree. So there isn't just one global and one local scope.
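The stacking described above can be sketched in bash as follows. All function names (show, set_it, middle_masking, middle_plain, outer) are hypothetical and only serve the illustration.

```bash
#!/usr/bin/env bash
# Sketch (bash): variable lookup and assignment both resolve to the nearest
# declaration found while walking up the call stack.
var=global

show()   { echo "show sees: $var"; }
set_it() { var=changed; }              # assigns to the nearest var in the stack

middle_masking() { local var=middle; show; }  # masks outer's var
middle_plain()   { show; }                    # declares nothing

outer() {
  local var=outer
  middle_masking               # show sees: middle
  middle_plain                 # show sees: outer
  set_it                       # modifies outer's local, not the global
  echo "outer now has: $var"   # outer now has: changed
}

outer
show                           # show sees: global
```

Note that set_it never touches the variable in the bottom-most scope here, because outer's local declaration sits between it and the global one.
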


It helps to know the history here.

1. ksh93

  1. In ksh93, a function, at least one declared with the ksh syntax (function f {...}), follows static scoping.

    Variables declared with typeset in a function are local variables of the function.

    a=global_a
    function f {
      typeset a=f_a
      g
    }
    
    function g {
      echo "$a"
    }
    
    f
    

    would output global_a.

    typeset -i var in a function changes the type of the local var variable, instantiating it in the function scope if it wasn't already there.

  2. In ksh93, a function declared with the Bourne syntax (f() {...}) doesn't do scoping at all. In that regard, the code of the function behaves as if it were embedded in the caller of the function, so any variable appearing in it will have the same scope as in the caller: either global, or local to the top-most function declared with the ksh syntax in its call tree. typeset would declare the variable in that top-most function (or in the global scope if there was no ksh-syntax function in its call tree).

    For example, since in ksh-syntax functions, all variables are either private or global, to implement our integer as in bash, we'd need to do it either like:

    integer() { typeset -i "$1"; }
    

    That is using the Bourne function syntax that doesn't do scoping at all.

    Or using the ksh syntax:

    function integer { typeset -i "$1"; }
    

    but to be invoked as:

    . integer var
    

    (that is, by using ., the code of integer is interpreted in the context of the caller like when you call . (source) on a script).

    Or using the ksh syntax:

    function integer { typeset -ni "$1"; }
    

    Where the variable is passed as a reference with -n like you would do in C or most other programming languages.

2. All the other Bourne-like shells

All the other Bourne-like shells implement dynamic scoping. That includes ksh88: ksh93 was a complete rewrite, and the change to static scoping was (or at least was perceived to be) a prerequisite for the functionality ever to be included in the POSIX standard.

  1. A variable declared with typeset without -g in a function has the local scope of the function.

    For example, typeset -i var would declare the variable local (in the current function scope) and set the integer attribute.

    For example, given the f and g code at the top (in the ksh93 section), these shells would all output f_a. That is, g does see the local variable of f.

    For another example, suppose f calls g, which calls h. If h does not declare the var variable local to its scope, it will see g's variable, or f's if g hasn't declared var local, or otherwise the variable from the bottom-most scope.

  2. In all of bash, zsh, yash, mksh, one can change the type or value of a variable in a function without making it local to that function using typeset -g, as with the integer function in the example you quote. But it's done differently depending on the shell.

    • In mksh, yash, zsh, typeset -g doesn't affect the variable at the bottom-most (global) scope, but the variable in the scope where it is currently defined.

      For example, while typeset -i var would declare the variable local (in the current function scope) and set the integer attribute, typeset -gi var just does the latter part (without affecting the scope of var).

      For example, when a function calls my integer function above to add the integer attribute to a variable, as in:

      f() {
       local myvar
       integer myvar
       ...
      }
      

      That function wants integer to change the attributes of its own myvar variable, not the one in the global scope, which it has no knowledge of and may not be able to access.

    • In bash, typeset -g affects the instance of the variable in the global (bottom-most) scope. While that's what the g stands for, it's not useful in shells with dynamic scoping like bash.

      For example, in the first example in your question, 1+1 being output shows that the integer attribute has not been added to the variable. It has been added to a variable a in the global scope, but not the one that the f function has access to.
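Here is a small sketch of that bash behaviour, assuming bash 4.2 or later (where typeset -g is available). It reuses the integer and f functions from the question; the echo labels are only there for illustration.

```bash
#!/usr/bin/env bash
# Sketch (bash 4.2+): typeset -g acts on the bottom-most (global) scope,
# so the caller's local variable never gets the integer attribute.
integer() { typeset -gi "$1"; }

f() {
  local a
  integer a            # adds -i to the global a, not to f's local a
  a=1+1
  echo "inside f: $a"  # inside f: 1+1
}

f
a=1+1                  # the global a did get the integer attribute
echo "global a: $a"    # global a: 2
```

Going by the description above, mksh, yash and zsh would evaluate the assignment inside f arithmetically instead, since their typeset -g acts on the variable in the scope where it is currently defined.
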
