Bash – Serialize shell variable in bash or zsh

bash, shell, variable, zsh

Is there any way to serialize a shell variable? Suppose I have a variable $VAR, and I want to be able to save it to a file or whatever, and then read it back later to get the same value back?

Is there a portable way of doing this? (I don't think so)

Is there a way to do it in bash or zsh?

Best Answer

Warning: With any of these solutions, the data files are executed as shell code by your script, so you are trusting their integrity. Securing them is paramount to your script's security!

Simple inline implementation for serializing one or more variables

Yes, in both bash and zsh you can serialize the contents of a variable in a way that is easy to retrieve using the typeset builtin and the -p argument. The output format is such that you can simply source the output to get your stuff back.

 # You have variable(s) $FOO and $BAR already with your stuff
 typeset -p FOO BAR > ./serialized_data.sh

You can get your stuff back like this either later in your script or in another script altogether:

# Load up the serialized data back into the current shell
source serialized_data.sh

This will work for bash, zsh and ksh, including passing data between different shells. In bash, typeset is a synonym for the declare builtin, while zsh and ksh implement typeset natively, so we use typeset here for ksh compatibility.
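As a minimal sketch of the round trip in bash, the following saves a string containing quotes and a dollar sign, plus an array with an empty element, and loads both back. The names GREETING, FRUITS and state_file are illustrative assumptions, not part of the original answer.

```bash
# Hypothetical names: GREETING, FRUITS and state_file are illustrative only.
state_file=$(mktemp)

GREETING='hello "world" $HOME'
FRUITS=(apple "blood orange" '')

# Write declarations that recreate both variables when sourced.
typeset -p GREETING FRUITS > "$state_file"

# Simulate a later run: forget the values, then load them back.
unset GREETING FRUITS
source "$state_file"
rm -f "$state_file"

printf 'GREETING=%s\n' "$GREETING"
printf 'FRUITS has %d elements\n' "${#FRUITS[@]}"
```

Because the shell itself writes the quoting, special characters and empty array elements should survive without any manual escaping.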

More complex generalized implementation using functions

The above implementation is really simple, but if you call it frequently you might want to give yourself a utility function to make it easier. Additionally if you ever try to include the above inside custom functions you will run into issues with variable scoping. This version should eliminate those issues.

Note for all of these, in order to maintain bash/zsh cross-compatibility we will be fixing both the cases of typeset and declare so the code should work in either or both shells. This adds some bulk and mess that could be eliminated if you were only doing this for one shell or another.

The main problem with using functions for this (or including the code in other functions) is that the typeset function generates code that, when sourced back into a script from inside a function, defaults to creating a local variable rather than a global one.
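The scoping problem can be reproduced in bash with a few lines. This is only a sketch: VALUE, state_file, seen_inside and seen_outside are hypothetical names chosen for the demonstration.

```bash
# Hypothetical names throughout; run with bash.
state_file=$(mktemp)

VALUE='hello world'
typeset -p VALUE > "$state_file"   # writes: declare -- VALUE="hello world"
unset VALUE

load_in_function() {
    # declare/typeset executed inside a function creates a LOCAL variable.
    source "$state_file"
    seen_inside=${VALUE-unset}
}

load_in_function
seen_outside=${VALUE-unset}
rm -f "$state_file"

printf 'inside the function:  %s\n' "$seen_inside"
printf 'outside the function: %s\n' "$seen_outside"
```

The value is visible while the function runs but is gone once it returns, which is why the fixes below add -g.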

This can be fixed with one of several hacks. My initial attempt to fix this was to pass the output of the serialize process through sed to add the -g flag, so that the generated code defines a global variable when sourced back in.

serialize() {
    typeset -p "$1" | sed -E '0,/^(typeset|declare)/{s/ / -g /}' > "./serialized_$1.sh"
}
deserialize() {
    source "./serialized_$1.sh"
}

Note that the funky sed expression is to only match the first occurrence of either 'typeset' or 'declare' and add -g as a first argument. It is necessary to only match the first occurrence because, as Stéphane Chazelas rightly pointed out in comments, otherwise it will also match cases where the serialized string contains literal newlines followed by the word declare or typeset.
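To see why only the first occurrence may be touched, here is a small bash sketch with a value whose second line starts with the word declare. It assumes GNU sed (the 0,/regex/ address form is a GNU extension) and a bash version that supports declare -g; the variable and function names are made up for the example.

```bash
# Assumes GNU sed and bash 4.2+; VALUE, serialized and restore are hypothetical names.
VALUE=$'first line\ndeclare looks like a keyword here'

# Add -g only to the first declare/typeset that starts a line.
serialized=$(typeset -p VALUE | sed -E '0,/^(typeset|declare)/{s/ / -g /}')
printf '%s\n' "$serialized"

unset VALUE
restore() { eval "$serialized"; }   # -g makes the variable global despite the function
restore

printf 'restored: %s\n' "$VALUE"
```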

In addition to correcting my initial parsing faux pas, Stéphane also suggested a less brittle hack: use wrapper functions to redefine what declare and typeset do while the data is sourced back in. This not only sidesteps the problems with parsing the strings but could also serve as a hook for adding functionality. It assumes you are not playing any other games with the declare or typeset commands, but it is easier to apply when you include this functionality in another function of your own, or when you do not control the data being written and whether it had the -g flag added. Something similar could also be done with aliases; see Gilles's answer for an implementation.

To make the result even more useful, we can iterate over multiple variables passed to our functions by assuming that each word in the argument array is a variable name. The result becomes something like this:

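serialize() {
    # Quote "$@" so each argument is taken as exactly one variable name
    for var in "$@"; do
        typeset -p "$var" > "./serialized_$var.sh"
    done
}

deserialize() {
    # Temporarily wrap both builtins so sourced declarations become global
    declare() { builtin declare -g "$@"; }
    typeset() { builtin typeset -g "$@"; }
    for var in "$@"; do
        source "./serialized_$var.sh"
    done
    # Remove the wrappers again
    unset -f declare typeset
}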

With either solution, usage would look like this:

# Load some test data into variables
FOO=(an array or something)
BAR=$(uptime)

# Save it out to our serialized data files
serialize FOO BAR

# For testing purposes, unset the variables so we know if it worked
unset FOO BAR

# Load the data back in from our data files
deserialize FOO BAR

# printf is used because bash's echo does not interpret \n by default
printf 'FOO: %s\nBAR: %s\n' "${FOO[*]}" "$BAR"

Output :

foobar_1
foobar_2
foobar_3

'dereference' :

$ for v in "${!foobar_@}"; do echo "${!v}"; done

Output² :

x
y
z
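The loop above relies on two bash expansions: "${!prefix@}" lists the names of variables starting with a prefix, and "${!name}" dereferences a name held in another variable. A minimal sketch, assuming three variables foobar_1, foobar_2 and foobar_3 holding x, y and z (their definitions are not shown in the text, so these assignments are a guess):

```bash
# Assumed definitions; the original text does not show them.
foobar_1=x
foobar_2=y
foobar_3=z

# Collect the names of all variables whose names start with "foobar_".
names=("${!foobar_@}")

for name in "${names[@]}"; do
    # ${!name} is indirect expansion: the value of the variable named by $name.
    printf '%s=%s\n' "$name" "${!name}"
done
```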

bash,shell,zsh,string – How to Split a String by ‘:’ Character in Bash/Zsh

In zsh, you can use:

parts=(${(s/:/)str})

The / delimiters around the : separator can be any arbitrary character. Some common character pairs are also supported, like:

parts=(${(s[:])str})

If you're going to use the @ flag to preserve empty elements, then you need to quote:

parts=("${(@s[:])str}")

Otherwise @ makes no difference.

If it's to process variables like $PATH/$LD_LIBRARY_PATH... see also typeset -T which ties an array variable to a scalar variable:

$ typeset -T str str_array
$ str='a::b'
$ typeset -p str
typeset -T str str_array=( a '' b )

zsh does tie $path to $PATH by default (like in csh/tcsh).

bash's

parts=(${str//:/ })

is wrong, as it applies split+glob after having replaced : with SPC.

You'd want:

IFS=:            # split on : instead of default SPC TAB NL
set -o noglob    # disable glob
parts=( $str"" ) # split+glob (leave expansion unquoted), preserve trailing
                 # empty part.
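Since changing IFS and noglob globally affects the rest of the script, one option is to confine the change to a function. The following bash sketch does that; split_on_colon is a hypothetical helper name, and it assumes globbing was enabled before the call.

```bash
# Hypothetical helper; stores its result in the global array "parts".
split_on_colon() {
    local IFS=:          # limit the IFS change to this function
    set -o noglob        # keep * and ? in the data from being expanded
    parts=( $1"" )       # unquoted on purpose: this is where splitting happens
    set +o noglob        # assumes globbing was on before the call
}

split_on_colon 'a::b:*:'

printf 'number of parts: %d\n' "${#parts[@]}"
printf '<%s>\n' "${parts[@]}"
```

The input contains an empty middle field, a literal *, and a trailing colon, which are the cases the simpler ${str//:/ } approach gets wrong.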

That code would also work in zsh, if it was in sh or ksh emulation mode. If your goal is to write code compatible with both bash and zsh, you may want to write it using ksh syntax and make sure that zsh is put in ksh emulation (possibly only locally to some function) when interpreting it.

To test whether the shell is bash or zsh, you'd test for the presence of the $BASH_VERSION/$BASH_VERSINFO or $ZSH_VERSION variables.
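A minimal sketch of such a test follows. The shell_name variable is a hypothetical name, and the ${VAR-} form is used so the check does not fail when the variable is unset under set -u.

```bash
# shell_name is a hypothetical variable used only for this illustration.
if [ -n "${ZSH_VERSION-}" ]; then
    shell_name=zsh
elif [ -n "${BASH_VERSION-}" ]; then
    shell_name=bash
else
    shell_name=unknown
fi

printf 'running under: %s\n' "$shell_name"
```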

split() { # args: string delimiter result_var
  if
    [ -n "$ZSH_VERSION" ] &&
      autoload is-at-least &&
      is-at-least 5.0.8 # for ps:$var:
  then
    eval $3'=("${(@ps:$2:)1}")'
  elif
    [ "$BASH_VERSINFO" -gt 4 ] || {
      [ "$BASH_VERSINFO" -eq 4 ] && [ "${BASH_VERSINFO[1]}" -ge 4 ]
      # 4.4+ required for "local -"
    }
  then
    local - IFS="$2"
    set -o noglob
    eval "$3"'=( $1"" )'
  else
    echo >&2 "Your shell is not supported"
    exit 1
  fi
}

split "$str" : parts