Bash – Serialize shell variable in bash or zsh

bash shell variable zsh

Is there any way to serialize a shell variable? Suppose I have a variable $VAR, and I want to be able to save it to a file or whatever, and then read it back later to get the same value back?

Is there a portable way of doing this? (I don't think so)

Is there a way to do it in bash or zsh?

Best Answer

Warning: All of these solutions execute the data files as shell code when they are read back in, so your script is only as trustworthy as those files. Protecting them from tampering is paramount to your script's security!
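As a rough illustration of that warning, here is a minimal sketch that writes the data file with owner-only permissions and refuses to source it unless the current user owns it. The names data_dir and data_file are made up for this example, mktemp -d is assumed to be available, and an ownership check is only one precaution, not a complete defense.

```shell
#!/usr/bin/env bash
# Hypothetical locations, chosen only for this sketch.
data_dir=$(mktemp -d)
data_file="$data_dir/serialized_data.sh"

FOO="some value"

# Write inside a subshell with a restrictive umask so the new file
# is readable and writable by its owner only.
( umask 077; typeset -p FOO > "$data_file" )
unset FOO

# -O is true when the file exists and is owned by the effective user ID.
if [ -O "$data_file" ]; then
    source "$data_file"
else
    echo "Refusing to source $data_file: not owned by the current user" >&2
fi

echo "FOO: $FOO"
```

The umask only affects files created by that redirection; a file that already exists keeps whatever permissions it had.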

Simple inline implementation for serializing one or more variables

Yes, in both bash and zsh you can serialize the contents of a variable in a way that is easy to retrieve using the typeset builtin and the -p argument. The output format is such that you can simply source the output to get your stuff back.

# You have variable(s) $FOO and $BAR already with your stuff
typeset -p FOO BAR > ./serialized_data.sh

You can get your stuff back like this either later in your script or in another script altogether:

# Load up the serialized data back into the current shell
source serialized_data.sh

This will work for bash, zsh and ksh, including passing data between different shells. In bash, typeset is simply another name for the declare builtin, while zsh and ksh implement typeset natively. Since bash accepts either name, we use typeset here for ksh compatibility.

More complex generalized implementation using functions

The above implementation is really simple, but if you call it frequently you might want to give yourself a utility function to make it easier. Additionally if you ever try to include the above inside custom functions you will run into issues with variable scoping. This version should eliminate those issues.

Note that in all of these, to maintain bash/zsh cross-compatibility, we handle both the typeset and declare spellings so the code should work in either shell. This adds some bulk and mess that could be eliminated if you were only targeting one shell.

The main problem with using functions for this (or including the code in other functions) is that the typeset builtin generates code that, when sourced back into a script from inside a function, defaults to creating a local variable rather than a global one.
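The scoping problem is easy to reproduce. In this sketch (bash assumed; load_in_function and data_file are names made up for the demonstration), the variable comes back inside the function but is gone again once the function returns.

```shell
#!/usr/bin/env bash
# Hypothetical temp file for this sketch.
data_file=$(mktemp)

FOO="global value"
typeset -p FOO > "$data_file"
unset FOO

load_in_function() {
    # The sourced declare/typeset line runs in function scope,
    # so it creates a variable local to this function.
    source "$data_file"
    echo "inside:  ${FOO-unset}"
}

load_in_function
# Back at the top level, the local variable no longer exists.
echo "outside: ${FOO-unset}"
```

In bash this prints "inside:  global value" followed by "outside: unset".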

This can be fixed with one of several hacks. My initial attempt to fix this was to pipe the output of the serialize process through sed to add the -g flag, so the generated code defines a global variable when sourced back in.

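serialize() {
    # Insert -g after the first "typeset" or "declare" only.
    # The 0,/regex/ address form is a GNU sed extension.
    typeset -p "$1" | sed -E '0,/^(typeset|declare)/{s/ / -g /}' > "./serialized_$1.sh"
}
deserialize() {
    source "./serialized_$1.sh"
}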

Note that the funky sed expression matches only the first occurrence of either 'typeset' or 'declare' and adds -g as its first argument. Matching only the first occurrence is necessary because, as Stéphane Chazelas rightly pointed out in comments, the expression would otherwise also match cases where the serialized string contains a literal newline followed by the word declare or typeset.
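To make the difference concrete, this sketch feeds a hand-written two-line sample (standing in for typeset -p output, not captured from a real shell) through both a naive expression and the first-match-only one. It assumes GNU sed, since the 0,/regex/ address is a GNU extension; the variable names are invented for the example.

```shell
#!/usr/bin/env bash
# Hand-written sample: a value whose second line starts with "declare".
sample=$'declare -- FOO="first line\ndeclare -- not a real declaration"'

# Naive version: rewrites every line that starts with declare/typeset.
naive=$(printf '%s\n' "$sample" | sed -E '/^(typeset|declare)/s/ / -g /')

# First-match-only version (GNU sed): stops after the first matching line.
first_only=$(printf '%s\n' "$sample" | sed -E '0,/^(typeset|declare)/{s/ / -g /}')

printf 'naive:\n%s\n\n' "$naive"
printf 'first match only:\n%s\n' "$first_only"
```

The naive form inserts -g into the second line as well, which silently changes the stored value; the first-match-only form leaves that line alone.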

In addition to correcting my initial parsing faux pas, Stéphane also suggested a less brittle hack: use wrapper functions to redefine what declare and typeset do while the data is being sourced back in. This not only sidesteps the problems with parsing the strings, it is also a useful hook for adding functionality. It assumes you are not playing any other games with the declare or typeset commands, but it is easier to use when this code is part of another function of your own, or when you do not control the data being written and cannot know whether the -g flag was added. Something similar could also be done with aliases; see Gilles's answer for an implementation.

To make the result even more useful, we can iterate over multiple variables passed to our functions by assuming that each word in the argument array is a variable name. The result becomes something like this:

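serialize() {
    # Quote "$@" so each argument is taken as exactly one variable name
    for var in "$@"; do
        typeset -p "$var" > "./serialized_$var.sh"
    done
}

deserialize() {
    # Temporarily wrap declare/typeset so sourced definitions become global
    declare() { builtin declare -g "$@"; }
    typeset() { builtin typeset -g "$@"; }
    for var in "$@"; do
        source "./serialized_$var.sh"
    done
    # Remove the wrappers again
    unset -f declare typeset
}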
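As a quick check that the wrapper approach really yields a global variable, this sketch calls deserialize from inside another function. It assumes bash 4.2 or later (for declare -g), uses a scalar value only, and works in a scratch directory from mktemp -d; restore_settings, GREETING and work_dir are names made up for the example.

```shell
#!/usr/bin/env bash
# Work in a hypothetical scratch directory so the ./serialized_* files
# do not clutter the current one.
work_dir=$(mktemp -d) && cd "$work_dir" || exit 1

serialize() {
    for var in "$@"; do
        typeset -p "$var" > "./serialized_$var.sh"
    done
}

deserialize() {
    # Wrappers force -g while the data files are being sourced.
    declare() { builtin declare -g "$@"; }
    typeset() { builtin typeset -g "$@"; }
    for var in "$@"; do
        source "./serialized_$var.sh"
    done
    unset -f declare typeset
}

GREETING="hello from the global scope"
serialize GREETING
unset GREETING

# Hypothetical caller: deserialize now runs two function levels deep.
restore_settings() {
    deserialize GREETING
}

restore_settings
echo "after restore_settings: ${GREETING-unset}"
```

Without the wrappers, GREETING would have been local to deserialize and the final line would report it as unset.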

With either solution, usage would look like this:

# Load some test data into variables
FOO=(an array or something)
BAR=$(uptime)

# Save it out to our serialized data files
serialize FOO BAR

# For testing purposes unset the variables so we know if it worked
unset FOO BAR

# Load the data back in from our data files
deserialize FOO BAR

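# printf handles \n the same way in bash and zsh; ${FOO[*]} prints every array element
printf 'FOO: %s\nBAR: %s\n' "${FOO[*]}" "$BAR"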