Bash – Why won’t this bash command run when called by python

bash python quoting shell

The command

$ find ~/foo/ -type f -iname "*.txt" -print0 | parallel -0 cat

uses GNU Parallel to print all the .txt files under ~/foo/.

I have a Python script where I want to call this bash command:

import subprocess, sys

def runBashCommand(my_command):
    process = subprocess.Popen(my_command.split(), stdout=subprocess.PIPE)
    output  = process.communicate()[0]
    return None

def makeCommand(my_path):
    return "find {} -type f -iname \"*.txt\" -print0 | parallel -0 cat".format(my_path)

Issuing

>>> makeCommand('~/foo/')

returns

'find ~/foo/ -type f -iname "*.txt" -print0 | parallel -0 cat'

but issuing

>>> runBashCommand(makeCommand('~/foo/'))

yields the error

find: paths must precede expression: |
Usage: find [-H] [-L] [-P] [-Olevel] [-D help|tree|search|stat|rates|opt|exec] [path...] [expression]

What's the problem with my script?

Best Answer

You're not actually running a bash command. What you're doing is running an executable directly and passing it arguments.

Try the following script to see what is happening:

import subprocess
# "|" is just another argument here; text=True (Python 3.7+) decodes the output to str
p = subprocess.Popen(["echo", "a", "b", "|", "rev"], stdout=subprocess.PIPE, text=True)
print(p.communicate())

The output will be:

('a b | rev\n', None)

No pipe is set up: the "|" is passed to the program as a literal argument. In other words, it is as if you had typed find ... \| parallel .... find then sees "|" as one more argument after the expression, hence the error.
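To see this for the command in the question, here is a small sketch that only splits the string the same way runBashCommand does and prints the resulting argument list. Nothing is executed; the variable names are just for illustration.

```python
# The command string from the question, split the same way runBashCommand does.
command = 'find ~/foo/ -type f -iname "*.txt" -print0 | parallel -0 cat'
args = command.split()

# Each whitespace-separated word becomes one argument to find. Nothing
# interprets the pipe, the double quotes, or the tilde.
for position, arg in enumerate(args):
    print(position, repr(arg))
```

The "|" shows up as an ordinary list element, and so do "parallel", "-0" and "cat". The double quotes around *.txt and the leading ~ are also passed through untouched, since no shell ever looks at them.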

There are two ways to fix it.

  • The easy way: pass shell=True to subprocess.Popen. That will run it through the shell, with all that entails. If you do that, you also need to pass the command as a single string instead of a list:

    import subprocess
    p = subprocess.Popen("echo a b | rev", stdout=subprocess.PIPE, shell=True, text=True)
    print(p.communicate())
    
    # Result: ('b a\n', None)
    

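    If you do this, be very careful about substituting values into your string: anything you interpolate is parsed by the shell, so a path containing spaces, quotes, or characters such as ; or $ can change what gets executed. Quote such values first (e.g. with shlex.quote).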

  • The robust way: open two processes using Python and pipe them together.

    import subprocess
    # First command
    p1 = subprocess.Popen(["echo", "a", "b"], stdout=subprocess.PIPE)
    # Second command's input linked to the first one's output
    p2 = subprocess.Popen(["rev"], stdin=p1.stdout, stdout=subprocess.PIPE, text=True)
    # Close our copy of the pipe so p1 gets SIGPIPE if p2 exits first
    p1.stdout.close()
    # Read from p2 to get the output
    print(p2.communicate())
    
    # Result: ('b a\n', None)
    

    This is more robust and doesn't spawn an extra shell, but on the other hand it is more typing. If you do this, note that no shell expansion happens: quotes, globs, and ~ are passed through literally. Your example path ~/foo/ relies on tilde expansion, so you'd have to do that in Python (e.g. os.path.expanduser("~/foo/") or os.getenv("HOME")).
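Applied to the command in the question, the two approaches could look roughly like this. It is only a sketch: make_shell_command, make_find_args and pipe are names made up for this example, and it assumes Python 3.3+ for shlex.quote.

```python
import os
import shlex
import subprocess


def make_shell_command(my_path):
    """Build the pipeline as one string, for use with shell=True."""
    # Expand ~ in Python first, because quoting the path stops the shell
    # from expanding it, then quote so spaces or ; in the path stay literal.
    safe_path = shlex.quote(os.path.expanduser(my_path))
    return 'find {} -type f -iname "*.txt" -print0 | parallel -0 cat'.format(safe_path)


def make_find_args(my_path):
    """Build the find argument list, for use without a shell."""
    # No shell parses this list, so *.txt needs no quotes around it.
    return ["find", os.path.expanduser(my_path), "-type", "f", "-iname", "*.txt", "-print0"]


def pipe(first_args, second_args):
    """Run first_args | second_args as two processes; return the second's stdout as bytes."""
    first = subprocess.Popen(first_args, stdout=subprocess.PIPE)
    second = subprocess.Popen(second_args, stdin=first.stdout, stdout=subprocess.PIPE)
    first.stdout.close()  # so the first process gets SIGPIPE if the second exits early
    output = second.communicate()[0]
    first.wait()
    return output
```

With these, the shell variant would be subprocess.check_output(make_shell_command("~/foo/"), shell=True), and the two-process variant would be pipe(make_find_args("~/foo/"), ["parallel", "-0", "cat"]). Both return the concatenated file contents as bytes, assuming find and GNU Parallel are installed.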
