[Tutor] os.popen - using commands and input %

Danny Yoo dyoo at hashcollision.org
Tue Nov 17 21:06:45 EST 2015


On Mon, Nov 16, 2015 at 4:17 AM, Vusa Moyo <soweto at gmail.com> wrote:
> Hi Guys,
>
> My code is as follows
>
> # this list contains system process ID's
> pidst=[1232, 4543, 12009]
>
> pmap_str=[]
> command="pmap -d %s | grep private |awk '{print $1}' | awk -FK '{print $1}'"
>
> for i in range(len(pids)):
>     pmap_str.append(os.popen("(command) % pidlist_int[i])"))


Hi Vusa,

If you can, please try to avoid using os.popen() or subprocess.call()
with formatted strings: that's a common security risk that's known as
a "code injection" vulnerability.

    https://en.wikipedia.org/wiki/Code_injection


More generally, your program is over-using strings to do things that
they aren't meant to do.  One of the habits of the beginner is to try
to do everything in terms of strings, since strings are flexible.  But
this approach often ignores features of the language that might be
more appropriate.

I brought up that constructing the command pipeline string is one
thing to try to avoid with string formatting.  But it tries to use
strings in another way that is somewhat unnatural: it is trying to
encode the idea of re-using the command pipeline so you can feed it
different pids

    "(command) % pidlist_int[i])

This idea of reuse is admirable, but the execution with string
formatting can be improved.  An alternative approach can use the
language's natural construct for reusability: the function.  We can
write a function to construct a command that is parameterized on the
given pid.


The rest of this message will sketch out what this looks like.

---


What you're doing here:

    pmap -d <PID>| grep private |awk '{print $1}' | awk -FK '{print $1}'

can be translated to a sequence of operations that does not use any
string formatting to construct command lines.  See:
https://docs.python.org/2/library/subprocess.html#replacing-shell-pipeline.
If we apply this to your command, we should expect to see something
like this (untested)

#################################
def get_pmap(pid):
    p1 = Popen(['pmap', '-d', str(pid)], stdout=PIPE)
    p2 = Popen(['grep', 'private'], stdin=p1.stdout, stdout=PIPE)
    p3 = Popen(['awk', '{print $1}'], stdin=p2.stdout, stdout=PIPE)
    p4 = Popen(['awk', 'FK', '{print $1}'], stdin=p3.stdout, stdout=PIPE)
    p1.stdout.close()
    return p4.communicate()[0]
#################################

Here, we package up this collection of statements into a function that
takes in a pid and returns the result.  Note that we don't have to do
any string concatenation or string formatting.


After which, we can apply this function across your list directly.

    pmap_str = [get_pmap(pid) for pid in pidst]


More information about the Tutor mailing list