[Tutor] subprocess adds %0A to end of string

Martin Walsh mwalsh at mwalsh.org
Mon Dec 22 02:25:52 CET 2008


Hi David,

David wrote:
> Hi everyone.
> Just learning :) I have one program to parse a podcast feed and put it
> into a file.

Welcome!

<snip>
> 
> def getFeed():
>     url = raw_input("Please enter the feed: ")
>     data = feedparser.parse(url)
>     for entry in data.entries:
>         sys.stdout = open("podcast.txt", "a")

You should probably try to avoid reassigning sys.stdout. This is usually
a bad idea, and can cause odd behavior that is difficult to
troubleshoot, especially for a beginner. A reasonable approach is to
assign the open file object to a name of your own choosing...

.>>> podcast_file = open('podcast.txt', 'a')

... and then use the write method of the file object. Unlike print,
write does not add a newline for you, so include one in the string ...

.>>> podcast_file.write('%s: %s\n' % (entry.updated, entry.link))

More info here:
http://www.python.org/doc/2.5.3/tut/node9.html#SECTION009200000000000000000
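
Putting those two pieces together, the writing step might look roughly
like the sketch below. It is only an illustration: 'Entry' and
'write_entries' are hypothetical names, the entries are made-up stand-ins
for what feedparser would return (only the 'updated' and 'link'
attributes from your code are modelled), and the 'with' statement and
namedtuple assume Python 2.6 or newer.

```python
from collections import namedtuple

# Hypothetical stand-in for feedparser entries; only the 'updated' and
# 'link' attributes used in the original code are modelled here.
Entry = namedtuple('Entry', ['updated', 'link'])


def write_entries(entries, path):
    """Append one 'updated: link' line per entry to the file at path."""
    # Open the file once, outside the loop, instead of once per entry.
    # The 'with' block closes the file even if an error occurs.
    with open(path, 'a') as podcast_file:
        for entry in entries:
            # write() does not add a newline the way print does.
            podcast_file.write('%s: %s\n' % (entry.updated, entry.link))


# Made-up data; real entries would come from feedparser.parse(url).entries
entries = [
    Entry('Mon, 22 Dec 2008', 'http://example.com/ep2.mp3'),
    Entry('Mon, 15 Dec 2008', 'http://example.com/ep1.mp3'),
]

# Example call (left commented out so nothing is written by accident):
# write_entries(entries, 'podcast.txt')
```

Because sys.stdout is never touched, ordinary print statements elsewhere
in the program keep going to the terminal.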


>         print '%s: %s' % (entry.updated, entry.link)
>         sys.stdout.close()
>     for entry in data.entries:
>         sys.stdout = open("podcast_links.txt", "a")
>         print '%s' % (entry.link)
>         sys.stdout.close()
> getFeed()
> 
> next to get the latest item;
> 
<snip>
> lname = "podcast_links.txt"
> L = open(lname, 'r')
> print "The Latest Link\n"
> download = L.readline()

The readline method returns a line from the file *including* the
trailing newline character ('\n').
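
A small sketch of that behavior, using an in-memory file so nothing on
disk is needed. The io.StringIO object and the example.com URLs are
stand-ins I made up for podcast_links.txt, not part of your program.

```python
import io

# In-memory stand-in for podcast_links.txt (the URLs are made up).
links = io.StringIO(u'http://example.com/ep2.mp3\n'
                    u'http://example.com/ep1.mp3\n')

download = links.readline()

# repr() makes the otherwise invisible newline show up as \n.
print(repr(download))

# The newline really is part of the string that readline returned.
print(download.endswith('\n'))
```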

> print download
> 
> answer = raw_input("Do you want to download the podcast? ")
> if answer == "y":
>     wget = "wget"
>     subprocess.call([wget, download])
> else:
>     print "oops"

OK. There's the problem. Let's assume that after 'download =
L.readline()' the name download refers to this string (you can confirm
by adding a 'print repr(download)'):

'http://linuxcrazy.com/podcasts/LC-44-arne.mp3\n'

... then the call becomes (behind the scenes)

subprocess.call(['wget',
        'http://linuxcrazy.com/podcasts/LC-44-arne.mp3\n'])

... so the newline is passed as part of the first argument to the wget
command.

Not so coincidentally, the '%0A' represents a newline ('\n') in a
properly quoted/escaped URL.

.>>> import urllib
.>>> urllib.unquote('%0A')
'\n'

I suspect it is the wget command that is quoting the newline, not the
subprocess call, since subprocess doesn't know anything about valid
characters for URLs.
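
To see the same escaping from the Python side, here is a rough sketch
using the standard library's quote function. This is not wget's own
code, only an illustration that percent-encoding a newline yields
'%0A'. The try/except import is there so the snippet runs on both
Python 2 and Python 3.

```python
try:
    from urllib.parse import quote, unquote   # Python 3
except ImportError:
    from urllib import quote, unquote         # Python 2

url = 'http://linuxcrazy.com/podcasts/LC-44-arne.mp3\n'

# Percent-encode the string, leaving ':' and '/' alone so the URL
# structure stays readable. The newline is not a safe character.
escaped = quote(url, safe=':/')
print(escaped)

# Unquoting reverses the escaping and gives back the original string.
print(unquote(escaped) == url)
```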

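You can work around this problem, as you already have, by dropping the
last character with 'download[:-1]', or by using the strip (or rstrip)
str method. The method is the safer choice: it leaves the string alone
when there is no trailing whitespace, whereas the slice would cut off a
real character if the last line of the file had no newline.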

.>>> download.rstrip()
'http://linuxcrazy.com/podcasts/LC-44-arne.mp3'

More info here: http://www.python.org/doc/2.5.3/lib/string-methods.html
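
Applied to your download step, it might look something like this
sketch. 'build_wget_command' is a hypothetical helper name of my own,
and the actual subprocess call is left commented out because it assumes
wget is installed and would start a real download.

```python
import subprocess


def build_wget_command(line):
    """Return the wget argument list for one raw line read from the file."""
    # strip() removes the trailing newline and any stray whitespace.
    url = line.strip()
    if not url:
        raise ValueError('no URL found on that line')
    return ['wget', url]


command = build_wget_command(
    'http://linuxcrazy.com/podcasts/LC-44-arne.mp3\n')
print(command)

# Uncomment to run the download (assumes wget is on your PATH):
# subprocess.call(command)
```

Building the list separately from the call makes it easy to print and
inspect exactly what wget will receive.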

HTH,
Marty

