codecs / subprocess interaction: utf help requested

John Machin sjmachin at lexicon.net
Sun Jun 10 18:10:40 EDT 2007


On Jun 11, 7:17 am, smitty1e <smitt... at gmail.com> wrote:
> The first print statement does what you'd expect.
> The second print statement has rather a lot of rat in it.
> The goal here is to write a function that will return the man page for
> some command (mktemp used as a short example here) as text to client
> code, where the groff markup will be chopped to extract all of the
> command options.  Those options will eventually be used within an
> emacs mode, all things going swimmingly.
> I don't know what's going on with the piping in the second version.
> It looks like the output of p0 gets converted to unicode at some
> point,

Whatever gave you that idea?

> but I might be misunderstanding what's going on.  The 4.8
> codecs module documentation doesn't really offer much enlightenment,
> nor google.  About the only other place I can think to look would be
> the unit test cases shipped with python.

Get your head out of the red herring factory; unicode, "utf" (which
one?) and codecs have nothing to do with your problem. Think about
looking at your own code and at the bzip2 documentation.
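One way to check that for yourself is to look at the first few bytes
of the output. This is only a sketch: it uses the standard bz2 module
instead of your subprocess calls, and the helper name and the sample
page text are made up for illustration. Data that starts with "BZh" is
a bzip2 stream, not text that has been through any codec.

```python
import bz2

BZIP2_MAGIC = b"BZh"  # every bzip2 stream starts with these three bytes


def looks_like_bzip2(data):
    """Return True if data starts with the bzip2 stream header."""
    return data[:3] == BZIP2_MAGIC


# Hypothetical stand-in for the groff source of a man page.
page = b".TH MKTEMP 1\n.SH NAME\nmktemp \\- make temporary filename\n"

once = bz2.compress(page)    # what a .1.bz2 file on disk contains
twice = bz2.compress(once)   # what compressing that file again produces

print(looks_like_bzip2(page))    # plain groff text
print(looks_like_bzip2(once))    # compressed once
print(looks_like_bzip2(twice))   # compressed twice, still a bzip2 stream
print(bz2.decompress(bz2.decompress(twice)) == page)
```

If your mystery output passes looks_like_bzip2(), the problem is in
the compression step, not in any encoding.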

> Sort of hoping one of the guru-level pythonistas can point to
> illumination, or write something to help out the next chap.  This
> might be one of those catalytic questions, the answer to which tackles
> five other questions you didn't really know you had.
> Thanks,
> Chris
> ---------------------------
> #!/usr/bin/python
> import subprocess
>
> p = subprocess.Popen(
>     ["bzip2", "-c", "-d", "/usr/share/man/man1/mktemp.1.bz2"],
>     stdout=subprocess.PIPE)
> stdout, stderr = p.communicate()
> print stdout
>
> p0 = subprocess.Popen(["cat", "/usr/share/man/man1/mktemp.1.bz2"],
>                       stdout=subprocess.PIPE)
> p1 = subprocess.Popen(["bzip2"], stdin=p0.stdout,
>                       stdout=subprocess.PIPE)
> stdout, stderr = p1.communicate()
> print stdout
> ---------------------------

In the second version you left out the command-line options for bzip2,
so it compressed its input instead of decompressing it. The "rat" that
you saw was the result of compressing the already-compressed man page.
Read this:
http://www.bzip.org/docs.html
which is a bit obscure. The --help output from my copy of an antique
(2001, v1.02) bzip2 Windows port explains it plainly:
"""
   If invoked as `bzip2', default action is to compress.
              as `bunzip2',  default action is to decompress.
              as `bzcat', default action is to decompress to stdout.

   If no file names are given, bzip2 compresses or decompresses
   from standard input to standard output.
"""
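Going by that help text, the piped version should only need -d on the
bzip2 end. Here is a minimal sketch of how that might look. The
function name is made up for illustration, and it assumes cat and
bzip2 are both on the PATH.

```python
import subprocess


def man_page_source(path):
    """Return the decompressed contents of a .bz2 file via cat | bzip2 -d."""
    p0 = subprocess.Popen(["cat", path], stdout=subprocess.PIPE)
    # -d asks for decompression; with no file names given, bzip2 reads
    # standard input and writes standard output, so -c is not needed.
    p1 = subprocess.Popen(["bzip2", "-d"], stdin=p0.stdout,
                          stdout=subprocess.PIPE)
    # Drop our copy of the pipe so cat can get SIGPIPE if bzip2 exits early.
    p0.stdout.close()
    out, _ = p1.communicate()
    p0.wait()
    return out
```

Calling man_page_source("/usr/share/man/man1/mktemp.1.bz2") should
then give the same groff source as your first version. Where bzcat is
installed, running it on the file directly would do the same job
without the cat process.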

HTH,
John



