Filtering through an external process

Raymond Hettinger vze4rx4y at verizon.net
Sun Nov 30 02:41:44 EST 2003


"Paul Rubin" <http://phr.cx@NOSPAM.invalid> wrote in message
news:7xllpyyl1s.fsf_-_ at ruckus.brouhaha.com...
> Anyone know if there's code around to filter text through an external
> process?  Sort of like the Emacs "filter-region" command.  For
> example, say I have a program that reads input in English and outputs
> it in Pig Latin.  I want my Python script to call the program, pipe
> some input into it and read the output:
>
>      english = "hello world"
>      pig_latin = ext_filter("pig_latin", english)
>
> should set pig_latin to "ellohay orldway".
>
> Note that you can't just call popen2, jam the english through it and
> then read the pig latin, because the subprocess can block if you give
> it too much input before reading the output, and in general there's no
> way to know how much buffering the subprocess is willing to do.  So a
> proper solution has to use asynchronous i/o and keep polling the
> output side, or else separate threads for reading and writing.
>
> This is something that really belongs in the standard library.  I've
> needed it several times and rather than going to the trouble of coding
> and debugging it, I've always ended up using a temp file instead,
> which is a kludge.
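
For the literal external-process case, the blocking problem described above can be sketched with the subprocess module, which joined the standard library after this thread (Python 2.4). This is only a minimal sketch, assuming Python 3.5 or later for subprocess.run(). The name ext_filter comes from the question, but taking the command as an argument list, the encoding parameter, and the UPPER stand-in filter are assumptions made for illustration.

```python
import subprocess
import sys

def ext_filter(command, text, encoding="utf-8"):
    """Pipe `text` through an external command and return what it prints.

    `command` is an argument list, e.g. ["pig_latin"] for a hypothetical
    pig_latin program on the PATH.
    """
    # subprocess.run() hands the input to Popen.communicate(), which feeds
    # stdin and drains stdout concurrently, so neither side stalls on a
    # full pipe buffer no matter how much the child chooses to buffer.
    result = subprocess.run(
        command,
        input=text.encode(encoding),
        stdout=subprocess.PIPE,
        check=True,  # raise CalledProcessError if the filter fails
    )
    return result.stdout.decode(encoding)

# Stand-in filter for demonstration: a Python one-liner that upper-cases
# its standard input.
UPPER = [sys.executable, "-c",
         "import sys; sys.stdout.write(sys.stdin.read().upper())"]

if __name__ == "__main__":
    print(ext_filter(UPPER, "hello world"))
```

With a real pig_latin executable available, the call from the question would become ext_filter(["pig_latin"], english).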

The time machine lives!

=========================
Add this file:  Lib/encodings/pig.py
----------------------------------------
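"Pig Latin Codec -- Lib/encodings/pig.py"

import codecs, re

def encode(input, errors='strict'):
    # Move a leading th/ch/st cluster, or else the first character,
    # to the end of each word and append 'ay'.
    output = re.sub(r'\b(th|ch|st|\w)(\w+)\b', r'\2\1ay', input)
    return (output, len(input))

def decode(input, errors='strict'):
    # Undo encode(): drop the trailing 'ay' and move the cluster back in front.
    output = re.sub(r'(\b\w+?)(th|ch|st|\w)ay\b', r'\2\1', input)
    return (output, len(input))

def getregentry():
    # Looked up by the encodings package when the 'pig' codec is requested.
    return (encode, decode, codecs.StreamReader, codecs.StreamWriter)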
-------------------------------------------


Now, fire up Python:

>>> 'hello world'.encode('pig')
'ellohay orldway'
>>> 'ellohay orldway'.decode('pig')
'hello world'
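
On Python 3 the session above needs adjusting: str.encode() expects a codec that produces bytes, and str has no decode() method. A minimal sketch of the same idea, assuming Python 3, registers the codec through codecs.register() and calls it with codecs.encode() and codecs.decode(), which accept text-to-text codecs. The _search function name is hypothetical; the two regular expressions are the ones from pig.py.

```python
import codecs
import re

def encode(input, errors='strict'):
    output = re.sub(r'\b(th|ch|st|\w)(\w+)\b', r'\2\1ay', input)
    return (output, len(input))

def decode(input, errors='strict'):
    output = re.sub(r'(\b\w+?)(th|ch|st|\w)ay\b', r'\2\1', input)
    return (output, len(input))

def _search(name):
    # Hypothetical search function: answer only for the codec name 'pig'
    # and let every other lookup fall through to the remaining codecs.
    if name == 'pig':
        return codecs.CodecInfo(encode, decode, name='pig')
    return None

codecs.register(_search)

if __name__ == "__main__":
    print(codecs.encode('hello world', 'pig'))      # ellohay orldway
    print(codecs.decode('ellohay orldway', 'pig'))  # hello world
```

Registering a search function avoids dropping a file into Lib/encodings, at the cost of having to import the registering module before the codec is used.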



Raymond Hettinger








