[Tutor] Regular expression to match \f in groff input?

Bill Campbell bill at celestial.net
Thu Aug 21 19:40:28 CEST 2008


I've been beating my head against the wall trying to figure out how
to get regular expressions to match the font change sequences in
*roff input (e.g. \fB for bold, \fP to revert to previous font).
The re library maps r'\f' to the single form-feed character (as
it does other common single-character sequences like r'\n').
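
To make the behaviour concrete, here is a minimal sketch of what I
am seeing (Python 3 syntax; the variable names are only for
illustration). The pattern r'\f' is two characters long, yet re
reads it as an escape for form feed, so it never matches the
backslash-f pair in the roff source:

```python
import re

FORM_FEED = '\x0c'

# re interprets the two-character pattern \f as a form feed.
matches_form_feed = re.fullmatch(r'\f', FORM_FEED) is not None

# The same pattern finds nothing in a literal backslash + 'f' + 'B'.
matches_roff_escape = re.search(r'\f', r'\fB') is not None

print(matches_form_feed, matches_roff_escape)
```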

In perl, I can write something like this:

	s/\f[1NR]/</emphasis>/g;

This does not work in Python:

	s = re.sub(r'\f[1NR]', '</emphasis>', sinput)

The string replace() method will handle the above replacement,
although it requires a separate replace() call for each of the
possible characters in the [1NR] class.
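
A rough sketch of that replace()-only approach, with one call per
font character. The function name close_emphasis is made up for
this example:

```python
def close_emphasis(text):
    # One plain-string replacement per character in the [1NR] class.
    for font in '1NR':
        # '\\f' is a literal backslash followed by 'f'.
        text = text.replace('\\f' + font, '</emphasis>')
    return text

print(close_emphasis(r'\fBbold\fR text'))
```

It works, but the loop grows with every character class I need to
handle, which is why I would rather use a single regular expression.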

I have tried various options such as r'\\x66' and r'\\146', but
none of these work.
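
For comparison, a small sketch of how re reads these escaped forms,
as far as I can tell from the re documentation. A doubled backslash
in a raw pattern stands for one literal backslash, so r'\\x66' asks
for a backslash followed by the three characters "x66", while
r'\\f' asks for a backslash followed by the letter f. The sample
string is invented for the test:

```python
import re

sinput = r'\fBbold\fR text'

# Literal backslash followed by 'x66': not present in the input.
hex_attempt = re.sub(r'\\x66[1NR]', '</emphasis>', sinput)

# Literal backslash followed by 'f', then one of 1, N or R.
escaped = re.sub(r'\\f[1NR]', '</emphasis>', sinput)

print(hex_attempt)
print(escaped)
```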

One work-around, assuming that the text does not contain control
characters, is to replace the \f characters with a control
character before doing the replacements, then replace that
control character with \f if any remain after processing:

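import re, fileinput

for line in fileinput.input():
    line = line.rstrip()
    # Swap the two-character sequence \f for the control character \x01.
    # (A raw r'\001' would be a literal backslash plus '001', not a control character.)
    line = line.replace('\\f', '\x01')
    # do something here to make substitutions
    # Restore any \f sequences that were not substituted.
    line = line.replace('\x01', '\\f')
    print(line)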
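
Filled in for the [1NR] case, the substitution step might look like
the sketch below. The names PLACEHOLDER and convert_line are
hypothetical, and it still assumes the input never contains the
\x01 control character:

```python
import re

PLACEHOLDER = '\x01'  # assumed never to occur in the roff source

def convert_line(line):
    # Hide every backslash-f pair behind the placeholder.
    line = line.replace('\\f', PLACEHOLDER)
    # The placeholder is an ordinary character to re, so a class can follow it.
    line = re.sub(PLACEHOLDER + '[1NR]', '</emphasis>', line)
    # Put back any font changes that were not substituted.
    return line.replace(PLACEHOLDER, '\\f')

print(convert_line(r'\fBbold\fR text'))
```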

Bill
-- 
INTERNET:   bill at celestial.com  Bill Campbell; Celestial Software LLC
URL: http://www.celestial.com/  PO Box 820; 6641 E. Mercer Way
Voice:          (206) 236-1676  Mercer Island, WA 98040-0820
Fax:            (206) 232-9186

It is our duty still to endeavor to avoid war; but if it shall actually
take place, no matter by whom brought on, we must defend ourselves. If our
house be on fire, without inquiring whether it was fired from within or
without, we must try to extinguish it.
    -- Thomas Jefferson to James Lewis, Jr., 1798.

