Stripping ASCII codes when parsing

Steve Holden steve at holdenweb.com
Mon Oct 17 05:04:37 EDT 2005


David Pratt wrote:
> I am working with a text format that advises to strip any ascii control 
> characters (0 - 30) as part of parsing data and also the ascii pipe 
> character (124) from the data. I think many of these characters are 
> from a different time. Since I have never seen most of these characters 
> in text I am not sure how these first 30 control characters are all 
> represented (other than say tab (\t), newline(\n), line return(\r) ) so 
> what should I do to remove these characters if they are ever 
> encountered. Many thanks.

You will find the ord() function useful: the ASCII control characters 
you describe all have ord(c) < 32, and the pipe character has 
ord(c) == 124.
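
As a minimal sketch of that idea, assuming you want to drop every character below 32 (which includes tab, newline and carriage return) as well as the pipe, something like the following should work. The name strip_control_chars is only an illustration, not part of any library:

```python
def strip_control_chars(text):
    """Return text without ASCII control characters or the pipe character."""
    # ord(c) < 32 covers the ASCII control codes; ord("|") is 124.
    return "".join(c for c in text if ord(c) >= 32 and ord(c) != 124)


cleaned = strip_control_chars("name\x00|value\t\x1f end\n")
print(repr(cleaned))
```

If your format needs to keep line breaks or tabs, you would have to exempt those characters in the test.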

You can also use the chr() function to return a character whose ord() is 
a specific value, and you can use hex escapes to include arbitrary 
control characters in string literals:

   myString = "\x00\x01\x02"
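
To illustrate how chr() and hex escapes relate, here is a small sketch written for Python 3. It builds the set of unwanted characters with chr() and deletes them with str.translate(); the names unwanted, delete_table and remove_unwanted are assumptions made for the example:

```python
# A hex escape and chr() produce the same characters.
my_string = "\x00\x01\x02"
same_string = chr(0) + chr(1) + chr(2)
assert my_string == same_string

# All characters with ord() below 32, plus the pipe (ord 124).
unwanted = "".join(chr(i) for i in range(32)) + chr(124)

# The third argument of str.maketrans lists characters to delete.
delete_table = str.maketrans("", "", unwanted)


def remove_unwanted(text):
    """Delete every character listed in `unwanted` from text."""
    return text.translate(delete_table)
```

Calling remove_unwanted("a\x07b|c\r\n") would then give "abc".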

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC                     www.holdenweb.com
PyCon TX 2006                  www.python.org/pycon/



