Stripping ASCII codes when parsing
David Pratt
fairwinds at eastlink.ca
Mon Oct 17 11:21:14 EDT 2005
Many thanks Steve. This is good information, and I think it should work
fine. I was doing a string.replace in a cleanData() method with the
characters listed below, but I don't know if that would have caught
everything. The list contains all the control characters that I really
know about in normal use. ord(c) < 32 sounds like a much better and more
comprehensive way to go. So I guess instead of string.replace, I should
do a ... for char in ... and evaluate each character, correct? Or is
there a better way of eliminating these other than reading a string
character by character?
'\a','\b','\e','\f','\n','\r','\t','\v','|'
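
A minimal sketch of the ord(c) < 32 approach, assuming Python 3 and
ordinary str input. The helper names are hypothetical and are not part of
the cleanData() method mentioned above. It shows two equivalent ways to
drop every control character plus the pipe: a translation table applied in
one call, and an explicit per-character check. Note that this also removes
tab, newline and carriage return, and that '\e' in the list above is not a
Python escape sequence (escape is '\x1b').

```python
# Delete ASCII control characters (ord(c) < 32) and the pipe character (124).
# Helper names are hypothetical, chosen only for this illustration.

# Translation table: each unwanted code point maps to None, meaning "delete".
_UNWANTED = {code: None for code in range(32)}
_UNWANTED[ord("|")] = None


def strip_control_chars(text):
    """Return text with control characters and '|' removed in one call."""
    return text.translate(_UNWANTED)


def strip_control_chars_loop(text):
    """Same result using an explicit check on every character."""
    return "".join(c for c in text if ord(c) >= 32 and c != "|")


sample = "name|va\x00lue\tmore\r\n"
print(repr(strip_control_chars(sample)))       # 'namevaluemore'
print(repr(strip_control_chars_loop(sample)))  # 'namevaluemore'
```

The translate() version avoids a Python-level loop, so it is probably the
better fit for large inputs; the generator version is easier to adapt if
the rule for what counts as unwanted changes.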
Regards,
David
On Monday, October 17, 2005, at 06:04 AM, Steve Holden wrote:
> David Pratt wrote:
>> I am working with a text format that advises stripping any ASCII control
>> characters (0 - 30), as well as the ASCII pipe character (124), from the
>> data as part of parsing. I think many of these characters are from a
>> different time. Since I have never seen most of them in text, I am not
>> sure how these first 30 control characters are all represented (other
>> than, say, tab (\t), newline (\n), carriage return (\r)). What should I
>> do to remove these characters if they are ever encountered? Many thanks.
>
> You will find the ord() function useful: control characters all have
> ord(c) < 32.
>
> You can also use the chr() function to return a character whose ord() is
> a specific value, and you can use hex escapes to include arbitrary
> control characters in string literals:
>
> myString = "\x00\x01\x02"
>
> regards
> Steve
> --
> Steve Holden +44 150 684 7255 +1 800 494 3119
> Holden Web LLC www.holdenweb.com
> PyCon TX 2006 www.python.org/pycon/
>
> --
> http://mail.python.org/mailman/listinfo/python-list
>