regular expression for Unix and DOS line endings wanted

Steven D'Aprano steve at REMOVETHIScyber.com.au
Fri Feb 24 05:25:24 EST 2006


On Thu, 23 Feb 2006 15:13:01 +0100, Franz Steinhaeusler wrote:

>>why not use string methods strip, rstrip and lstrip
>>
> 
> because this removes only the last spaces,
> >>> r
> 'erewr    \r\nafjdskl     '
> >>> r.rstrip()
> 'erewr    \r\nafjdskl'
> 
> I want:
> 'erewr\r\nafjdskl'
> 
> or for unix line endings
> 'erewr\nafjdskl'


# Untested
def whitespace_cleaner(s):
    """Clean whitespace from string s, returning new string.

    Strips all trailing whitespace, including linebreaks, from the end of
    the string. Removes every other whitespace character except linebreaks
    from everywhere in the string, including spaces between words.
    Internal linebreaks are converted to whatever is appropriate for the
    current platform.
    """

    from os import linesep
    from string import whitespace
    s = s.rstrip()
    for c in whitespace:
        if c in '\r\n': 
            continue
        s = s.replace(c, '')
    if linesep == '\n': # Unix, Linux, Mac OS X, etc.
        # the order of the replacements is important
        s = s.replace('\r\n', '\n').replace('\r', '\n')
    elif linesep == '\r':  # classic Macintosh
        s = s.replace('\r\n', '\r').replace('\n', '\r')
    elif linesep == '\r\n':  # Windows
        s = s.replace('\r\n', '\r').replace('\n', '\r')
        s = s.replace('\r', '\r\n')
    else: # weird platforms?
        print("Unknown line separator, skipping.")
    return s
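
Note that the function above also removes spaces inside a line, not just
the ones before a linebreak. If only the trailing spaces and tabs on each
line should go, and the existing line endings should be left alone, a
regular expression is one option. This is a minimal sketch rather than a
definitive answer; the name strip_trailing_whitespace is made up for
illustration, and it assumes "whitespace" means spaces and tabs only.

```python
import re

# One or more spaces/tabs that are immediately followed by a linebreak
# character (\r or \n) or by the very end of the string. The lookahead
# keeps the linebreak itself out of the match, so it is never removed.
_TRAILING = re.compile(r'[ \t]+(?=[\r\n]|\Z)')

def strip_trailing_whitespace(s):
    """Remove spaces and tabs at the end of every line in s.

    Linebreaks (\\n, \\r\\n or a lone \\r) are left exactly as they were,
    and whitespace inside a line is not touched.
    """
    return _TRAILING.sub('', s)

r = 'erewr    \r\nafjdskl     '
print(repr(strip_trailing_whitespace(r)))
```

Because the line endings are never rewritten, the same call should work
for both the DOS and the Unix form of the string.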



-- 
Steven.



