[Tutor] Turning a string into a tuple? (Without eval?)

Blake Winton blakew@sonainnovations.com
Fri, 28 Sep 2001 14:39:12 -0400


> While I admire this effort, it seems to me that we could easily end up with
> one heck of a complex program trying to avoid eval().

And here is my overly complex program...  :)
But first, some descriptions of what it does.

> I suppose you could look for '[' and ']' as the first and last characters
> of the input string as a way to avoid eval()ing something that wouldn't
> result in a list. But I'm not sure that would avoid all the potential
> evils of eval().
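
To make that worry concrete, here is a small sketch of a bracket check
letting something nasty through.  The names looks_like_list, sneaky, and
side_effects are made up for this illustration, and the "nasty" part is just
a harmless append, but it could be any call at all.

```python
def looks_like_list(text):
    # Hypothetical check: only the first and last characters are inspected.
    text = text.strip()
    return text.startswith("[") and text.endswith("]")

side_effects = []
sneaky = "[side_effects.append('ran arbitrary code')]"

value = None
if looks_like_list(sneaky):
    # The string passes the check, yet eval() still runs the call inside it.
    value = eval(sneaky, {"side_effects": side_effects})

print(value)          # [None]
print(side_effects)   # ['ran arbitrary code']
```

The result is a perfectly good list, and the code inside it ran anyway.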

I agree with you, so what I did was to write my own parser for the various
things I wanted to handle: lists, strings, numbers, and None.  Each type is
parsed by its own function, and some of the functions call others.  I've
used the built-in "int", "long", and "float" functions to convert the
numbers, because it makes my life a lot easier.
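
Roughly, every parsing function takes the remaining text and hands back a
pair: the value it found, and whatever text is left over.  A minimal sketch
of that idea for numbers follows.  take_number is a made-up name for this
illustration, not a function from parse.py, and it skips the "long" step
since int() on current Pythons handles big values by itself.

```python
def take_number(text):
    """Split a leading number off text; return (value, remaining_text)."""
    end = 0
    # Collect characters that can belong to a number.
    while end < len(text) and (text[end].isdigit() or text[end] in "+-."):
        end += 1
    digits, rest = text[:end], text[end:]
    try:
        value = int(digits)      # whole numbers first
    except ValueError:
        value = float(digits)    # otherwise a float; ValueError if neither
    return value, rest

print(take_number("42, 7]"))    # (42, ', 7]')
print(take_number("-1.5]"))     # (-1.5, ']')
```

The caller then carries on parsing from the leftover text, which is how the
list parser can walk from one element to the next.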

An example of how to use it:
-------------------------------------------------------------
>>> parse.parse( '123' ) # An integer.
123
>>> parse.parse( '"abc"' ) # A string.
'abc'
>>> parse.parse( 'None' ) # Nothing.
>>> print parse.parse( 'None' ) # Nothing.
None
>>> parse.parse( '123456789101112' ) # A long integer.
123456789101112L
>>> parse.parse( '1.234' ) # A float.
1.234
>>> parse.parse( '-1.234' ) # I handle negative numbers too.
-1.234
>>> parse.parse( '[]' ) # And finally, lists...
[]
>>> parse.parse( '[[None, "abc"],[ -1, "ab\\"d" ], -4972.]' ) # And lists containing lists containing stuff...
[[None, 'abc'], [-1, 'ab"d'], -4972.0]
>>> parse.parse( '[   [     None,  "abc"  ]  ,   "   "]' ) # I ignore spaces sometimes...
[[None, 'abc'], '   ']
-------------------------------------------------------------
So there you go.

Finally, the code in all its ugliness.  I put in some comments,
and the functions are reasonably named, I think.  As long as you
start from parse(), you should be okay.
-------------------------------------------------------------
# parse.py.  Like eval, but safer.

# Utility functions for treating strings like lists.

def peek( input ):
    # Like pull(), report "\0" when there is nothing left to look at.
    if input == "":
        return "\0"
    return input[0]

def pull( input ):
    if input == "":
        return "\0", ""
    return input[0], input[1:]

def push( char, input ):
    if char == "\0":
        return input
    return char + input


# Functions that handle the types I want to.
#   Which is to say, Strings, None, Lists, and Numbers.
#   And, of course, Expressions, which could be any of them.

def parseString( input ):
    if peek( input ) != "\"":
        raise Exception( "Failed!  Next character isn't \"!  Input: '%s'."
                         % (input,) )

    # Remove the " char...
    switchChar, input = pull( input )

    result = ""
    while( 1 ):
        if input == "":
            # Without this check, pull() would hand back "\0" forever.
            raise Exception( "Failed!  String is missing its closing \"!" )
        switchChar, input = pull( input )

        if switchChar == "\"":
            break
        elif switchChar == "\\":
            # A backslash means "take the next character literally".
            addChar, input = pull( input )
            result = result + addChar
        else:
            result = result + switchChar
    return result, input

def parseNone( input ):
    if input[:4] != "None":
        raise Exception( "Failed!  Next four characters aren't 'None'!  "
                         "Input: '%s'." % (input,) )
    return None, input[4:]

def parseList( input ):
    if peek(input) != "[":
        raise Exception( "Failed!  Next character isn't '['!  Input: '%s'."
                         % (input,) )

    switchChar, input = pull(input)
    result = []

    input = input.lstrip()
    if peek(input) == "]":
        # We've got the empty list.
        closeBracket, input = pull(input)
        return result, input

    #otherwise, we've got >=1 expression.
    item, input = parseExpr( input )
    result.append( item )
    input = input.lstrip()

    switchChar, input = pull( input )
    while switchChar == ",":
        # we've got more elements.
        input = input.lstrip()
        item, input = parseExpr( input )
        result.append( item )
        input = input.lstrip()
        switchChar, input = pull( input )

    # Okay, that's all the elements.  Now we'd better have a ].
    if switchChar != "]":
        # Put the character back so the error message shows it.
        input = push( switchChar, input )
        raise Exception( "Failed!  Next character isn't ']'!  Input: '%s'."
                         % (input,) )

    return result, input

def parseNum( input ):
    result = ""

    while( 1 ):
        switchChar, input = pull( input )

        if ((switchChar == "+") or (switchChar == "-") or
                switchChar.isdigit() or (switchChar == ".")):
            result = result + switchChar
        else:
            input = push( switchChar, input )
            break

    # Great, now we've got our number, let's try converting it!
    # First, try an int.  If that fails, try a long.  If that fails, try a
    # float.
    try:
        result = int( result )
    except ValueError:
        try:
            result = long( result )
        except ValueError:
            result = float( result )

    return result, input

def parseExpr( input ):
    # Slice rather than index, so empty input reaches the error below.
    switchChar = input[:1]
    if switchChar == "\"":
        result, input = parseString( input )
    elif switchChar == "N":
        result, input = parseNone( input )
    elif switchChar == "[":
        result, input = parseList( input )
    elif ((switchChar == "+") or (switchChar == "-") or switchChar.isdigit()
            or (switchChar == ".")):
        result, input = parseNum( input )
    else:
        raise Exception( "Failed!  Next character isn't 'N' or \" or '[' or "
                         "[0-9] or '+' or '-' or '.'!  Input: '%s'."
                         % (input,) )

    return result, input

# The main entry point.  And a main, in case someone wants to run us
# directly.

def parse( input ):
    input = input.lstrip()
    result, input = parseExpr( input )
    input = input.lstrip()
    if input != "":
        raise Exception( "Garbage at the end of the input!  Result: %s  "
                         "Input: '%s'." % (result, input) )
    return result

def main( args ):
    for arg in args:
        x = parse( arg )
        print x

if __name__ == "__main__":
    import sys
    # Skip sys.argv[0], which is the script's own name, not input.
    main( sys.argv[1:] )
-------------------------------------------------------------
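
Since the subject line asks about tuples and the parser above hands back
lists, one possible follow-up step is to convert the result afterwards.
This is only a sketch under that assumption; to_tuple is a made-up helper,
not part of parse.py.

```python
def to_tuple(value):
    """Recursively turn lists (such as the parser's output) into tuples."""
    if isinstance(value, list):
        return tuple([to_tuple(item) for item in value])
    # Strings, numbers, and None pass through unchanged.
    return value

print(to_tuple([[None, 'abc'], [-1, 'ab"d'], -4972.0]))
# ((None, 'abc'), (-1, 'ab"d'), -4972.0)
```

Doing the conversion as a separate pass keeps the parser itself unchanged.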

Later,
Blake.