Eval (was Re: Question about using python as a scripting language)

skip at pobox.com
Wed Aug 9 10:56:45 EDT 2006


    Brendon> Turns out that the website in question stores its data in the
    Brendon> format of a Python list
    Brendon> (http://quotes.nasdaq.com/quote.dll?page=nasdaq100, search the
    Brendon> source for "var table_body"). So, the part of my code that
    Brendon> extracts the data looks something like this:

    ...
    Brendon>      return eval(data[pos1+len(START_MARKER):END_MARKER])

    Brendon> My question is: what's the safe way to do this?

At the top level the lines look like a Python list.  On a line-by-line basis
they also have consistent structure.  Read it line-by-line, parse the lines
(using regular expressions or whatever), then append the parsed values to a
list, something like (untested):

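    import re
    import urllib.request

    symbolinfo = []
    # One table row: two quoted strings, five unquoted values, two quoted
    # strings.  Adjacent string literals are concatenated into one pattern.
    sympat = re.compile(
        r'\['
        r'"(?P<sym>[^"]+)",'
        r' *"(?P<name>[^"]+)",'
        r' *(?P<n1>[^,]+),'
        r' *(?P<n2>[^,]+),'
        r' *(?P<n3>[^,]+),'
        r' *(?P<n4>[^,]+),'
        r' *(?P<n5>[^,]+),'
        r' *"(?P<s1>[^"]*)",'
        r' *"(?P<s2>[^"]*)"'
        r'\]')
    for line in urllib.request.urlopen("http://..."):
        # urlopen yields bytes; the page encoding is assumed here.
        mat = sympat.match(line.decode("utf-8", "replace"))
        if mat is not None:
            symbolinfo.append(mat.groupdict())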

The regular expression is fairly fragile, but that's okay.  If their format
changed from a list of ten elements to a list of eight or twelve elements,
you'd probably want to know about that asap.  eval(), by contrast, probably
wouldn't fail unless they completely butchered the table syntax, so a change
like that could slip by unnoticed.
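
To make a format change hard to miss, the loop could raise an error on any
row-like line that doesn't fit, instead of skipping it silently.  A rough
sketch of that idea follows.  The parse_rows helper and the sample rows are
made up for illustration, and treating every line that starts with [" as a
row is an assumption about the page:

```python
import re

# Row pattern with nine fields: two quoted strings, five unquoted values,
# two quoted strings.
sympat = re.compile(
    r'\["(?P<sym>[^"]+)", *"(?P<name>[^"]+)",'
    r' *(?P<n1>[^,]+), *(?P<n2>[^,]+), *(?P<n3>[^,]+),'
    r' *(?P<n4>[^,]+), *(?P<n5>[^,]+),'
    r' *"(?P<s1>[^"]*)", *"(?P<s2>[^"]*)"\]')

def parse_rows(lines):
    """Parse table rows; raise ValueError on a row-like line that doesn't fit."""
    rows = []
    for line in lines:
        line = line.strip().rstrip(",")
        if not line.startswith('["'):
            continue  # not a table row (markup, blank line, ...)
        mat = sympat.match(line)
        if mat is None:
            raise ValueError("unexpected row format: %r" % line)
        rows.append(mat.groupdict())
    return rows

# Invented sample rows, not real quote data.
good = '["ABCD","Example Corp",1.0,2.0,3.0,4.0,5.0,"x","y"],'
short = '["ABCD","Example Corp",1.0,2.0,3.0,"x","y"],'

print(parse_rows([good])[0]["sym"])
try:
    parse_rows([short])
except ValueError as exc:
    print("format changed:", exc)
```

The values stay strings here; converting the numeric fields is a separate
step once the row shape has been checked.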

With a small amount of input massaging, you could do this more cleanly with
the csv module.  That's left as an exercise for the reader.
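
One way that exercise might go, as an untested-against-the-real-page sketch:
strip the enclosing brackets and trailing comma from each row, then hand the
result to csv.reader.  The massage and parse_table names, the nfields
parameter, and the sample row are all hypothetical, and nfields is set to
nine only to match the number of groups in the pattern above:

```python
import csv

def massage(line):
    """Return the row contents without brackets, or None for non-row lines."""
    line = line.strip().rstrip(",")
    if line.startswith("[") and line.endswith("]"):
        return line[1:-1]
    return None

def parse_table(lines, nfields=9):
    """Parse bracketed rows with the csv module, checking the field count."""
    cleaned = [m for m in map(massage, lines) if m is not None]
    rows = []
    # skipinitialspace lets a quoted field follow ", " rather than just ","
    for row in csv.reader(cleaned, skipinitialspace=True):
        if len(row) != nfields:
            raise ValueError("expected %d fields, got %d: %r"
                             % (nfields, len(row), row))
        rows.append(row)
    return rows

# Invented sample input, not real quote data.
sample = [
    'var table_body = [',
    '["ABCD", "Example Corp", 1.0, 2.0, 3.0, 4.0, 5.0, "x", "y"],',
    '];',
]
print(parse_table(sample))
```

The field-count check keeps the "tell me when the format changes" property
of the regular expression, which csv alone wouldn't give you.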

Skip


