Regular expression problem

Martin Bertolino moc.xlat at bxm
Wed May 9 11:43:44 EDT 2001


I'm using Python 2.0 to try to parse out some GUIDs from an XML document
returned from an HTTP server, but I'm having some problems with some regular
expressions that I believe should match what I'm looking for.

Consider the following input to the parse_token function:

<response version="1.0" server="test" code="0000" result_msg="success">
    <signon_user request_id="123"
response_id="E95FA6E0-3944-4945-93CD-4E37EDB807E3"
            code="0000" result_msg="success">
                <token>4CA064D0-5653-45E4-9581-9EE9A9B12A49</token>
                <pin_attributes pin_set_needed="N" grace_logins_left="3"/>
                <pin_control pin_set_enabled="N" pin_reset_enabled="N"/>
                <last_signon date_time="2001-05-09T15:30:06Z"/>
                <identification ssn="222222222" employee_id="222222222"/>
                <name title="sir" first="jokey" middle="dorcs" last="smokey"
suffix="sr."/>
                <division_id>central</division_id>
    </signon_user>
</response>

and the parse_token function, which attempts to get the value of the <token>
node, implemented as follows:

import re

def parse_token(output_xml):
    # v0: token_re = re.search(r"<token>([0-9A-Ea-e]+\-[0-9A-Ea-e]+\-[0-9A-Ea-e]+\-[0-9A-Ea-e]+\-[0-9A-Ea-e]+)</token>", output_xml)
    # v1: token_re = re.search(r"<token>([0-9A-Ea-e\-]+)</token>", output_xml)
    token_re = re.search(r"<token>(.+)</token>", output_xml)
    if token_re:
        return token_re.group(1)
    else:
        raise ValueError("unable to parse token out of output xml")

The re.search call that is not commented out correctly matches the node, so
it returns 4CA064D0-5653-45E4-9581-9EE9A9B12A49. The other two (v0 and v1) do
not. I tried those two REs in a Perl script, and there I do get the expected
output.

Am I missing something? It seems as if the regular expression compiler is
being thrown off by the '-'s in the RE.
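
One thing that may be worth checking, offered as a sketch and not as a
confirmed diagnosis: the character class [0-9A-Ea-e] stops at E, while the hex
digits in a GUID run through F. The sample token above happens to contain no
F, but the response_id in the same document does, so a token that varies per
sign-on could fail to match for that reason alone. The snippet below assumes a
current Python 3 interpreter, not Python 2.0, and the names NARROW, HEX, and
with_f are hypothetical, introduced only for this comparison. The with_f input
reuses the response_id value as a stand-in for a token containing F.

```python
import re

# v1 pattern as posted: the class ends at E, so the hex digit F is excluded.
NARROW = re.compile(r"<token>([0-9A-Ea-e\-]+)</token>")
# Assumed variant: same pattern with the class widened to the full hex range.
HEX = re.compile(r"<token>([0-9A-Fa-f\-]+)</token>")

# The token from the sample document (contains no F).
sample = "<token>4CA064D0-5653-45E4-9581-9EE9A9B12A49</token>"
# Hypothetical token containing F (the response_id value from the sample).
with_f = "<token>E95FA6E0-3944-4945-93CD-4E37EDB807E3</token>"

for name, pattern in (("NARROW", NARROW), ("HEX", HEX)):
    for text in (sample, with_f):
        match = pattern.search(text)
        print(name, match.group(1) if match else None)
```

If the two patterns behave differently on the real server output, the range
and not the '-' would be the likelier cause.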

Thanks for the help.

Martin Bertolino





