[XML-SIG] Handling large amounts of character data

Doyon, Jean-Francois Jean-Francois.Doyon@CCRS.NRCan.gc.ca
Mon, 8 Apr 2002 18:25:25 -0400


Hello,

I've run into a rather weird problem:

I'm using the xml.parsers.expat module to parse some GML, and have some
character data that contains a LOT of data ...

Here is the problem: it seems that the parser inserts some sort of "newline"
when lines are really long, which breaks a whole lot of things.

Sounds simple enough to fix, but I can't figure it out!!!

Here is my character data handler:

        def char_data(data):
                global nextchardata, xynodes
                if ( nextchardata == 'coordinates' ):
                        print data
                        data = strip(data)
                        if ( data != '' ):
                                if ( xynodes == None ):
                                        xynodes = data
                                else:
                                        xynodes = xynodes+' '+data
                                nextchardata = None

If I do a "print data" RIGHT AFTER the "def" line, I see ALL the data dumped
to screen, with "newlines" appearing after each "chunk" of a given size, it
seems.

I've tried to remove those using the "replace" function, but to no avail.
I'm not even sure what they are ...

Here's the kicker:

If I put that "print data" as the first line within the first "if"
statement, boom, my data just got stripped: only the FIRST line of those
"multiline" data sets is now available!!! I didn't even do anything to it!

What's going on? I'm sure I'm not the first to have come across this, but I
can't seem to find any documentation on expat, or its limits, or why this
is happening ....
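
One possible explanation, offered as a sketch rather than a confirmed
diagnosis: expat is allowed to deliver the text of a single element across
several calls to the character data handler, so each "print data" would show
one chunk followed by the newline that print itself adds. The handler above
also resets nextchardata after the first non-empty chunk, which would explain
why only the first piece survives. The example below accumulates every chunk
and joins them when the element ends. It is written for Python 3, and the
function name collect_coordinates and the bare element name "coordinates" are
assumptions made for illustration, not taken from the real GML file.

```python
import xml.parsers.expat


def collect_coordinates(xml_text, tag="coordinates"):
    """Return the stripped text of every <tag> element in xml_text.

    `tag` is a hypothetical element name; without namespace processing,
    expat reports the qualified name exactly as written in the document.
    """
    results = []   # one joined string per completed element
    chunks = None  # list of text pieces while inside <tag>, otherwise None

    def start_element(name, attrs):
        nonlocal chunks
        if name == tag:
            chunks = []  # start a fresh buffer for this element

    def char_data(data):
        # May be called many times for one element; keep every piece.
        if chunks is not None:
            chunks.append(data)

    def end_element(name):
        nonlocal chunks
        if name == tag and chunks is not None:
            # Only now is the element's text complete.
            results.append("".join(chunks).strip())
            chunks = None

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start_element
    parser.EndElementHandler = end_element
    parser.CharacterDataHandler = char_data
    parser.Parse(xml_text, True)  # True marks this as the final chunk of input
    return results
```

Joining in the end-element handler, rather than acting on each call, means
the result does not depend on where the parser happens to split the text.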

Any help would be greatly appreciated!!!

Thanks,

Jean-François Doyon
Internet Service Development and Systems Support
GeoAccess Division
Canadian Center for Remote Sensing
Natural Resources Canada
http://atlas.gc.ca
Phone: (613) 992-4902
Fax: (613) 947-2410