How to parse multi-part content

Michael Foord fuzzyman at gmail.com
Mon Sep 27 05:40:22 EDT 2004


Dave Kuhlman <dkuhlman at rexx.com> wrote in message news:<2rp5dmF1cgkbkU1 at uni-berlin.de>...
> Tim Roberts wrote:
> 
> > Dave Kuhlman <dkuhlman at rexx.com> wrote:
> >>
> >>Suppose that I have content that looks like what I've included at
> >>the end of this message.  Is there something in the standard
> >>Python library that will help me parse it, break into the parts
> >>separated by the boundary strings, extract headers from each
> >>sub-part, etc?
> >>...
> >>In case you are curious, this is content posted to my Zope server
> >>when I include an element '<input type="file" .../>' in my form.
> > 
> > Actually, you get this because your <form> header has
> > enctype="multipart/form-data".  It happens that file upload only works
> > with that enctype, but you can use it without a file upload.
> > 
> > That's why cgi.py knows how to parse this.  Look at cgi.parse_multipart.
> 
> Ah. A clue.  I think you're telling me that it's the CGI
> specification that I need to be reading, right?  I'll read some of
> that.
> 
> Per your suggestion, I tried cgi.parse_multipart() and also
> class cgi.FieldStorage.  They don't work.  Or more correctly, I
> don't know how to use them.
> 
> I guess I'll have to concede defeat, which in Python-speak means:
> "It was easier to write it myself."
> 
> Basically, I wrote a little parser class ContentParser which
> exposes a method get_content_by_name.  This method returns the
> body (what follows two carriage returns, up to the next
> boundary line) for a given name, where name is the value of the
> "name" field in the line:
> 
>     Content-Disposition: form-data; name="xschemaFile"
> 
> I was in a bit of a hurry, so my solution (class ContentParser) is
> not very elegant.  But if anyone needs it, let me know.
> 
> And, thanks for the suggestions.
> 
> Dave

If you are receiving this data in a Python script on a server from an
HTML form (i.e. a CGI script), then it's straightforward to do.

import cgi
theform = cgi.FieldStorage()

This parses the contents of the form into a dictionary-like object.
The HTML form that posted the information will assign each file (or
element of the form) a name.
You can access the saved data using:

thedata = theform['name'].value

Look in the cgi documentation for the other attributes that uploaded
files will have. (There is a potential pitfall with 'list values' as
well: when several values share the same name, theform['name'] is a
list rather than a single field. The getfirst() and getlist() methods
are the documented ways round this.)
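
If you need to parse the raw body yourself (outside a CGI environment,
or on Python 3.13 and later, where the cgi module is no longer in the
standard library), here is a rough sketch that uses the email package
instead. This is not what cgi.FieldStorage does internally, just another
standard-library route. The function name parse_form_data and the
boundary "XYZ" are made up for illustration, and the sketch assumes you
already have the request's Content-Type header value and the body as
bytes. Every name maps to a list, which sidesteps the 'list values'
pitfall.

```python
from email import message_from_bytes


def parse_form_data(content_type, body):
    """Return {field name: [payload bytes, ...]} for a multipart/form-data body.

    Hypothetical helper: content_type is the request's Content-Type header
    value (including the boundary), body is the raw request body as bytes.
    """
    # The email parser needs the Content-Type header to find the boundary,
    # so put it in front of the body as a one-header message.
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    message = message_from_bytes(raw)
    if not message.is_multipart():
        raise ValueError("body is not multipart content")

    fields = {}
    for part in message.get_payload():
        # The field name lives in: Content-Disposition: form-data; name="..."
        name = part.get_param("name", header="content-disposition")
        # Always collect into a list, so repeated names are handled uniformly.
        fields.setdefault(name, []).append(part.get_payload(decode=True))
    return fields
```

For example, parse_form_data(content_type, body)["xschemaFile"][0] would
give the bytes of the uploaded file for a form like Dave's.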

Regards,

Fuzzyman
http://www.voidspace.org.uk/atlantibots/pythonutils.html


