[Tutor] Feedback on Script for Pandas DataFrame Written into XML

Saran Ahluwalia ahlusar.ahluwalia at gmail.com
Mon Mar 30 11:32:37 CEST 2015


Good Morning Martin:

Thank you for your feedback.

I have attached a .html file (I would recommend downloading this first and
then opening the file), and my .py script. Here is the data
<https://github.com/ahlusar1989/IntroToPython/blob/master/cables_tiny.csv>.


My function (included in the prior message) and my schema are based on my
interpretation that each row of the DataFrame is a message and each column
is a field within that message.

Each column's label and its corresponding value are nested within each
message (for example "date"). I chose this hypothetical example because one
of the columns holds the date/timestamp of a correspondence.  My question
is as follows:

1. Is this the correct translation/interpretation of the data set? Or am I
overthinking the schema and interpretation of the DataFrame?
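
To make the question concrete, here is a minimal sketch of the mapping I
have in mind. The names frame_to_xml, root_tag and row_tag are placeholders
of mine, the sample rows are made up rather than taken from the cables data,
and the sketch assumes every column name is already a valid XML tag name.

```python
import xml.etree.ElementTree as ET


def frame_to_xml(rows, root_tag="list", row_tag="message"):
    """Build <list><message><index/><field/>...</message>...</list>.

    `rows` is any iterable of (index, mapping) pairs whose mapping has
    .items(); the pairs yielded by a DataFrame's iterrows() fit this shape.
    """
    root = ET.Element(root_tag)
    for idx, row in rows:
        msg = ET.SubElement(root, row_tag)
        ET.SubElement(msg, "index").text = str(idx)
        for field, value in row.items():
            # Assumption: `field` is usable as an XML tag name as-is.
            ET.SubElement(msg, str(field)).text = str(value)
    return root


# Hypothetical stand-in rows, not the real columns of cables_tiny.csv.
sample_rows = [
    (0, {"date": "2015-03-29", "subject": "first"}),
    (1, {"date": "2015-03-30", "subject": "second"}),
]
root = frame_to_xml(sample_rows)

# One root, one <message> per row, and (index + columns) fields per message.
element_count = len(list(root.iter()))
xml_text = ET.tostring(root, encoding="unicode")
```

With a real DataFrame stored in "df", the call would be
frame_to_xml(df.iterrows()). For 11 rows and 8 columns that layout would
give 1 + 11 + 11 * 9 elements in total, if I have understood my own schema
correctly.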

I welcome your thoughts and feedback.

Sincerely,

Saran

On Sun, Mar 29, 2015 at 10:32 PM, Martin A. Brown <martin at linux-ip.net>
wrote:

>
> Good evening again,
>
> I'm replying to your second post, because I replied to the first. This may
> be a more specific request than is typically handled on Python tutor.  This
> involves specific knowledge of the xml.etree.ElementTree and
> pandas.DataFrame objects.
>
>  I would appreciate your feedback on whether I correctly wrote my XML. I
>> am exporting a DataFrame and writing into a XML file. I used the
>> ElementTree library. The DataFrame has 11 rows and 8 columns (excluding the
>> index column).
>>
>
> Side note:  Hard to know or give any advice without considerably more
> detail on the data involved.  But....
>
>  #My schema assumption:
>> #<list>
>> #[<message>
>> #<index>Some number row</index>
>> #<date>Sample text </date>
>> #</message>]
>> #</list>
>>
>
> That shows 6 (XML) elements.  This is neither 8 nor 11.
>
>>
>> import xml.etree.ElementTree as ET
>>
>> document = ET.Element("list")
>>
>> def make_message(document, row):
>>    # One <message> element per DataFrame row
>>    msg = ET.SubElement(document, "message")
>>    for field in row.index:
>>        field_element = ET.SubElement(msg, field)
>>        # Element text must be a string, so convert each value
>>        field_element.text = str(row[field])
>>    return msg
>>
>> def add_to_document(row):
>>    return make_message(document, row)
>>
>> #df.apply(add_to_document, axis=1) ---> if I were to import a DataFrame
>> #stored in the variable "df", I would simply APPLY the add_to_document
>> #function to each row and COMBINE the results into a document
>>
>> ET.dump(document)
>>
>> Thank you, in advance for your help.
>>
>
> This is a more general inquiry and is probably better suited for the lxml
> (ElementTree) mailing list ...
>
>   https://mailman-mail5.webfaction.com/listinfo/lxml
>
> ... or maybe the Pandas mailing list:
>
>   https://groups.google.com/forum/#!forum/pydata
>
> Best of luck,
>
> -Martin
>
> --
> Martin A. Brown
> http://linux-ip.net/
>

