XML parsing per record
Willem Ligtenberg
WLigtenberg at gmail.com
Fri Apr 22 07:48:15 EDT 2005
This is all the info I need from the XML file:
ID --> <Gene-track_geneid>320632</Gene-track_geneid>
Name --> <Gene-ref>
    <Gene-ref_locus>Pzp</Gene-ref_locus>
Startbase --> <Gene-commentary_seqs>
    <Seq-loc>
      <Seq-loc_int>
        <Seq-interval>
          <Seq-interval_from>126957426</Seq-interval_from>
          <Seq-interval_to>126989473</Seq-interval_to>
          <Seq-interval_strand>
            <Na-strand value="plus"/>
          </Seq-interval_strand>
          <Seq-interval_id>
            <Seq-id>
              <Seq-id_gi>51860766</Seq-id_gi>
            </Seq-id>
          </Seq-interval_id>
        </Seq-interval>
      </Seq-loc_int>
    </Seq-loc>
  </Gene-commentary_seqs>
Endbase
Function --> <Prot-ref_name>
    <Prot-ref_name_E>U5 snRNP-specific protein, 200 kDa</Prot-ref_name_E>
    <Prot-ref_name_E>U5 snRNP-specific protein, 200 kDa (DEXH RNA helicase
    family)</Prot-ref_name_E>
  </Prot-ref_name>
DBLink --> <Gene-ref_locus-tag>MGI:2444401</Gene-ref_locus-tag>
  <Gene-commentary_source>
    <Other-source>
      <Other-source_src>
        <Dbtag>
          <Dbtag_db>GO</Dbtag_db>
          <Dbtag_tag>
            <Object-id>
              <Object-id_id>5524</Object-id_id>
            </Object-id>
          </Dbtag_tag>
        </Dbtag>
      </Other-source_src>
      <Other-source_anchor>ATP binding</Other-source_anchor>
      <Other-source_post-text>evidence: ISS</Other-source_post-text>
    </Other-source>
  </Gene-commentary_source>
Product-type --> <Entrezgene_type value="protein-coding">6</Entrezgene_type>
gene-comment --> <Gene-ref_desc>activating signal cointegrator 1 complex subunit 3-like
1</Gene-ref_desc>
synonym --> <Gene-ref_syn>
    <Gene-ref_syn_E>HELIC2</Gene-ref_syn_E>
    <Gene-ref_syn_E>KIAA0788</Gene-ref_syn_E>
    <Gene-ref_syn_E>U5-200KD</Gene-ref_syn_E>
    <Gene-ref_syn_E>U5-200-KD</Gene-ref_syn_E>
    <Gene-ref_syn_E>A330064G03Rik</Gene-ref_syn_E>
  </Gene-ref_syn>
EC --> <Prot-ref_ec>
    <Prot-ref_ec_E>1.5.1.5</Prot-ref_ec_E>
    <Prot-ref_ec_E>3.5.4.9</Prot-ref_ec_E>
  </Prot-ref_ec>
Chromosome: <SubSource>
    <SubSource_subtype value="chromosome">1</SubSource_subtype>
    <SubSource_name>6</SubSource_name>
  </SubSource>
Some of these elements can occur more than once in a record.
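
For illustration, here is a rough sketch of how the fields listed above could be
pulled out one record at a time with the standard library's
xml.etree.ElementTree.iterparse. It is only a sketch under assumptions: it
assumes each record is wrapped in an <Entrezgene> element (that wrapper is not
shown in this post), and the function name iter_genes, the dict keys, and the
tiny sample document are made up for the example. Fields that can repeat are
collected into lists.

```python
import io
import xml.etree.ElementTree as ET

# Tiny made-up sample; the <Entrezgene> record wrapper is an assumption.
SAMPLE = b"""<Entrezgene-Set>
  <Entrezgene>
    <Gene-track_geneid>320632</Gene-track_geneid>
    <Gene-ref_locus>Pzp</Gene-ref_locus>
    <Gene-ref_syn>
      <Gene-ref_syn_E>HELIC2</Gene-ref_syn_E>
      <Gene-ref_syn_E>KIAA0788</Gene-ref_syn_E>
    </Gene-ref_syn>
    <Seq-interval_from>126957426</Seq-interval_from>
    <Seq-interval_to>126989473</Seq-interval_to>
  </Entrezgene>
</Entrezgene-Set>"""


def iter_genes(source, record_tag="Entrezgene"):
    """Yield one dict per record (hypothetical helper, not an NCBI API)."""
    # "end" events fire once an element and all its children are parsed.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != record_tag:
            continue
        yield {
            # findtext returns the first match in document order, or None.
            "id": elem.findtext(".//Gene-track_geneid"),
            "name": elem.findtext(".//Gene-ref_locus"),
            "start": elem.findtext(".//Seq-interval_from"),
            "end": elem.findtext(".//Seq-interval_to"),
            # Repeatable fields: gather every occurrence in the record.
            "synonyms": [e.text for e in elem.iter("Gene-ref_syn_E")],
            "ec": [e.text for e in elem.iter("Prot-ref_ec_E")],
        }
        # Drop the record's children so memory use stays small on big files.
        elem.clear()


if __name__ == "__main__":
    for gene in iter_genes(io.BytesIO(SAMPLE)):
        print(gene)
```

A real record may contain several Seq-interval blocks, so taking the first
match for start and end is a simplification; a full path from the record root
would be needed to pick the right one.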
On Fri, 22 Apr 2005 02:41:46 -0400, William Park wrote:
> Willem Ligtenberg <WLigtenberg at gmail.com> wrote:
>> On Sun, 17 Apr 2005 02:16:04 +0000, William Park wrote:
>> > Care to post more details?
>>
>> The XML file I need to parse contains information about genes.
>> So the first element is a gene and then there are a lot of sub-elements with
>> sub-elements. I only need some of the information and want to store it in
>> an object called gene. Later on this information will be printed into a
>> file, which in its turn will be fed into some other program.
>
> You have to help us a little more here. Which info do you want to
> extract from below example?
>
>> <Entrezgene-Set>
>> ...
>> </Entrezgene-Set>