[Tutor] Interpret the contents of a line

Prasad, Ramit ramit.prasad at jpmorgan.com
Mon Jul 22 23:41:11 CEST 2013


Makarand Datar wrote:
> Hi,
> 
> I am working on a parser and my input file is a description of a bunch of things written line by
> line. So for instance, consider the following two lines and what the corresponding variables in python
> should look like.
> 
> example Line1: position = 1, 1, rotation = 90, 0, 0, mass = 120; Here I want the variables to be
> position = [1,1,0], rotation = [90,0,0], mass = [120]
> example Line2: position = 1, 1, 2, mass = 120, rotation = 90, 0; Here I want the variables to be
> position = [1,1,2], rotation = [90,0,0], mass = [120]
> example Line3: mass = 120, rotation = 90, 0; Here I want the variables to be position = [0,0,0],
> rotation = [90,0,0], mass = [120]
> 
> I know the maximum number of arguments possible for each variable. For example, in the first line
> above, only two numbers for position are specified; that means that the third entry in position list
> is zero. So I need to handle these cases while reading in the file as well. Additionally, the text
> might not always be in the same order; like shown in the second line above. And finally, sometimes
> numbers for one variable might not exist at all; like shown in line 3. Here, the position is then read
> as position = [0,0,0].
> How do I implement such stuff? Is there some smart way of doing this? All I have in mind is this: read
> a line as a comma- or space-delimited list of words and then somehow use if/else + len(list)
> to figure out what's going on. But that seems way too tedious and I feel that there might be an easier
> way to read such things.
> 
> Any help is highly appreciated.
> Thank you

You can use the regular expression library (re) to split each line. If you can add an explicit
delimiter around the keys ("position") in the input, that would work best, but if you cannot change
the source, that is fine too. Based on your sample lines, you can split on the key names and
convert the result to a dictionary.

>>> import re
>>> line
'position = 1, 1, rotation = 90, 0, 0, mass = 120'
>>> re.split(r'([a-z]+)', line)
['', 'position', ' = 1, 1, ', 'rotation', ' = 90, 0, 0, ', 'mass', ' = 120']
>>> l = _
>>> dict(zip(l[1::2], l[2::2]) ) # skip first element
{'position': ' = 1, 1, ', 'rotation': ' = 90, 0, 0, ', 'mass': ' = 120'}

Now your key is the element name and the value is the rest of the string. Next, remove the equals
sign and commas and split on whitespace (or however you want to do it). If the list returned by
split is not big enough, just append zero(es) to the end.
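
Putting those steps together might look roughly like the sketch below. The names parse_line and
MAX_VALUES are made up for illustration, the maximum lengths are assumed from your sample lines,
and the values are converted to float, which is also an assumption. A key that is missing from
the line ends up as all zeros.

```python
import re

# Assumed maximum number of values per key (taken from the sample lines).
MAX_VALUES = {"position": 3, "rotation": 3, "mass": 1}


def parse_line(line, max_values=MAX_VALUES):
    """Parse 'key = v, v, key = v, ...' into a dict of zero-padded lists."""
    # Split on runs of lowercase letters; the capturing group keeps the keys.
    parts = re.split(r"([a-z]+)", line)
    raw = dict(zip(parts[1::2], parts[2::2]))  # skip the leading ''

    result = {}
    for key, size in max_values.items():
        # Drop '=' and ',' and split on whitespace; a missing key gives [].
        text = raw.get(key, "").replace("=", " ").replace(",", " ")
        values = [float(v) for v in text.split()]
        # Pad with zeros up to the known maximum length for this key.
        result[key] = values + [0.0] * (size - len(values))
    return result
```

Calling parse_line('mass = 120, rotation = 90, 0') should give a position of three zeros, a
rotation padded to three entries, and a one-element mass list. Note that this simple split treats
any run of lowercase letters as a key, so numbers written like 1e5 would need a stricter pattern.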



Ramit




