[Tutor] python regex help
Arun Tomar
tomar.arun at gmail.com
Sun Sep 28 19:49:39 CEST 2008
On Sun, 2008-09-28 at 17:26 +0100, Alan Gauld wrote:
> "Arun Tomar" <tomar.arun at gmail.com> wrote
>
> > I've been using shell scripting, and with sed & pipes I've solved it,
> > but with Python I need to practice more ;).
>
> Can you show us some output as you'd like it?
> Can you show us the sed script that works?
sample data:
Contact Candidate
Jyoti Soni - 0 Year(s) 0 Month(s)
MCA
Keyskills:
C , C + + , Java , JSP , Oracle , S / W Testing
B.Sc Pt.Ravishanker University,Raipur
MCA Pt.Ravishanker University,Raipur
Currently in: Pune
CTC(p.a): Not Disclosed
Modified: 27 Sep 2007
Tel: 09975610476(M)
Account Information
Account Information
Contact Candidate
Minal - 0 Year(s) 0 Month(s)
MCA
Keyskills:
c , c + + , java , ASP . NET , VB , Oracle , Dimploma in Web Designing
B.Sc Shivaji University , Maharasthra
MCA Shivaji University , Maharashtra
Currently in: Pune
CTC(p.a): INR 0 Lac(s) 5 Thousand
Modified: 27 Jan 2006
Last Active: 06 Sep 2007
Tel: 9890498376(M)
011 02162 250553(R)
Account Information
Account Information
small shell script that works:
#!/bin/bash
echo "$1"
sed -ne '/Contact/,+1p' -e '/Tel/p' "$1" | sed -e '/Contact Candidate/d' |
sed -e 's/\-//' | sed -e '/^$/d' | sed -e 's/ *$//' |
sed -e 's/Tel://g' -e 's/(M)//g' -e 's/0 Year(s) 0 Month(s)//g' \
    -e 's/(R)//g' -e '/> Similar Resumes/d'
sample output
Jyoti Soni
09975610476
Minal
9890498376
>
> Also can you show us the Python code that doesn't work
> and what went wrong? It's easier to fix what's broken than
> to guess at what might do what you want :-)
Python code that works so far; after this point I'm a bit lost ;)
import re

filename = "script.txt"

# regex patterns
p1 = re.compile("Contact Candidate", re.IGNORECASE)
p2 = re.compile("Tel:", re.IGNORECASE)

# open the file and read its contents into a list of lines
fh = open(filename, 'r')
file_array = fh.readlines()
fh.close()

# create empty lists
new_array = []
mod_array = []

for i in range(len(file_array)):
    # the candidate's name is on the line after "Contact Candidate"
    if p1.search(file_array[i]) and i + 1 < len(file_array):
        new_array.append(file_array[i+1])
    # keep the "Tel:" line, plus the next line in case it holds a second number
    if p2.search(file_array[i]):
        new_array.append(file_array[i])
        if i + 1 < len(file_array):
            new_array.append(file_array[i+1])
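A possible next step, offered as a sketch rather than a tested solution: the collected lines still carry the experience suffix, the Tel: label and the (M)/(R) tags that the sed pipeline strips out. The helper names below (clean_name, clean_phone, extract_records) are hypothetical. Treating a line tagged (R) or (M) directly after a Tel: line as a second number is an assumption based on the sample data; the sed version keeps only the Tel: line itself.

```python
import re

# Hypothetical helpers; the patterns mirror the sed substitutions.
EXPERIENCE = re.compile(r"\s*-\s*\d+ Year\(s\) \d+ Month\(s\)\s*$")
PHONE_TAG = re.compile(r"\((?:M|R)\)")


def clean_name(line):
    """Drop the trailing '- N Year(s) N Month(s)' part from a name line."""
    return EXPERIENCE.sub("", line).strip()


def clean_phone(line):
    """Drop the 'Tel:' label and the (M)/(R) tags from a phone line."""
    line = re.sub(r"^\s*Tel:", "", line, flags=re.IGNORECASE)
    return PHONE_TAG.sub("", line).strip()


def extract_records(lines):
    """Return a list of (name, [phones]) tuples from the raw lines."""
    records = []
    i = 0
    while i < len(lines):
        text = lines[i].strip()
        if text.lower() == "contact candidate" and i + 1 < len(lines):
            # The name sits on the line after the marker.
            records.append((clean_name(lines[i + 1]), []))
            i += 2
            continue
        if text.lower().startswith("tel:") and records:
            records[-1][1].append(clean_phone(text))
            # Assumption: a tagged line right after "Tel:" is a second number.
            if i + 1 < len(lines) and PHONE_TAG.search(lines[i + 1]):
                records[-1][1].append(clean_phone(lines[i + 1]))
                i += 1
        i += 1
    return records


sample = [
    "Contact Candidate",
    "Minal - 0 Year(s) 0 Month(s)",
    "MCA",
    "Tel: 9890498376(M)",
    "011 02162 250553(R)",
    "Account Information",
]
records = extract_records(sample)
print(records)
```

Keeping the phone numbers as strings preserves leading zeros such as the one in 09975610476.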
Basically, I'm trying my hand at text manipulation with Python. I'm
well versed in shell scripting, sed & awk.
Once this data is extracted, I would like to convert it to a CSV file
and then insert the data into a database, and so on. I hope
this gives a good idea of what I'm trying to do.
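The CSV and database steps could look roughly like the sketch below, which assumes Python 3 and only the standard csv and sqlite3 modules. The (name, phone) pairs are taken from the sample output; the table name candidates, the column names and the helper names are all hypothetical, and SQLite stands in for whatever database is eventually used.

```python
import csv
import io
import sqlite3

# Assumed shape of the extracted data: (name, phone) pairs from the sample output.
records = [("Jyoti Soni", "09975610476"), ("Minal", "9890498376")]


def write_csv(rows, out):
    """Write a header plus one row per candidate to a file-like object."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["name", "phone"])
    writer.writerows(rows)


def load_into_db(rows, conn):
    """Create a hypothetical 'candidates' table and insert the rows."""
    conn.execute("CREATE TABLE IF NOT EXISTS candidates (name TEXT, phone TEXT)")
    conn.executemany("INSERT INTO candidates (name, phone) VALUES (?, ?)", rows)
    conn.commit()


# StringIO stands in for open("candidates.csv", "w", newline="").
buffer = io.StringIO()
write_csv(records, buffer)

# In-memory database, for illustration only.
conn = sqlite3.connect(":memory:")
load_into_db(records, conn)
count = conn.execute("SELECT COUNT(*) FROM candidates").fetchone()[0]
```

The phone column is declared as TEXT so that leading zeros survive, and the ? placeholders let sqlite3 handle quoting of the values.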
Regards,
arun.