Look for a string on a file and get its line number

Horacius ReX horacius.rex at gmail.com
Tue Jan 8 09:29:58 EST 2008


Hi, thanks for the help. I then got the following code running:

#!/usr/bin/env python

import os, sys, re, string, array, linecache, math

nlach = 12532

lach_list = sys.argv[1]
lach_list_file = open(lach_list,"r")
lach_mol2 = sys.argv[2] # name of the lachand mol2 file
lach_mol2_file = open(lach_mol2,"r")
n_lach_read=int(sys.argv[3])

# Do the following for the total number of lachands

# 1. read the list with the ranked lachands
for i in range(1,n_lach_read+1):
    line = lach_list_file.readline()
    ll = string.split(line)
    #print i, ll[0]
    lach = int(ll[0])
    # 2. for each lachand, print mol2 file
    # 2a. find lachand header in lachand mol2 file (example: kanaka)
    #     and return line number
    line_nr = 0
    for line in lach_mol2_file:
        line_nr += 1
        has_match = line.find('kanaka')
        if has_match >= 0:
            print 'Found in line %d' % (line_nr)
            # 2b. print on screen all the info for this lachand
            #   (but first need to read natoms and nbonds info)
            #    go to line line_nr + 1
            ltr = linecache.getline(lach_mol2, line_nr + 1)
            ll = ltr.split()
            #print ll[0],ll[1]
            nat = int(ll[0])
            nb = int(ll[1])
            # total lines to print:
            #   header lines, 8
            #   atom lines, nat
            #   bond header, 1
            #   bond lines, nb
            #   last headers, 2
            #   so: nat + nb + 11
            ntotal_lines = nat + nb + 11
            # now we go to the beginning of the lachand
            # and print ntotal_lines
            for j in range(0, ntotal_lines):
                print linecache.getline(lach_mol2, line_nr - 1 + j)


which almost works. In the last "for j" loop, I expected to get
output like:

sdsdsdsdsdsd
sdsdsfdgdgdgdg
hdfgdgdgdg

but instead I get:

sdsdsdsdsdsd

sdsdsfdgdgdgdg

hdfgdgdgdg

The program is also very slow. Do you know how I could solve
this?
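
A likely cause of the blank lines: linecache.getline() returns each line
with its trailing newline, and print then adds a second one. The slowness
probably comes from scanning the mol2 file from the top for every entry
(note also that a file object that has been iterated to the end yields
nothing on later passes). Below is a minimal sketch, written in Python 3
syntax rather than the Python 2 of the script above, and assuming the file
is small enough to read into a list once. The function name find_block and
the sample data are hypothetical, for illustration only.

```python
def find_block(lines, needle, n_lines):
    """Return (line_number, block) for the first line containing needle.

    line_number is 1-based. block holds up to n_lines lines starting at
    the matching line, with trailing newlines stripped. Returns None if
    needle does not occur in any line.
    """
    for idx, line in enumerate(lines):
        if needle in line:
            block = [s.rstrip("\n") for s in lines[idx:idx + n_lines]]
            return idx + 1, block
    return None


# Hypothetical sample standing in for the mol2 file contents; in a real
# script this would be something like open(path).readlines(), done once
# outside any loop.
sample = ["alpha\n", "kanaka\n", "3 2\n", "omega\n"]

line_nr, block = find_block(sample, "kanaka", 2)
print("Found in line %d" % line_nr)
for text in block:
    print(text)  # newline already stripped, so no blank lines appear
```

In the Python 2 script itself, a smaller change with the same effect on the
output is to end the print statement with a comma (print line,) or to use
sys.stdout.write(line), since both avoid appending a second newline.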

Thanks

Tim Chase wrote:
> >> I have to search for a string on a big file. Once this string
> >> is found, I would need to get the number of the line in which
> >> the string is located on the file. Do you know how if this is
> >> possible to do in python ?
> >
> > This should be reasonable:
> >
> > >>> for num, line in enumerate(open("/python25/readme.txt")):
> > 	if "Guido" in line:
> > 		print "Found Guido on line", num
> > 		break
> >
> >
> > Found Guido on line 1296
>
> Just a small caveat here:  enumerate() is zero-based, so you may
> actually want to add one to the resulting number:
>
>   s = "Guido"
>   for num, line in enumerate(open("file.txt")):
>     if s in line:
>       print "Found %s on line %i" % (s, num + 1)
>       break # optionally stop looking
>
> Or one could use a tool made for the job:
>
>   grep -n Guido file.txt
>
> or if you only want the first match:
>
>   sed -n '/Guido/{=;p;q}' file.txt
>
> -tkc
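
As a variation on the zero-based caveat above: enumerate() also accepts a
start value as its second argument (added in Python 2.6, so not available
in Python 2.5), which makes the "+ 1" unnecessary. A minimal sketch in
Python 3 syntax; the helper name first_match_line is hypothetical, and
io.StringIO stands in for a real file so the example is self-contained.

```python
import io


def first_match_line(handle, needle):
    """Return the 1-based number of the first line containing needle,
    or None if no line contains it."""
    # Passing 1 as the start value makes enumerate count from 1.
    for num, line in enumerate(handle, 1):
        if needle in line:
            return num
    return None


# In-memory stand-in for open("file.txt")
text = io.StringIO("first line\nGuido was here\nlast line\n")
print(first_match_line(text, "Guido"))
```

Because the function returns at the first match, the rest of the file is
never read, which matters when the file is big.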


