[Tutor] deleting one line in multiple files

wormwood_3 wormwood_3 at yahoo.com
Fri Sep 14 07:14:28 CEST 2007


Thought I would do some more testing and get you a more finalized form this time.

So I took the mygrep.py script, and put it in a folder with 3 test files with content like this:
I am some
lines of text
yep I love text
435345
345345345
<script type="text/javascript" />

Then I ran:

sam at B74kb0x:~/test$ python mygrep.py "<script type" `ls`
found '<script type' 1 times in test1.txt on line 6.
found '<script type' 1 times in test2.txt on line 6.
found '<script type' 1 times in test3.txt on line 6.

I think this will work quite well in your case. Now for the actual delete: I could not find a succinct way to delete a single line from the files in Python, but I am almost there. Sorry, it was late :-)

import fileinput, sys, string
# take the first argument out of sys.argv and assign it to searchterm
searchterm, sys.argv[1:] = sys.argv[1], sys.argv[2:]
for line in fileinput.input():
   num_matches = string.count(line, searchterm)
   if num_matches:                     # a nonzero count means there was a match
       print "found '%s' %d times in %s on line %d." % (searchterm, num_matches,
           fileinput.filename(), fileinput.filelineno())
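       # re-read the whole file and drop the matching line from the in-memory list
       thisfile = open(fileinput.filename(), "r")
       linelist = thisfile.readlines()
       thisfile.close()
       del linelist[fileinput.filelineno() - 1]
       print linelist
       # nothing is written back to disk yet; only the list has changed
       print "Removed line %d from the in-memory copy of %s" % (fileinput.filelineno(), fileinput.filename())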

So this will search the files you specify at runtime for the pattern you specify, and print out a list of each file's lines with the matching line removed. Now I need to write these lines back to the original file. Don't have that part yet... :-(
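
One possible way to finish the write-back step, as a rough sketch only: read all the lines, keep the ones that do not contain the pattern, and write the survivors over the original file. This is written in Python 3 syntax rather than the Python 2 used in the script above, and the name delete_matching_lines is made up for illustration; it is not part of mygrep.py. It overwrites the file without making a backup, so it is worth trying on copies first.

```python
def delete_matching_lines(filename, searchterm):
    """Rewrite filename without the lines that contain searchterm.

    Returns the number of lines removed. Hypothetical helper, shown only
    to illustrate the missing write-back step.
    """
    with open(filename, "r") as infile:
        lines = infile.readlines()

    # keep every line that does not contain the pattern
    kept = [line for line in lines if searchterm not in line]
    removed = len(lines) - len(kept)

    if removed:
        # only rewrite the file when something actually changes
        with open(filename, "w") as outfile:
            outfile.writelines(kept)
    return removed
```

Calling delete_matching_lines("test1.txt", "<script type") on one of the test files described above should return 1 and leave the other five lines untouched. Unlike deleting by line number, this removes every matching line in a single pass.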

-Sam
___________________________________________
----- Original Message ----
From: wormwood_3 <wormwood_3 at yahoo.com>
To: Python Tutorlist <tutor at python.org>
Sent: Thursday, September 13, 2007 11:33:48 PM
Subject: [Tutor]  deleting one line in multiple files

I think the problem is that the original script you borrowed looks at the file passed to input, and iterates over the lines in that file, removing them if they match your pattern. What you actually want to be doing is iterating over the lines of your list file, and for each line (which represents a file), you want to open *that* file, do the check for your pattern, and delete appropriately.

Hope I am not completely off:-)

If I am right so far, you want to do something like:

import fileinput

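# read the filenames, one per line, from the list file
filenames = [name.strip() for name in open("file.list") if name.strip()]

# inplace=1 sends whatever is printed back into the file being read
for line in fileinput.input(filenames, inplace=1):
    if '<script type' not in line:
        print line,    # trailing comma: the line already ends with a newline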

BUT, fileinput was made (if I understand the documentation) to avoid having to do this. This is where the sys.argv[1:] values come in. The example on this page (look under "Processing Each Line of One or More Files: The fileinput Module") helped clarify it for me: http://www.oreilly.com/catalog/lpython/chapter/ch09.html. If you do:

% python myscript.py "<script type" `ls`
This should pass in all the items in the folder you run this in (be sure it only contains the files you want to edit!), looking for "<script type". Continuing with the O'Reilly example:

import fileinput, sys, string
# take the first argument out of sys.argv and assign it to searchterm
searchterm, sys.argv[1:] = sys.argv[1], sys.argv[2:]
for line in fileinput.input():
   num_matches = string.count(line, searchterm)
   if num_matches:                     # a nonzero count means there was a match
       print "found '%s' %d times in %s on line %d." % (searchterm, num_matches, 
           fileinput.filename(), fileinput.filelineno())

To test this, I put the above code block in "mygrep.py", then made a file "test.txt" in the same folder, with some trash lines, and 1 line with the string you said you want to match on. Then I did:

sam at B74kb0x:~$ python mygrep.py "<script type" test.txt 
found '<script type' 1 times in test.txt on line 3.

So you could use the above block, and edit the print line to also edit the file as you want, maybe leaving the print to confirm it did what you expect.
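
As a rough illustration of that idea, here is a sketch in Python 3 syntax (not the Python 2 used elsewhere in this thread) that combines the search with fileinput's in-place mode, so the lines that do not match are written straight back to each file. The names read_file_list and strip_lines_inplace are made up for this example, and it assumes the list file holds one filename per line, as file.list does.

```python
import fileinput


def read_file_list(listname):
    """Return the non-empty filenames stored one per line in listname."""
    with open(listname) as listfile:
        return [name.strip() for name in listfile if name.strip()]


def strip_lines_inplace(filenames, searchterm):
    """Remove every line containing searchterm from each file, in place."""
    if not filenames:
        # an empty list would make fileinput fall back to reading stdin
        return
    # with inplace=True, fileinput redirects print() into the file being read
    with fileinput.input(files=filenames, inplace=True) as stream:
        for line in stream:
            if searchterm not in line:
                # line keeps its own newline, so suppress print's extra one
                print(line, end="")
```

With these two helpers, strip_lines_inplace(read_file_list("file.list"), "<script type") would process every file named in the list. Passing sys.argv[2:] as the filenames instead would match the command-line style used with mygrep.py.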

Hope this helps!
-Sam

_____________________________________
I have a directory of files, and I've created a file list
of the files I want to work on:

$ ls > file.list

Each file in file.list needs to have a line removed,
leaving the rest of the file intact.

I found this snippet on the Net, and it works fine for one file:

# the lines with '<script type' are deleted.
import fileinput

for line in fileinput.input("file0001.html", inplace=1):
    line = line.strip()
    if not '<script type' in line:
        print line

The docs say:
This iterates over the lines of all files listed in sys.argv[1:]...
I'm not sure how to implement the argv stuff.

However, the documentation also states:
To specify an alternative list of filenames,
pass it as the first argument to input().
A single file name is also allowed.

So, when I replace file0001.html with file.list (the alternative list
of filenames), nothing happens.

# the lines with '<script type' are deleted.
import fileinput

for line in fileinput.input("file.list", inplace=1):
    line = line.strip()
    if not '<script type' in line:
        print line

file.list has one filename on each line, ending with a newline.
file0001.html
file0002.html
:::
:::
file0175.html

Have I interpreted the documentation wrong?
The goal is to delete the line that has '<script type' in it.
I can supply more information if needed.
TIA.
-- 
bhaaluu at gmail dot com
_______________________________________________
Tutor maillist  -  Tutor at python.org
http://mail.python.org/mailman/listinfo/tutor





