Python script that does batch find and replace in txt files
Syed Khalid
khalidness at gmail.com
Sun Nov 9 16:20:09 EST 2014
Code after adding the path of the .txt files:
import glob, codecs, re, os
regex = re.compile(r"Age: |Sex: |House No: ") # etc etc
for txt in glob.glob("D:/Python/source/*.txt"):
    with codecs.open(txt, encoding="utf-8") as f:
        oldlines = f.readlines()
    for i, line in enumerate(oldlines):
        if "Elector's Name:" in line:
            break
    newlines = [regex.sub("", line).strip().replace("-", "_") for line in
    oldlines[i:]
    with codecs.open(txt + "_out.txt", "wb", encoding="utf-8") as w:
        w.write(os.linesep.join(newlines))
I executed the code in EditRocket.
Error message:
File "EamClean.log", line 12
with codecs.open(txt + "_out.txt", "wb", encoding="utf-8") as w:
^
SyntaxError: invalid syntax
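
A likely cause, for what it is worth: the list comprehension that builds newlines opens a "[" that is never closed. Python only notices when it reaches the next statement, which is why the error is reported on the following "with" line. Below is a minimal sketch of the same script with the bracket closed. The helper name clean_lines, its marker parameter, the fallback of keeping the whole file when the marker is missing, and the switch from codecs.open to io.open are my assumptions, not part of the posted code.

```python
import glob
import io
import re

regex = re.compile(r"Age: |Sex: |House No: ")  # etc etc, as in the posted script


def clean_lines(oldlines, marker="Elector's Name:"):
    """Drop the lines before the first marker line, then apply the replacements."""
    start = 0  # assumption: keep the whole file if the marker never appears
    for i, line in enumerate(oldlines):
        if marker in line:
            start = i
            break
    # The closing "]" below is what the failing version lacked.
    return [regex.sub("", line).strip().replace("-", "_")
            for line in oldlines[start:]]


if __name__ == "__main__":
    for txt in glob.glob("D:/Python/source/*.txt"):
        with io.open(txt, encoding="utf-8") as f:
            oldlines = f.readlines()
        # Text mode translates "\n" to the platform line ending on write.
        with io.open(txt + "_out.txt", "w", encoding="utf-8") as w:
            w.write(u"\n".join(clean_lines(oldlines)))
```

One thing to watch: a name such as foo.txt_out.txt still matches *.txt, so a second run over the same folder would also process the earlier output files. Writing the results to a separate folder would avoid that.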
On Mon, Nov 10, 2014 at 2:22 AM, Syed Khalid <khalidness at gmail.com> wrote:
> Hi Albert,
>
> Thank you for script.
>
> I am getting the below error :
>
> File "EamClean.log", line 12
> with codecs.open(txt + "_out.txt", "wb", encoding="utf-8") as w:
> ^
> SyntaxError: invalid syntax
>
> Kindly do the needful.
>
> On Mon, Nov 10, 2014 at 1:53 AM, Albert-Jan Roskam <fomcl at yahoo.com>
> wrote:
>
>>
>> ----- Original Message -----
>> > From: Syed Khalid <khalidness at gmail.com>
>> > To: python-list at python.org
>> > Cc:
>> > Sent: Sunday, November 9, 2014 8:58 PM
>> > Subject: Python script that does batch find and replace in txt files
>> >
>> > Python script that does batch find and replace in txt files. Need a
>> > python script that opens all .txt files in a folder, finds and
>> > replaces/deletes text, and saves the files.
>> >
>> > I have text files and I need to perform below steps for each file.
>> >
>> > Step 1: Put cursor at start of file and Search for "Contact's
>> > Name:". Delete all the rows before it.
>> > Step 2: Put cursor at end of file, Search for "Contact's Name:"
>> > select option UP.
>> > Step 3: Search for "Photo of the" Replace with blanks
>> > Step 4: Search for "Contact is" Replace with blanks
>> > Step 5: Search for "Contact's Name:" Replace with blanks
>> > Step 6: Search for "Age:" Replace with blanks
>> > Step 7: Search for "Sex:" Replace with blanks
>> > Step 8: Search for "House No:" Replace with blanks
>> > Step 9: Search for "available" Replace with blanks
>> > Step 10: Remove Empty Lines Containing Blank Characters from file
>> > Step 11: Trim Leading Space for each line
>> > Step 12: Trim Trailing Space after each line
>> > Step 13: Search for - (hyphen) Replace with _ (underscore)
>> > Step 14: Save file.
>>
>> something like (untested)
>>
>>
>> import glob, codecs, re, os
>>
>> regex = re.compile(r"Age: |Sex: |House No: ") # etc etc
>>
>> for txt in glob.glob("/some/path/*.txt"):
>>     with codecs.open(txt, encoding="utf-8") as f:
>>         oldlines = f.readlines()
>>     for i, line in enumerate(oldlines):
>>         if "Contact's Name: " in line:
>>             break
>>     newlines = [regex.sub("", line).strip().replace("-", "_") for line in
>>     oldlines[i:]
>>     with codecs.open(txt + "_out.txt", "wb", encoding="utf-8") as w:
>>         w.write(os.linesep.join(newlines))
>>
>> >
>> > Currently I have recorded a macro in Notepad++.
>> > I open each file, run macro and save file.
>> > As there are many files I was looking for a program to automate the
>> > process.
>> >
>> > I posted the same query in Notepad++ forum. I got a reply that it can
>> > be done by using Python script.
>> >
>> > Kindly do the needful.
>> >
>> > Thank you.
>> > khalidness
>> >
>> >
>>
>
>
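
The numbered steps in the original request can be sketched as one function that works on the text of a single file. This is a rough illustration, not a tested solution: the names PHRASES, PHRASE_RE and clean_text are made up for the example, and Step 2 (searching upward from the end of the file) is not covered because the request does not say what should happen once the match is found.

```python
import re

# Phrases from Steps 3-9 of the request; each one is replaced with nothing.
PHRASES = ["Photo of the", "Contact is", "Contact's Name:",
           "Age:", "Sex:", "House No:", "available"]
PHRASE_RE = re.compile("|".join(re.escape(p) for p in PHRASES))


def clean_text(text, marker="Contact's Name:"):
    """Apply Steps 1 and 3-13 to the contents of one file."""
    lines = text.splitlines()
    # Step 1: drop every line before the first one containing the marker.
    for i, line in enumerate(lines):
        if marker in line:
            lines = lines[i:]
            break
    cleaned = []
    for line in lines:
        line = PHRASE_RE.sub("", line)   # Steps 3-9: delete the phrases
        line = line.strip()              # Steps 11-12: trim both ends
        line = line.replace("-", "_")    # Step 13: hyphen to underscore
        if line:                         # Step 10: skip lines left empty
            cleaned.append(line)
    return "\n".join(cleaned)
```

Because the function takes and returns a string, it can be tried on a small sample before it is run over a whole folder. Reading each file, calling clean_text on its contents and writing the result (Step 14) is then a short loop over glob.glob.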