FW: [Tutor] deleting CR within files

David Talaga dtalaga at novodynamics.com
Thu Apr 15 11:23:38 EDT 2004


-----Original Message-----
From: David Talaga [mailto:dtalaga at novodynamics.com]
Sent: Thursday, April 15, 2004 10:49 AM
To: Roger Merchberger
Subject: RE: [Tutor] deleting CR within files


Ok, here is what we have now:

import os
import Tkinter
import dialog
import sys
import re

root=Tkinter.Tk()
f=Tkinter.Button(root, text="Find Files", command=dialog.dialog).pack()
x=Tkinter.Button(root, text="Close", command=sys.exit).pack()

sub = re.sub
fileName= dialog.dialog
in_file = open(fileName,'r')
out_file = open(fileName + '_cleaned', 'w')
for i in in_file.readlines():
    out_file.write(i[:1] + '\n')
    print 'File cleaned'
in_file.close()
out_file.close()

Tkinter.mainloop()

Now when I run it, it says this:
Traceback (most recent call last):
  File "<pyshell#0>", line 1, in ?
    execfile('crgui.py')
  File "crgui.py", line 13, in ?
    in_file = open(fileName,'r')
TypeError: coercing to Unicode: need string or buffer, function found


I did not know that I was using Unicode up there...  I am really at a loss.
The program runs fine as far as the dialog box coming up and the print
statement executing.  I just don't know what the error is saying. Do I need a
string in place of fileName, and if I do, how would I go about putting a
string in there while still calling dialog.dialog?  Any help or
direction would be greatly appreciated, along with the option to take my
firstborn son. (Not really, well, OK, if you really think it's a fair trade...)

David Talaga
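
The "function found" part of the traceback suggests that fileName holds the
dialog.dialog function itself rather than whatever it returns, because the
assignment has no parentheses. Below is a minimal sketch of that difference,
written for Python 3 (where the wording of the TypeError differs from the
one above). pick_file is a hypothetical stand-in for dialog.dialog, and it
is an assumption that dialog.dialog returns the chosen path as a string.

```python
import os
import tempfile


def pick_file():
    # Hypothetical stand-in for dialog.dialog; assumed to return a path string.
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w") as handle:
        handle.write("some text\n")
    return path


# Without parentheses the name is bound to the function object itself.
message = None
file_name = pick_file
try:
    open(file_name, "r")
except TypeError as err:
    message = str(err)  # open() refuses anything that is not a path

# With parentheses the function is called and file_name is a string.
file_name = pick_file()
with open(file_name, "r") as in_file:
    contents = in_file.read()
os.remove(file_name)
```

If dialog.dialog does return a path, then fileName = dialog.dialog() would
be the matching change in the script above.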




-----Original Message-----
From: Roger Merchberger [mailto:zmerch at 30below.com]
Sent: Thursday, April 15, 2004 10:23 AM
To: David Talaga
Subject: Re: [Tutor] deleting CR within files


Rumor has it that David Talaga may have mentioned these words:
>Hi all!
>Here is my next problem.  I am trying to rid some files of pesky <CR> from
>files.  Here is my code.  Any help would be great!

What platform? If it's *nix, you're using the wrong tool for the job.
There are command-line utilities that can do that, usually named 'dos2unix'
and 'unix2dos', to either remove or insert \r characters.

If you're in Windows, there are unix-ish utilities you can download (google
for cygwin) that give you a bash prompt on WinNT4/2K/XP and methinks 98+,
but don't quote me on that last part...

Also, it would help us to know what type of files you're editing - are they
basic text files where the \r is either the last or next-to-last character
in the line? If so, you're working too hard... Try this:

>import os, dialog #dialog is a dialog box function
>
>fileName = dialog
>
>in_file = open(fileName,'r').readlines() #I read the whole file at once
>out_file = open(fileName + '_cleaned', 'w') #I open the file and rename it as filename_cleaned.
>for i in in_file: ## I think you get the rest
>     out_file.write(i[:-2] + '\n')  # Use this line if you're removing \r\n or \n\r
>     out_file.write(i[:-1]) # use this line if you're removing \n\r
>     out_file.write(i[:-1] + '\n') # use this line if you're removing \r only
>out_file.close()
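
The slicing alternatives above each assume one particular line ending. A
minimal sketch that avoids having to pick one is to strip any trailing \r
and \n characters and then write back a single \n. It is written for
Python 3, and clean_lines and the sample data are hypothetical names used
only for illustration.

```python
def clean_lines(lines):
    # Strip any run of trailing CR/LF characters, then add one LF back.
    return [line.rstrip("\r\n") + "\n" for line in lines]


sample = ["first\r\n", "second\n", "third\r"]
cleaned = clean_lines(sample)
print(cleaned)
```

One design caveat: a final line that had no line ending at all also gets a
\n appended. When reading from a real file in Python 3, opening it with
newline='' keeps the \r characters visible instead of having them
translated on the way in.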

If you need to find a \r in the middle of a string, regular expressions
(the re module) are still overkill - try this:

import string
teststr = "here's the string with \r in it"
wherecr = string.index(teststr,'\r')
newstr = teststr[:wherecr] + teststr[wherecr+1:]

=-=-=-=-=-=-=-=-=-=-=

If you have multiple \r's in a line, you'd need to add a flag & a while
loop - I'll leave that as an exercise for the reader... ;-)
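
One possible take on that exercise, as a sketch in Python 3: a while loop
that keeps cutting out the first \r until none are left, checked against
the built-in str.replace, which does the same job in one call. remove_cr
is a hypothetical helper name, and the loop uses str.find (which returns
-1 when nothing is found) in place of a separate flag.

```python
def remove_cr(text):
    # Repeatedly cut out the first carriage return until none remain.
    where = text.find("\r")
    while where != -1:
        text = text[:where] + text[where + 1:]
        where = text.find("\r")
    return text


teststr = "here's\r the string with \r in it\r"
looped = remove_cr(teststr)
replaced = teststr.replace("\r", "")
print(repr(looped))
```

In practice the replace call is the shorter option; the loop is only there
to show the slicing approach extended to several \r characters.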

HTH,
Roger "Merch" Merchberger

--
Roger "Merch" Merchberger   | A new truth in advertising slogan
sysadmin, Iceberg Computers | for MicroSoft: "We're not the oxy...
zmerch at 30below.com          |                         ...in oxymoron!"
