[Tutor] Re: Punctuation

Danny Yoo dyoo@hkn.eecs.berkeley.edu
Mon Dec 2 11:34:07 2002


> On Mon, Dec 02, 2002 at 10:57:51AM -0200, Diego Prestes wrote:
>
> | Im trying to make a program that remove the punctuations of a txt
> | file or string because I want to do a count of words later. But when
> | a make my program it give Out of memory error.  Someone know a
> | better way to do this program?
>
> | from string import *
    ^^^^^^^^^^^^^^^^^^^^

You may want to avoid this shortcut and just use the standard:

    import string

The reason is that 'from string import *' indiscriminately pulls names
out of the string module, and someone unfamiliar with the string module
may not know, reading down in the code, where 'punctuation' or 'find'
came from.

It's also a little dangerous: many of the modules in the Standard Library
are not 'from foo import *'-safe.  For example, the 'os' module has its
own version of open() that clobbers the builtin open() that we know
about.
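
To make the open() clash concrete, here is a small sketch.  It assumes
Python 3 (where the builtins live in a module called 'builtins'), and it
runs the star-import inside a throwaway dictionary called 'scratch', a
name made up for this example, so that the script's own open() is not
disturbed:

```python
import builtins
import os

# Run the star-import in a scratch namespace so that this script's own
# open() is left untouched.
scratch = {}
exec("from os import *", scratch)

# After the star-import, the name 'open' refers to os.open ...
print(scratch["open"] is os.open)
# ... and no longer to the builtin open().
print(scratch["open"] is builtins.open)
```

os.open() takes integer flags and returns a file descriptor, so code that
expects the builtin open() would break in confusing ways after such an
import.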


If we really want to save keystrokes, we can do something like this:

###
import string as s
###

This says that 's' is now an abbreviation for the string module.


Once we do this, the rest of the code would look like:

###
text = raw_input("Text :")
d1 = s.punctuation
d = s.split(s.join(d1, " "))
for x in range(len(d)):
    n = s.find(text,d[x])
    text = text[:n] + text[n+1:]
print text
###

where it becomes easier to see where 'punctuation', 'split', and 'find'
are coming from.
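
One detail of the loop above is worth checking, and I offer it as an
observation rather than a full diagnosis: find() returns -1 when the
character is not in the text, and in that case text[:n] + text[n+1:]
becomes text[:-1] + text[0:], which makes the string longer instead of
shorter.  The sketch below sidesteps find() altogether by keeping only
the characters that are not punctuation.  It assumes Python 3, and
'strip_punctuation' is just a name made up for this example:

```python
import string


def strip_punctuation(text):
    """Return text with every character in string.punctuation removed."""
    kept = []
    for ch in text:
        # Keep the character only if it is not a punctuation mark.
        if ch not in string.punctuation:
            kept.append(ch)
    # Build the result once at the end instead of re-slicing the string
    # on every pass through the loop.
    return "".join(kept)


print(strip_punctuation("Hello, world! It's 10:34."))
```

Collecting the characters in a list and joining them once also avoids
creating a fresh copy of the whole string for each punctuation mark.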



On Mon, 2 Dec 2002, Derrick 'dman' Hudson wrote:

> What's happening is you're creating lots of copies of strings. (namely
> where you add them together)  You end up with almost 4 copies of the
> string in memory at one time before 3 of them are freed.  If the input
> is long, that will use a lot of memory.

I can see that the above code puts some pressure on memory, but the
system would have to be pretty stressed already for Python to raise a
MemoryError.

Diego, what does the rest of your program look like?  Are you reading your
whole text file into memory using a read()?
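
If that is the case, one alternative is to walk through the file a line
at a time, so that only one line needs to be in memory at once.  Here is
a rough sketch, assuming Python 3; 'count_words' is a made-up name, and
io.StringIO stands in for a real file so the example runs on its own.
Note that strip() only trims punctuation from the ends of each word, so
an apostrophe inside a word is left alone:

```python
import io
import string


def count_words(lines):
    """Count words across an iterable of lines, ignoring punctuation
    at the start and end of each word."""
    counts = {}
    for line in lines:
        for word in line.split():
            # Trim leading/trailing punctuation and normalize case.
            word = word.strip(string.punctuation).lower()
            if word:
                counts[word] = counts.get(word, 0) + 1
    return counts


# io.StringIO stands in for an open text file here; with a real file,
# "for line in f" also hands back one line at a time.
sample = io.StringIO("Hello, world!\nHello again, world.\n")
print(count_words(sample))
```

With a real file, the call would look something like
count_words(open("words.txt")), where "words.txt" is only a placeholder
filename.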


Good luck!