[Tutor] Writing to a file

Steven D'Aprano steve at pearwood.info
Sat Jan 20 02:09:46 EST 2018


On Fri, Jan 19, 2018 at 12:58:10PM -0500, Bob Gailer wrote:
> 
> On Jan 18, 2018 5:45 PM, "Devansh Rastogi" <devanshr at gmail.com> wrote:
> >
> > Hello,
> >
> > I'm new to python and programming as
> >
> > from collections import Counter
> > import json
> >
> I don't see any value for having a class. All you need are functions and
> global variables

Except for the simplest scripts, global variables are a good way to 
end up with fragile, buggy, hard-to-debug code.

If there's ever a chance that you will need to read two or more files at 
the same time, a class is a much better solution than trying to juggle 
global variables.

If I never have to do the:

    old_foo = foo
    calculate_foo()
    print(foo)
    foo = old_foo

dance again, it won't be too soon.
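
As a rough illustration of the two-files case, here is a minimal sketch. 
The class name TextFile, its attributes and the helper _write_temp are 
hypothetical, made up for this example rather than taken from the 
original poster's code. Each instance carries its own state, so there is 
nothing to save and restore:

```python
import os
import tempfile


class TextFile:
    """Hypothetical example class: holds the text of a single file."""

    def __init__(self, filename, encoding="utf-8"):
        self.filename = filename
        with open(filename, "r", encoding=encoding) as f:
            self.text = f.read()

    def count_lowercase(self):
        # State lives on the instance, not in a module-level global.
        return sum(1 for c in self.text if c.islower())


def _write_temp(text):
    """Helper for the demo only: write text to a temporary file."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        f.write(text)
    return path


path_a = _write_temp("Hello World")
path_b = _write_temp("ABC def")

# Two files open "at the same time", with no shared globals to juggle.
first = TextFile(path_a)
second = TextFile(path_b)
counts = (first.count_lowercase(), second.count_lowercase())

os.remove(path_a)
os.remove(path_b)
```

With globals, the second file's results would overwrite the first's 
unless each value were copied away by hand first.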


> > class Files:
> >     def __init__(self, filename):
> 
> I don't see any need for a function or "with". Just write file_input_string
> = open(filename, 'r', encoding='utf-16').read().replace('\n', ' ')

Again, that doesn't scale beyond quick and dirty scripts. Best practice 
(even when not strictly needed) is to use

    with open(filename) as f:
        do_something_with(f.read())

in order to guarantee that even if an error occurs while reading, the 
file will be closed. Otherwise, you run the risk of running out of file 
handles in a long-running program.
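
A small sketch of that guarantee, assuming a throwaway temporary file and 
a simulated error (both invented for the demonstration). The file object 
is closed by the time the exception leaves the with block:

```python
import os
import tempfile

# Create a small file to read back (demo setup only).
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w", encoding="utf-8") as out:
    out.write("some text\n")

handle = None
try:
    with open(path, "r", encoding="utf-8") as f:
        handle = f  # keep a reference so we can inspect it afterwards
        f.read()
        raise ValueError("simulated failure while processing")
except ValueError:
    pass

# The with statement closed the file even though an exception was raised.
closed_after_error = handle.closed

os.remove(path)
```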


> >         with open(filename, 'r', encoding='utf-16') as file_input:
> >             self.file_input_string = file_input.read().replace('\n', ' ')
> >
> You are assuming that all words are separated by blanks which is rarely the
> case in natural language.

Surelyyoumeanthatitisusuallythecasethatwordsareseparatedbyblanksinmostnaturallanguages?

I think that Thai is one of the few exceptions to the rule that most 
languages separate words with a blank space.

In English, there are a small number of compound words that contain 
spaces (as opposed to the far more common hyphen), such as "ice cream" 
(neither a form of ice, nor cream) or "attorney general" but most people 
don't bother distinguishing such compound words and just treat them 
as a pair of regular words. But I can't think of any English grammatical 
construct where words are run together while still treating them as 
separate words (apart from simple mistakes, e.g. accidentally writing 
"runtogether" as a typo).


> Your program is creating lists of ones. Rather than counting them, all you
> need to do is take the length of each list, e.g.: lowercase_letters =
> len(1 for c in self.file_input_string if c.islower())

That won't work. You are trying to take the length of a generator 
expression:

py> len(1 for c in "abcDe" if c.islower())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: object of type 'generator' has no len()
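
One way to get the count without the error, sketched here with the same 
sample string, is to sum the ones instead of asking for their length, or 
to build a real list first. The variable names are only illustrative:

```python
text = "abcDe"

# Sum the ones produced by the generator expression.
lowercase_count = sum(1 for c in text if c.islower())

# Or build an actual list, which does support len().
lowercase_count_alt = len([c for c in text if c.islower()])
```

The sum() version avoids building a temporary list, which matters for 
large files.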

 
> However there is a much better way to do the counting: translate the text
> using the string translate method into various characters that identify the
> class of each letter in the file. Then count the occurrences of each of
> those characters. Example: counting Upper, Lower, Number, and punctuation
> (Single, Double stroke):
> 
> txt=  "THIS is 123 ,./ :*(" # input file text
> 
> transtable =
> str.maketrans("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789
> ,./:*(",
>   "L"*26 + "U"*26 + "N"*10 + "S"*4 + "D"*3) # maps input characters to
> corresponding class characters

That lists only 68 out of the many, many thousands of characters 
supported by Python. It doesn't scale very well beyond ASCII.
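
A possible alternative that is not limited to a hand-written table is to 
ask Python about each character. The following is only a sketch: the 
function names classify and count_classes and the one-letter class codes 
are assumptions made for this example, and it groups all punctuation 
together rather than reproducing the single/double stroke split above.

```python
from collections import Counter
import unicodedata


def classify(ch):
    """Return a one-letter class code for a single character."""
    if ch.isupper():
        return "U"  # uppercase letter
    if ch.islower():
        return "L"  # lowercase letter
    if ch.isdigit():
        return "N"  # digit
    if unicodedata.category(ch).startswith("P"):
        return "P"  # any Unicode punctuation category (Po, Ps, Pe, ...)
    return "O"      # everything else: whitespace, symbols, and so on


def count_classes(text):
    """Count how many characters of each class appear in text."""
    return Counter(classify(ch) for ch in text)


counts = count_classes("THIS is 123 ,./ :*(")
# counts["U"] == 4, counts["L"] == 2, counts["N"] == 3, counts["P"] == 6
```

Because str.isupper(), str.islower() and unicodedata.category() consult 
the Unicode database, accented and non-Latin letters are classified too. 
Note that characters such as "+" or "$" fall in the Unicode symbol 
categories, so this sketch counts them under "O" rather than "P".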


-- 
Steve

