[Tutor] Writing to a file

Alan Gauld alan.gauld at yahoo.co.uk
Thu Jan 18 18:11:05 EST 2018


On 18/01/18 20:51, Devansh Rastogi wrote:

> When do you actually use json or pickle, I understand that with data
> written to .json files can be used by programs written in other languages,
> and pickle is for python specific objects.

Yes, that's correct.

>  So are there specific objects
> for which .json is used or pickle is preferred? 

No, it's more about whether you are just looking to persist
an object's state (pickle) or share the data with another
program (json).
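
As a rough illustration of that difference, here is a minimal sketch using a Counter (the same class the posted code imports). The sample string is made up for the example. pickle gives back the Python object with its type, while json only carries the plain data.

```python
import json
import pickle
from collections import Counter

# Made-up sample data, purely for illustration.
counts = Counter("hello world")

# pickle round-trips the Python object itself, type included.
restored = pickle.loads(pickle.dumps(counts))
print(type(restored).__name__)    # Counter

# json only carries plain data, so the Counter comes back as a dict
# that a program in any language could have read.
shared = json.loads(json.dumps(counts))
print(type(shared).__name__)      # dict
```

Note that pickle data should only be loaded from sources you trust, since unpickling can execute arbitrary code.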

> write() then I'm just writing to a file, 

Correct. write() just sends a string to a file.

> Also is it possible to append data to a already existing file? 

Yes, open it with mode 'a' instead of 'w'.
If no file exists it creates a new one; if one exists
it appends to the end.
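
A small sketch of append mode, assuming a throwaway file in a temporary directory (the file name log.txt is invented for the example):

```python
import os
import tempfile

# A throwaway path so the example does not touch any real files.
path = os.path.join(tempfile.mkdtemp(), "log.txt")

with open(path, "a") as f:   # file does not exist yet, so it is created
    f.write("first line\n")

with open(path, "a") as f:   # file exists now, so the text goes on the end
    f.write("second line\n")

with open(path) as f:
    lines = f.read().splitlines()
print(lines)   # ['first line', 'second line']
```

Opening the same path with 'w' the second time would have left only "second line" in the file.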

> far it seems that everytime I'm calling my write function, its re-writing
> the whole file with just the last variable called.

Presumably because you open it with 'w' as the mode?

> Ive added my code below, and am currently using json.dump() as  I would
> like to send the file to a friend who is writing a similar program but with
> a gui, and it would be nice if his program can read the data without
> problems.

If the GUI is in Python he can use pickle but
otherwise you need json (or some other portable
format like csv or xml).


> I realize these are pretty basic questions and am missing some basic
> fundamentals. I'd be grateful if someone could point me in the right
> direction, any tips would be highly appreciated.

You could try reading the Handling Files topic in my tutorial

http://www.alan-g.me.uk/l2p2/tutfiles.htm

> from collections import Counter
> import json
> 
> class Files:
>     def __init__(self, filename):
>         with open(filename, 'r', encoding='utf-16') as file_input:
>             self.file_input_string = file_input.read().replace('\n', ' ')

Any particular reason you use utf-16 instead of the much more common
utf-8? Just curious...

>     def num_of_words(self):
>         """ Return number of words in the file"""
>         return str(len(self.file_input_string.split()))

Actually you return the string representation of the number not the
actual number.
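
To show why that matters, here is a sketch of a variant that returns the number itself. It is written as a standalone function taking the text as an argument, which is an assumption made to keep the example short, not how the posted class is organised.

```python
def num_of_words(text):
    """Return the number of words in text as an int."""
    return len(text.split())


count = num_of_words("guten Morgen liebe Kinder")

print("Words: " + str(count))   # convert to a string only for display
print(count + 1)                # the value is still usable in arithmetic
```

Returning the int leaves the caller free to decide when, and whether, to convert it.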

>     def num_of_keystrokes(self):
>         """ Total number of keystrokes
>         # abcde.. = 1 stroke
>         # ABCDE.. = 2 strokes
>         # '.,-/;[]=\ = 1 stroke
>         # !@#$%^&*()_+|}{":?>< = 2 strokes """
> 
>         lowercase_letters = sum(1 for c in self.file_input_string if c.islower())
>         uppercase_letters = sum(2 for c in self.file_input_string if c.isupper())
>         one_keystroke_punc = ".,-=[]\;'/ "  # space included
>         puncuation_one = sum(1 for c in self.file_input_string if c in one_keystroke_punc)
>         two_keystroke_punc = '!@#$%^&*()_+|}{":?><'
>         puncuation_two = sum(2 for c in self.file_input_string if c in two_keystroke_punc)
> 
>         return str(lowercase_letters + uppercase_letters + puncuation_one + puncuation_two)

Again you are returning the string not the number.

>     def num_of_char(self):
>         """ Return number of characters in the string without spaces"""
>         return str(len(self.file_input_string) - self.file_input_string.count(" "))

And again...

>     def frequency_of_char(self):
>         """ Frequency of characters in the file """
>         count = Counter(self.file_input_string)
>         dict_count = dict(count)
>         print("{:<12} {:<10}".format('Character', 'Frequency'))
>         for k, v in dict_count.items():
>             print("{:<12} {:<10}".format(k, v))

While this prints the values you don't store them,
since dict_count is a local variable that gets thrown
away. It might be better to store it as an attribute of the instance?
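
One way that could look, as a sketch only. The class below is a cut-down stand-in that takes its text directly rather than reading a file, and the attribute name char_frequency is invented for the example.

```python
from collections import Counter


class Files:
    """Cut-down stand-in for the class in the original post."""

    def __init__(self, text):
        # Assumption: the text is passed in instead of being read from a file.
        self.file_input_string = text
        self.char_frequency = {}   # hypothetical attribute name

    def frequency_of_char(self):
        """Count the characters, keep the result on the instance, return it."""
        self.char_frequency = dict(Counter(self.file_input_string))
        return self.char_frequency


x = Files("abca")
x.frequency_of_char()

# The data survives the method call, so it can be printed or saved later.
for k, v in x.char_frequency.items():
    print("{:<12} {:<10}".format(k, v))
```

Separating the counting from the printing also means the same data can be handed to json.dump() without recomputing it.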

>     def frequency_of_words(self):
>         """ Frequency of words in the file"""
>         # word_count = Counter()
>         # for word in self.file_input_string.replace(' ', '\n'): ### gives chars again. should work for line
>         #     word_count.update(word)
>         # print("{:<15} {:15}".format("Word", "Frequency"))
>         # for k, v in word_count.items():
>         #     print("{:<15} {:<15}".format(k, v))
> 
>         word_list = self.file_input_string.split()
>         word_frequecy = [word_list.count(w) for w in word_list]  ## works with string.count!!
>         word_frequecy_dict = dict(zip(word_list, word_frequecy))
>         print("{:<15} {:15}".format("Word", "Frequency"))
>         for k, v in word_frequecy_dict.items():
>             print("{:<15} {:<15}".format(k, v))
> 
>     def average_len_of_words(self):
>         """ calculate the averge length of the words"""
>         word_list = self.file_input_string.split()
>         average = sum(len(word) for word in word_list) / len(word_list)
>         return str(average)

Once again you return the string rather than the value.

>     def write_to_file(self, data):
>         """ collect all data for Morgen_Kinder.txt in a file"""
>         with open('data.json', 'w') as f:
>             json.dump(data, f, sort_keys=True, indent=4)

Here you create the file with 'w' mode so it always
overwrites the previous copy.


> 
> #test
> x = Files('Morgen_Kinder.txt')
> a = Files.num_of_char(x)
> Files.write_to_file(x,a)
> print(a)
> b = Files.num_of_words(x)
> Files.write_to_file(x,b)
> print(b)

Note that here you are writing strings to the files
albeit using JSON. But there really is no need for
JSON here, the format is simply strings.

The point of complex formats like pickle and JSON
is to save complex data structures all at once. You
are getting the worst of both worlds. You are doing
all the work to break down the complexity to single
strings and then writing them out with the complexity
of JSON.
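
A sketch of the alternative: gather all the results into one dictionary, keep the numbers as numbers, and dump the whole structure once. The sample text, the dictionary keys and the temporary path are assumptions made for the example. In the real program the values would come from the methods of the Files class.

```python
import json
import os
import tempfile
from collections import Counter

text = "Morgen Kinder wird es was geben"   # stand-in for the file contents
words = text.split()

# Gather every result into one structure, keeping numbers as numbers.
data = {
    "num_of_words": len(words),
    "num_of_char": len(text) - text.count(" "),
    "average_len_of_words": sum(len(w) for w in words) / len(words),
    "frequency_of_words": dict(Counter(words)),
}

# Throwaway location so the example does not overwrite a real data.json.
path = os.path.join(tempfile.mkdtemp(), "data.json")

with open(path, "w") as f:   # 'w' is fine here: everything is written once
    json.dump(data, f, sort_keys=True, indent=4)

# Reading it back gives the whole structure in a single call.
with open(path) as f:
    loaded = json.load(f)
print(loaded["num_of_words"])   # 6
```

Written this way, the other program needs only one json load to get every value, and the file is valid JSON however many statistics are added.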

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos



