[Tutor] Creating class / OOP structure

Johan Martinez jmartiee at gmail.com
Wed Nov 6 19:27:20 CET 2013


On Tue, Nov 5, 2013 at 6:19 PM, Steven D'Aprano <steve at pearwood.info> wrote:

> On Tue, Nov 05, 2013 at 03:55:05PM -0800, Johan Martinez wrote:
> > I need help in modifying my program. Right now it looks as follows:
> [snip code]
> > Can someone help me in improving my code?
>
> Yes! The first thing to do is get rid of the unnecessary class. This is
> not Java where you have to write classes for everything. From the sample
> code that you show below, using OOP here accomplishes nothing except
> making the code more complicated and less efficient.
>
> Please don't think I am *against* OOP. But it is a tool, nothing more,
> and you should always use the right tool for the job. And in this case,
> I think the right tool is to create two functions:
>
> def parse(filename):
>     with open(filename, 'r') as f:
>         ...
>     return parsed_file
>
> def process(parsed_file):
>     for k in parsed_file:  # Iteration automatically uses keys.
>         ...
>     return processed_file
>
>
> And then simply call them like this:
>
> processed = process(parse('sales.txt'))
>
> Why is this better?
>
> In the code you show, you don't use the object-oriented structure, so
> why waste time building an object-ordiented structure? Similarly, after
> you have read the file once, the class solution holds onto the contents
> forever. But you don't need it forever, once you have parsed the
> contents they are no longer needed. So why hold on to them longer than
> necessary?
>
> Classes should only be used when you need to encapsulate state plus
> behaviour, in other words, when they represent a *thing*. Classes should
> be nouns: BankAccount, User, DecimalNumber all make good classes. Your
> class *needs* only behaviour -- the class doesn't need to store either
> self.fname nor self.parsed_file, since they are both only used once. In
> effect, they are temporary variables that are given names and made
> long-living:
>
> # no need for this
> filename = "sales.txt"
> parsed_file = parse(filename)
> processed = process(parsed_file)
>
> # now go on to use processed but not filename or parsed_file
> ...
>
> Since those two variables are only used once, there is no need to keep
> them as named variables. Get rid of them!
>
> processed = process(parse('sales.txt'))
>
>
> Now, obviously there are parts of the code you haven't shown us. Perhaps
> you do have good reason for storing parsed_file and filename for later
> use. If so, then my objection to using a class will be reduced, or even
> eliminated.
>
> But even then, there is one last problem with your class: it has a
> dangerously misleading name, which hints that it is not a well-designed
> class.
>
> "Filedict" is neither a file nor a dict -- it doesn't inherit from file,
> or behave like a file; it doesn't inherit from dict, or behave like a
> dict. It is, in fact, *not* a file/dict at all. What you really have is
> something with a nasty compound name:
>
> FileParserAndDictProcessor
>
>
> When your classes have nasty compound names like this, it is a very
> strong clue that the class doesn't actually represent a thing and
> probably shouldn't exist. Don't use classes just as a wrapper around a
> few functions and unrelated variables. (What does the file name have to
> do with the process() method? Why are they stored together?)
>
> For a very entertaining -- although quite long -- look at this issue,
> have a read of this:
>
>
> http://steve-yegge.blogspot.com.au/2006/03/execution-in-kingdom-of-nouns.html
>
>
> You don't need to be a Java developer to get value from it.
>
>
> > Should I store parsed_file as an object attribute as follows?
> >
> > class Filedict():
> >     def __init__(self, fname):
> >         self.fname =fname
> >         self.parsed_file = self.parse()
>
> Should you? No. Can you? Yes. But follow the logic -- you could continue
> the process of pushing more code into the __init__ method:
>
> class Filedict():  # Terrible name...
>     def __init__(self, fname):
>         self.fname =fname
>         self.parsed_file = self.parse()
>         self.processed = self.process()
>
> f = Filedict("sales.txt")
> processed = f.processed
>
> But now the instance f has no reason to exist, so dump it:
>
> processed = Filedict("sales.txt").processed
>
> which basically means you are using __init__ just as a wrapper to hide
> away a couple of function calls.
>
>
>
> --
> Steven



Thanks for the detailed reply Steven. I am trying to learn class/OOP
structure and selected file parsing and processing as a use case. I wanted
to implement something as follows:

# Create Filedict object - which will return parsed file in dict format
# Where, dict keys are entries in second column and
# dict values are matching lines and number of matching lines

f = Filedict('sales.txt')

The dict representation contains everything I want to know from that file,
however, it's still in raw format. So I need to process it further.

f.process()

The process method splits gives dictionary object in four distinct dict
objects based on some conditions. These four dicts can then be dumped in
JSON format or we can perform more select/filter actions on them.

I was doing it using simple functions, but I added class/OOP for learning.
I thought parsed file object should be created when Filedict was
initialized and then 'process' or other methods can be called on the object
directly.

I am going through suggested resources and hopefully come up with a better
code structure.

-jM
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/tutor/attachments/20131106/4a46d654/attachment.html>


More information about the Tutor mailing list