data validation when creating an object

Thu Jan 16 11:18:39 EST 2014

On Thursday, January 16, 2014 10:46:10 AM UTC-5, Robert Kern wrote:

> I prefer to keep my __init__() methods as dumb as possible to retain the 
> flexibility to construct my objects in different ways. Sure, it's convenient to, 
> say, pass a filename and have the __init__() open() it for me. But then I'm 
> stuck with only being able to create this object with a true, named file on 
> disk. I can't create it with a StringIO for testing, or by opening a file and 
> seeking to a specific spot where the relevant data starts, etc. I can keep the 
> flexibility and convenience by keeping __init__() dumb and relegating various 
> smarter and more convenient ways to instantiate the object to classmethods.

There are two distinct things being discussed here.

Passing a file-like object instead of a filename certainly gives you flexibility.  But that's orthogonal to how much work should be done in the constructor.  Consider this class:

class DataSlurper:
    def __init__(self):
        self.slurpee = None

    def attach_slurpee(self, slurpee):
        self.slurpee = slurpee

    def slurp(self):
        for line in self.slurpee:
            pass  # whatever

This exhibits the nice behavior you describe; you can pass it any iterable, not just a file, so you have a lot more flexibility.  But it also exhibits what many people call the "two-phase constructor" anti-pattern.  When you construct an instance of this class, it's not usable until you call attach_slurpee() (calling slurp() first raises a TypeError, because None is not iterable), so why not just do that in the constructor?
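
For comparison, here is a minimal sketch of the one-phase version, where the constructor takes the slurpee directly.  The from_filename() classmethod is a hypothetical name I'm using to illustrate the classmethod approach from the quoted message, and the body of slurp() is just a stand-in for "whatever":

```python
import io


class DataSlurper:
    def __init__(self, slurpee):
        # Still a "dumb" constructor: it only stores what it is given,
        # but the instance is usable as soon as it exists.
        self.slurpee = slurpee

    @classmethod
    def from_filename(cls, filename):
        # Hypothetical convenience constructor (not from the original post).
        # The caller is responsible for closing the file afterwards.
        return cls(open(filename))

    def slurp(self):
        # Stand-in for "whatever": collect each line without its newline.
        return [line.rstrip("\n") for line in self.slurpee]


# Any iterable of lines works, e.g. a StringIO for testing.
slurper = DataSlurper(io.StringIO("first\nsecond\n"))
lines = slurper.slurp()
```

The constructor does no more work than attach_slurpee() did, so you keep the flexibility of passing any iterable, and there is no window in which the object exists but can't be used.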