Copy construction of class instance object

Bengt Richter bokr at oz.net
Wed May 28 20:54:22 EDT 2003


On Wed, 28 May 2003 21:25:53 GMT, "Bror Johansson" <bjohan at telia.com> wrote:

>
>"Steven Taschuk" <staschuk at telusplanet.net> wrote in message
>news:mailman.1054138672.2279.python-list at python.org...
>> Quoth Bror Johansson:
>> > Is there a good/recommended way to emulate the copy constructor
>> > classinstance creation (a la C++) in Python?
>>
>> Why do you want to?
>
>I have to parse a hierarchy of files having binary contents. All files have
>a common header structure. The remainder of the file have content that is
>structured differently depending on the header content. Out-of-header data
>may denote other files - having same header structure.
>
>I want a corresponding class hierarchy with a super class that recognizes
>the header and sub classes that recognizes remaining content depending on
>header values.
>
>Thus, I want to be able to say:
>
>fp = file("topfile", 'rb')
>ftop = super(fp) # bind ftop to an instance of the 'correct' subclass
I assume this is for the purpose of accessing content in specialized ways.
BTW, "super" already has built-in meaning, so it's better to use another symbol.
>fp.close()

Why not just say

    ftop = MyHierFile('topfile')

I.e., why not just define a class for the top thing? E.g., (untested! and
very speculative wrt the structure of your files)

class MyHierContentUnrecognizedError(Exception): pass

# classes for specialized sub-content
class MyHierPlainContent(object):
    def __init__(self, parent, offset, fp):
        self.parent = parent            # to get, e.g., self.parent.fname to open later
        self.offset = offset            # to seek to later
        ...
        # (fp is pre-positioned for an attempt to recognize content)
        discriminfo = fp.read(self.DISCRIMLENGTH) # sufficient to figure if this is MyHierPlainContent
        # check discrim info, setting ok accordingly
        if not ok:
            raise MyHierContentUnrecognizedError
        # populate this instance as desired

    def open(self):
        fp = file(self.parent.fname, 'rb')
        fp.seek(self.offset)
        return fp
    ...

class MyHierCherryContent(object):
    def __init__(self,*args): raise NotImplementedError
class MyHierVanillaContent(object):
    def __init__(self,*args): raise NotImplementedError

class MyHierFile(object):
    contentClasses = [MyHierPlainContent, MyHierCherryContent, MyHierVanillaContent]
    def __init__(self, fname):
        self.fname = fname
        self.fileRefs = {}  # init with None values for ref names, later optionally MyHierFile instances
        self.content = None # later a class instance that knows about this file's data content
        self.hdrInfo = []   # or maybe {}
        self.children = []  # corresponding to fileRefs if pursued
        # open the file and parse header and get file references
        fp = file(fname, 'rb')
        ...
        # after determining where content type info is (ctoffset) from the header info,
        # give each subclass a chance to recognize it (might require a tweak or two, depending ;-)
        for contentClass in self.contentClasses:
            try:
                fp.seek(ctoffset)
                self.content = contentClass(self, offset, fp) # try until successful
                break # stop at the first class that recognizes the content
            except MyHierContentUnrecognizedError:
                pass
        if self.content is None:
            # you could also set self.content to a dummy class that would raise an exception only
            # if use was attempted. Leaving it None would also do that, a little less helpfully msg-wise.
            raise MyHierContentUnrecognizedError(
                'Unrecognized content at offset %s in file %r' % (ctoffset, fname))
        # Alternatively, if the header identifies the content type directly, skip the loop
        # above and select the class instance, which will know where to get the data in the
        # containing file and how to decode it:
        #   self.content = self.contentClasses[<id determined from header info>](fname, offset, length)
        ...
        fp.close()
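
For what it's worth, here is a minimal runnable sketch of just the "give each class a chance to recognize the content" step, written for current Python. Everything in it is an assumption for illustration, not part of the real file format: the names UnrecognizedContent, MAGIC and recognize are hypothetical, and the discriminating info is assumed to be a 4-byte magic string sitting at ctoffset.

```python
import io


class UnrecognizedContent(Exception):
    """Raised by a content class that does not recognize the data."""


class PlainContent:
    MAGIC = b"PLN1"  # hypothetical discriminator, not a real format

    def __init__(self, parent, offset, fp):
        # fp is pre-positioned at the discriminating bytes
        if fp.read(len(self.MAGIC)) != self.MAGIC:
            raise UnrecognizedContent
        self.parent = parent  # to reach header info later
        self.offset = offset  # to seek to later


class CherryContent(PlainContent):
    MAGIC = b"CHR1"  # hypothetical discriminator


def recognize(parent, fp, ctoffset, content_classes):
    """Return an instance of the first class that accepts the content."""
    for cls in content_classes:
        fp.seek(ctoffset)  # rewind for every attempt
        try:
            return cls(parent, ctoffset, fp)
        except UnrecognizedContent:
            continue
    raise UnrecognizedContent(
        "Unrecognized content at offset %s" % ctoffset)


# Usage: a fake 4-byte header followed by "cherry" content
fp = io.BytesIO(b"HDR!" + b"CHR1" + b"payload")
content = recognize(None, fp, 4, [PlainContent, CherryContent])
print(type(content).__name__)
```

Returning from inside the loop plays the same role as stopping at the first successful class, so a later class never overwrites an earlier match.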


>for fname in ftop.namesofreferredfiles():
>    fp = file(fname, 'rb')
>    sub = super(fp) # bind sub to an instance of the 'correct' subclass
     # ISTM the above 2 lines could as well be
     sub = MyHierFile(fname)
>    ftop.addsubfile(sub)
>    ...
but you might also want to make this more lazy. I.e., why open the entire tree of files
if you might wind up only wanting to access one leaf?
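
One way the lazy variant might look, as a rough sketch only: referenced files are loaded on first access and cached afterwards. The class name LazyRefs and the loader argument are hypothetical; in the code above the loader would presumably be MyHierFile itself.

```python
class LazyRefs:
    """Map file names to objects that are created only on first access."""

    def __init__(self, names, loader):
        self._loader = loader                # e.g. MyHierFile (assumption)
        self._refs = dict.fromkeys(names)    # every name starts out as None

    def __getitem__(self, fname):
        if self._refs[fname] is None:
            # first access: open and parse the file now, then cache it
            self._refs[fname] = self._loader(fname)
        return self._refs[fname]

    def names(self):
        return list(self._refs)


# Usage with a stand-in loader that only records what was loaded
loaded = []

def fake_loader(fname):
    loaded.append(fname)
    return "parsed:" + fname

refs = LazyRefs(["a.bin", "b.bin"], fake_loader)
leaf = refs["b.bin"]   # only b.bin gets loaded
again = refs["b.bin"]  # served from the cache, no second load
```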

But if you did eagerly want to populate the first level, you could write (untested!)
    ftop = MyHierFile('topfile')
    for fname, value in ftop.fileRefs.items():
        if value: continue # already done 
        ftop.fileRefs[fname] = MyHierFile(fname)

but then what about the children's children? You'd need a recursive tree walker
    def doProgeny(ftop): # (untested!)
        for fname, value in ftop.fileRefs.items():
            if not value:
                value = MyHierFile(fname)
                ftop.fileRefs[fname] = value
            doProgeny(value)

If there's a possible cycle in references, you'd need to detect that.
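
A sketch of one way to detect such cycles, assuming file names identify files uniquely: carry a set of names already visited and skip anything in it. The names load_progeny, load and FakeFile are hypothetical stand-ins; FakeFile only mimics the fname and fileRefs attributes used above.

```python
def load_progeny(node, load, seen=None):
    """Recursively fill node.fileRefs, visiting each file name only once."""
    if seen is None:
        seen = {node.fname}
    for fname, child in list(node.fileRefs.items()):
        if fname in seen:
            continue  # already visited: a cycle or a shared reference
        seen.add(fname)
        if child is None:
            child = load(fname)
            node.fileRefs[fname] = child
        load_progeny(child, load, seen)
    return seen


class FakeFile:
    # hypothetical reference graph with a cycle: a -> b -> a
    REFS = {"a": ["b", "c"], "b": ["a"], "c": []}

    def __init__(self, fname):
        self.fname = fname
        self.fileRefs = dict.fromkeys(self.REFS[fname])


top = FakeFile("a")
visited = load_progeny(top, FakeFile)
print(sorted(visited))
```

With this approach a back-reference such as b -> a is simply left as None instead of being followed again; whether to point it at the existing instance instead is a design choice.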
>
>One of my ideas was to have class super recognize the common header and
>defining classes subx to take an instance of super as argument to __init__
>and 'copy' the header info before reading rest of file.
See above, you could just follow the parent reference to get to the header info there.

>
>After having considered follow up comments to my first posting, and thinking
AFAICS the first posting (unless I missed an earlier one) and comments were
about an implementation idea you had, not about good ways to tackle your real problem
with Python. I.e., how could anyone guess how best to help from

"""
Is there a good/recommended way to emulate the copy constructor
classinstance creation (a la C++) in Python?
"""

Whereas it's always fun and satisfying to think up your own solutions, it is not
necessarily the best approach to keep the problem hidden and just ask about the
obstacles you encounter in your first go.

>a little more, I have given up my original thoughts. Another - and simpler -
>design will be chosen.

If your binary file is one of a number of standard structured files (e.g., .tif, .gif, .png,
.wav, .pdf, .ps, .xml, .exe, .obj, etc. etc.) there is probably already a solution to
at least part of your problem available.

PS. Check for endian-ness issues too.
Just a little curious what the actual data is ;-)
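
To illustrate the endian-ness point with the standard struct module: the same four bytes decode to different integers depending on the byte order given in the format string. The byte values here are made up for the example.

```python
import struct

raw = bytes([0x00, 0x00, 0x01, 0x02])  # made-up 4-byte header field

big = struct.unpack(">I", raw)[0]     # big-endian unsigned 32-bit
little = struct.unpack("<I", raw)[0]  # little-endian unsigned 32-bit

print(big, little)  # 258 33619968
```

Spelling out ">" or "<" in the format string, rather than relying on the native order of the machine doing the parsing, keeps the result the same on every platform.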

The above is just OTTOMH speculation without knowing what the real problem is,
so please ignore except for general ideas that may emerge/remain after others
point out better things. Which they probably will if you let on more about the problem.

Regards,
Bengt Richter



