What about a Python Tree container?

Roc Zhou chowroc.z+l at gmail.com
Tue Nov 6 05:50:44 EST 2007


Hello,

Recently I'm working on a new data structure written in python: Tree,
by operator overloading. I want it can act as a builtin type, such as
list and dict, but do something like a tree, to make those things need
trees can be more convenient.

For example,
1. to create a Tree, there are several ways:
(1) A empty Tree:
>>> root = Tree(None)
>>> print root
  ** Tree **
    () : None,

(2) A tree like this:
    root('zero')
      \--trunk('a')
      \--branch(1)
>>> root = Tree('zero', trunk='a', branch=1)
>>> print root
  ** Tree **
    ('trunk',) : 'a',
    () : 'zero',
    ('branch',) : 1,

(3) With items:
>>> root = Tree('zero', trunk='a', branch=1, _node_items={'try' : 'x'})
>>> print root
  ** Tree **
    ('trunk',) : 'a',
    () : 'zero',
    (('try',),) : 'x',
    ('branch',) : 1,
or like this:
>>> root = Tree('value', {'x' : 'v1', 'y' : 2}, trunk='v3', branch=4)
>>> print root
  ** Tree **
    ('trunk',) : 'v3',
    () : 'value',
    (('y',),) : 2,
    ('branch',) : 4,
    (('x',),) : 'v1',

The items can be considered as attributes of a Tree node, something like
XML attr? '')

(4) A more complicated one:
>>> root = Tree('value', {'x' : 'v1', 'y' : Tree(2, {'z' : 'v3'}, br=4)},
trunk='v5', branch=Tree(6, {'a' : 'v7'}, br=8))
>>> print root
  ** Tree **
    ('branch', ('a',)) : 'v7',
    (('y',),) : 2,
    ('branch',) : 6,
    ('branch', 'br') : 8,
    ('trunk',) : 'v5',
    (('x',),) : 'v1',
    (('y', 'z'),) : 'v3',
    () : 'value',
    (('y',), 'br') : 4,

2. Or create a Tree with setattr syntax :
>>> root = Tree(None)
>>> root = Tree('value')
>>> root['x'] = 'v1'
>>> root['y'] = 2
>>> root.trunk = 'v5'
>>> root.branch = 6
>>> root['x']['z'] = 'v3'
>>> root['x'].br = 4
>>> root.branch['a'] = 'v7'
>>> root.branch.br = 8
>>> print root
  ** Tree **
    ('branch', ('a',)) : 'v7',
    () : 'value',
    (('y',),) : 2,
    ('branch',) : 6,
    ('branch', 'br') : 8,
    ('trunk',) : 'v5',
    (('x', 'z'),) : 'v3',
    (('x',),) : 'v1',
    (('x',), 'br') : 4,

3. It's very simple to retrive the value of any given Tree node:
>>> print root
  ** Tree **
    (('x', 'y'),) : 'xy',
    (('y',),) : 3,
    ('branch',) : 7,
    ('trunk', ('a',)) : 5,
    ('trunk',) : 4,
    (('x',),) : 1,
    ('trunk', ('a', 'b'), 'data') : ['trunk', 'ab', 6],
    () : 0,
    (('x',), 'data') : ['x', 2],
    ('trunk', ('a', 'b')) : 'trunk ab',
    ('trunk', ('a',), 'data') : 'a data',
    ('branch', 'data') : 8,
>>> root()
0
>>> root.trunk()
4
>>> root.branch()
7
>>> root['x']
<tree.Tree instance at 0xb7eab16c>
>>> root['x']()
1
>>> root['x'].data()
['x', 2]

4. As you have seen, the "print" operation of course have traversed the
Tree. Traverse return a path sequences to nodes map(dict) --
Tree<http://crablfs.sourceforge.net/tree.html#Tree>
 dict, every
item represents a single Tree<http://crablfs.sourceforge.net/tree.html#Tree>
 node. It't better to be called by
__call__ <http://crablfs.sourceforge.net/tree.html#Tree-__call__>
('traverse'), for example:
root('traverse') will return:
treedict = {
{
        () : Tree <http://crablfs.sourceforge.net/tree.html#Tree>(0),
        (('x',),) : Tree <http://crablfs.sourceforge.net/tree.html#Tree>(1),
        (('x',), 'data') : Tree<http://crablfs.sourceforge.net/tree.html#Tree>
(['x', 2]),
        (('x', 'y'),) : Tree <http://crablfs.sourceforge.net/tree.html#Tree>
('xy'),
        (('y',),) : Tree <http://crablfs.sourceforge.net/tree.html#Tree>(3),
        ('trunk',) : Tree <http://crablfs.sourceforge.net/tree.html#Tree>
(4),
        ('trunk', ('a',)) : Tree<http://crablfs.sourceforge.net/tree.html#Tree>
(5),
        ('trunk', ('a',), 'data') :
Tree<http://crablfs.sourceforge.net/tree.html#Tree>
("a data"),
        ('trunk', ('a', 'b')) :
Tree<http://crablfs.sourceforge.net/tree.html#Tree>
("trunk ab"),
        ('trunk', ('a', 'b'), 'data') :
Tree<http://crablfs.sourceforge.net/tree.html#Tree>
(['trunk', 'ab', 6]),
        ('branch',) : Tree <http://crablfs.sourceforge.net/tree.html#Tree>
(7),
        ('branch', 'data') : Tree<http://crablfs.sourceforge.net/tree.html#Tree>
(8)
}

5. Update a Tree with another one:
>>> def build_tree1():
        root['x'] = '1x'
...     root = Tree(0)
        root.trunk['a'] = 5
...     root['x'] = '1x'
        root.branch.data.extra = ['extra', 'info']
...     root['x'].data = 1
...     root['x']['y'] = '1xy'
...     root['x']['y'].data = 2
...     root['y'] = 3
...     root.trunk = 4
...     root.trunk['a'] = 5
...     root.trunk['a'].data = '5'
...     root.trunk['x'] = '2x'
...     root.trunk ['x'].data = 6
...     root.trunk['x']['y'] = '2xy'
...     root.trunk['x']['y'].data = 7
...     root.branch = 8
...     root.branch.data = 9
...     root.branch.data.extra = ['extra', 'info']
...     return root
...
>>> def build_tree2():
        root['x'] = '1.x'
...     root = Tree('0')
        root.trunk['x'] = ' two.x'
...     root['x'] = '1.x'
        root.branch.new = 'attr.new'
        return root...     root['x'].data = '1'
...     root['new'] = '1.new'
...     root.trunk = '4'
...     root.trunk['a'] = "a test"
...     root.trunk['x'] = None
...     root.trunk['b'] = 'b'
...     root.trunk['x'] = ' two.x'
...     root.trunk['x']['y'] = '2.xy'
...     root.trunk['x']['y'].data = '7'
...     root.trunk['new'] = '2.new'
...     root.trunk['new'].data = ' 2.new.data'
...     root.branch = '8'
...     root.branch.new = 'attr.new'
...     return root
...
>>> tree1 = build_tree1()
>>> tree2 = build_tree2()

Then you can update tree1 like these to build a new Tree:
>>> tree1('update', tree2)
>>> print tree1
  ** Tree **
    ('branch', 'data', 'extra') : ['extra', 'info'],
    ('branch', 'new') : 'attr.new',
    ('trunk', ('x', 'y')) : '2.xy',
    ('trunk', ('x',), 'data') : 6,
    ('branch',) : '8',
    ('trunk', ('x', 'y'), 'data') : '7',
    (('x',), 'data') : '1',
    ('trunk',) : '4',
    ('trunk', ('new',), 'data') : ' 2.new.data',
    ('trunk', ('new',)) : '2.new',
    ('trunk', ('x',)) : 'two.x',
    (('x',),) : '1.x',
    ('trunk', ('a',)) : 'a test',
    () : '0',
    (('y',),) : 3,
    (('x', 'y'),) : '1xy',
    ('trunk', ('a',), 'data') : '5',
    ('trunk', ('b',)) : 'b',
    ('branch', 'data') : 9,
    (('new',),) : '1.new',
    (('x', 'y'), 'data') : 2,

You can also use "+=" operator to do this:
>>> tree1 = build_tree1()
>>> tree1 += tree2
>>> print tree1
  ** Tree **
    ('branch', 'data', 'extra') : ['extra', 'info'],
    ('branch', 'new') : 'attr.new ',
    ('trunk', ('x', 'y')) : '2.xy',
    ('trunk', ('x',), 'data') : 6,
    ('branch',) : '8',
    ('trunk', ('x', 'y'), 'data') : '7',
    (('x',), 'data') : '1',
    ('trunk',) : '4',
    ('trunk', ('new',), 'data') : '2.new.data',
    ('trunk', ('new',)) : ' 2.new',
    ('trunk', ('x',)) : 'two.x',
    (('x',),) : '1.x',
    ('trunk', ('a',)) : 'a test',
    () : '0',
    (('y',),) : 3,
    (('x', 'y'),) : '1xy',
    ('trunk', ('a',), 'data') : '5',
    ('trunk', ('b',)) : 'b',
    ('branch', 'data') : 9,
    (('new',),) : '1.new',
    (('x', 'y'), 'data') : 2,

Or use "+" to build a new Tree:
>>> tree3 = tree1 + tree2

6. Compare 2 Trees:
Compare the root node value first, and if equal, compare their sub nodes,
attributes or items ...

7. Some other functions that have not been implemented ...

============
So far I use this Tree as a direct configuration method. For instance,
in my another project cutils which has a tool to do realtime file system
mirroring, the configuration file is like this:
sh# cat /etc/python/fs_mirror
"""
The configuration file for fs_mirror
"""

import tree
options = tree.Tree('fs_mirror')
o = options

#o.mode = 'daemon'
# There can be other mode, such as "run_once"

#o.host = 'localhost'
#o.port = 2123
# The server runs mirrord
#o.timeout = 60

#o.daemon = None
#o.daemon.datadir = "/var/fs_mirror"
# The mirrored dirs/files will be put into this directory

#o.fssync = 'Report'
# The synchronization mechanism you choose, it can be one of:
#    FTP, NFS, SMB/CIFS, SSH, RSYNC, ...
#    REPORT only reports the actions in wmLog,
#        does not do actual dirs creating or files transport
###o.fssync.host = 'localhost'
# Must be the same as o.host?
#o.fssync.port = 0
#o.fssync.user = 'root'
#o.fssync.passwd = ''
# Since it contains passord, you'd better keep this configuration file
secret.
# You can put the id/password pair to a secret file, such as
#    ~/.fs_mirror/secret
#    $host:$password
#    and use --password-from ~/.fs_mirror/secret:$host
#        option of fs_mirror script to appoint the password
# By this way, another user can not view the passwords by run "ps"
#        when you appointed the it from the command line option.
# The reason a command line options is used is for multi-mirroring

And the script read and analyze this configuration file can be written
like this:
...
CONFIG_MODULE_PATH = "/etc/python"
...
    options = tree.Tree('fs_mirror')
    options.mode = 'daemon'
    options.host = 'localhost'
    options.port = 2123
    options.timeout = 60
    options.daemon = None
    options.daemon.datadir = "/var/fs_mirror"
    options.fssync = 'Report'
    options.fssync.port = 0
    options.fssync.user = 'root'
    options.fssync.passwd = ''

    # Update options from a configuration file:
    try:
        sys.path.insert (0, CONFIG_MODULE_PATH)
        try:
            import fs_mirror_config as config
            options('update', config.options)
        except ImportError:
            pass
        sys.path.pop(0)
    except tree.TreeExc, trexc:
        strerr = "Invalid option because of: %s" % trexc.msg
        print >> sys.stderr, strerr
        syslog.syslog(syslog.LOG_ERR, strerr)
        sys.exit (1)

    # Update options from command line parameters:
    opts, args = getopt.gnu_getopt(sys.argv[1:], 'm:h:p:t:o:v',
        ['mode=', 'host=', 'port=', 'timeout=', 'option=', 'datadir=',
'help', 'verbose', 'password-from='])
    if args:
        options.identities = args
    ov = []
    for o, v in opts:
        if o == '-m' or o == '--mode':
            options.mode = v
        elif o == '-h' or o == '--host':
            options.host = v
        elif o == '-p' or o == '--port':
            options.port = int(v)
        elif o == '-t' or o == '--timeout':
            options.timeout = int(v)
        elif o == "--datadir":
            options.daemon.datadir = v
        elif o == '-o' or o == '--option':
            ov.append(v)
        elif o == '--help':
            print usage
            sys.exit(0)
        elif o == '-v' or o == '--verbose':
            fs_mirror.debug = 1
            fs_sync.debug = 1
        elif o == '--password-from':
            try:
                fname, hostid = v.split(':')
            except ValueError:
                raise ValueError, "Invalid secret-file:id pair for
--passwd-from"
            for line in open(fname, 'r'):
                line = line.strip()
                try:
                    id, passwd = line.split(':')
                    if id == hostid:
                        options.fssync.passwd = passwd
                        break
                except ValueError:
                    raise ValueError, "Invalid id:password pair record '%s'"
% line
    options('update', oparse(ov))

You can find the document of "cutils" project at:
http://crablfs.sourceforge.net <http://crablfs.sourceforge.net/tree.html>
/#ru_data_man
and download it from:
http://sourceforge.net/projects/crablfs
 <http://crablfs.sourceforge.net/tree.html>
But I think this Tree structure can also be used in other ways. For
example, I think it's possible to implement an API to read XML into such
structure ...

============
Luckily the basic definition has been finished now, you can also download
it from:
http://sourceforge.net/projects/crablfs
the document is in the code and unittest. And I have put the HTML at:
http://crablfs.sourceforge.net/tree.html

The source code is in SVN:
https://crablfs.svn.sourceforge.net/svnroot/crablfs/caxes/trunk/<https://crablfs.svn.sourceforge.net/svnroot/crablfs/caxes/trunk/lib/>
the files are tree.py and test_tree.py.

I hope this can be useful to you. Maybe I can write a PEP? Looking
forward to your suggestions and advices...

I'm not so expericenced, and this is only a partial finished work, since
I can only squash my limited spare time to do this and there is several
projects I have started. There are many things to learn, and now I'm
learning from you '')

Thank you.

-- 
------------------------------------------------------------------------
My Projects:
http://sourceforge.net/projects/crablfs
http://crablfs.sourceforge.net/
http://crablfs.sourceforge.net/#ru_data_man
http://crablfs.sourceforge.net/tree.html
http://cralbfs.sourceforge.net/sysadm_zh_CN.html
My Blog:
http://chowroc.blogspot.com/
http://hi.baidu.com/chowroc_z/
Looking for a space and platform to exert my originalities (for my
projects)...
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-list/attachments/20071106/5706807f/attachment.html>


More information about the Python-list mailing list