[Python-Dev] Google Summer of Code proposal: New class for work with binary trees AVL and RB as with the standard dictionary.

Josiah Carlson jcarlson at uci.edu
Mon Apr 24 20:51:30 CEST 2006


There exists various C and Python implementations of both AVL and
Red-Black trees.  For users of Python who want to use AVL and/or
Red-Black trees, I would urge them to use the Python implementations. 
In the case of *needing* the speed of a C extension, there already
exists a CPython extension module for AVL trees:
    http://www.python.org/pypi/pyavl/1.1

I would suggest you look through some suggested SoC projects in the
wiki:
    http://wiki.python.org/moin/SummerOfCode

 - Josiah

"Vladimir 'Yu' Stepanov" <vys at renet.ru> wrote:
> 
> I would like to participate in Google Summer of Code. The idea consists in
> creation of a specialized class for work with binary trees AVL and RB. The
> part of ideas is taken from Parrot (Perl6) where for pair values the
> specialized type is stipulated. As advantages it is possible to note:
>   * High speed. On simple types at quantity of elements up to 10000 speed of
>     work concedes to the standard dictionary of all in 2.7 times.
>   * Safety of work in a multithread operating mode.
>   * The method for frosts of object is stipulated.
>   * The data storage is carried out in the sorted kind.
>   * A high overall performance `item' and `iteritem' methods as it is not
>     necessary to create new objects.
>   * At lots of objects in a tree memory is used much less intensively.
>   * Absence of collisions. As consequence, it is impossible to generate bad
>     data for creation DoS of the attacks influencing the dictionary of
>     transferred arguments.
>   * For objects existence of a method __ hash __ is not necessary.
>   * The same kind of the dictionary by means of the overloaded operations
>     probably change of its behaviour so that it supported the 
> multivariational
>     representation based on stratification given and subsequent recoils.
> 
> Additional requirements to key objects:
>   * The opportunity of comparison of key objects function cmp is necessary.
> 
> There are the reasons, braking development of this project:
>   * Compulsion of heading of a module `gc' for each compound object (this
>     structure is unessential to some objects, but updating `gc' the module
>     is required).
>   * Absence of standard object the pair, probably depends on the 
> previous item.
> 
> Lacks of a binary tree:
>   * Average time of search for hash - O (1), and for trees - O (log2 N).
>   * A lot of memory under a single element of a tree:
>     (3*sizeof (void *) + sizeof (int))*2,
>     one element is used rather - the pair, the second site of memory is
>     allocated under node of a tree.
> 
> In protection of object "pair":
>   * The logic of methods of the given object noticeably is easier than 
> tuple,
>     that as a result can affect speed of work of the program in the best 
> party.
>   * Alignment on 8 bytes has no big sense at present architecture where in
>     cache sample a minimum on 64 bytes is made. Use of type "pair" will give
>     an appreciable prize at use of alignment in 4 bytes. Otherwise on 64-bit
>     platforms it is much more favourable to use tuple object.
> 
> The given project can demand processing of the module `gcmodule.c' and
> `tupleobject.c'. It is necessary to reduce the size of static objects, 
> for this
> purpose the opportunity is necessary is transparent to pass objects not 
> having
> direct support from the module `gcmodule.c'.
> 
> Also it will be partially necessary to process the module `obmalloc.c' 
> for more
> effective distribution of memory.
> 
> I shall be glad to answer questions on this theme.
> 
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/jcarlson%40uci.edu



More information about the Python-Dev mailing list