[pypy-commit] pypy default: Added rstrategies (from https://github.com/antongulenko/rstrategies).

anton_gulenko noreply at buildbot.pypy.org
Wed Apr 22 12:20:40 CEST 2015


Author: Anton Gulenko <anton.gulenko at googlemail.com>
Branch: 
Changeset: r76867:25ca52e41849
Date: 2015-02-15 19:15 +0100
http://bitbucket.org/pypy/pypy/changeset/25ca52e41849/

Log:	Added rstrategies (from
	https://github.com/antongulenko/rstrategies).

diff --git a/rpython/rlib/rstrategies/README.md b/rpython/rlib/rstrategies/README.md
new file mode 100644
--- /dev/null
+++ b/rpython/rlib/rstrategies/README.md
@@ -0,0 +1,101 @@
+# rstrategies
+
+A library to implement storage strategies in VMs based on the RPython toolchain.
+rstrategies can be used in VMs for any language or language family.
+
+This library has been developed as part of a Master's thesis by [Anton Gulenko](https://github.com/antongulenko).
+
+The original paper describing the optimization "Storage Strategies for collections in dynamically typed languages" by C.F. Bolz, L. Diekmann and L. Tratt can be found [here](http://stups.hhu.de/mediawiki/images/3/3b/Pub-BoDiTr13_246.pdf).
+
+So far, this library has been adopted by three VMs: [RSqueak](https://github.com/HPI-SWA-Lab/RSqueak), [Topaz](https://github.com/topazproject/topaz) ([Forked here](https://github.com/antongulenko/topaz/tree/rstrategies)) and [Pycket](https://github.com/samth/pycket) ([Forked here](https://github.com/antongulenko/pycket/tree/rstrategies)).
+
+#### Concept
+
+Collections are often used homogeneously, i.e. they contain only objects of the same type.
+Primitive numeric types like ints or floats are especially interesting for optimization.
+These cases can be optimized by storing the unboxed data of these objects in consecutive memory.
+This is done by letting a special "strategy" object handle the entire storage of a collection.
+The collection object holds two separate references: one to its strategy and one to its storage.
+Every operation on the collection is delegated to the strategy, which accesses the storage when needed.
+The strategy can be switched to a more suitable one, which might require converting the storage array.
+
+## Usage
+
+The following are the steps needed to integrate rstrategies into an RPython VM.
+Because of the special nature of this library, it is not enough to simply call some API methods; the library must be integrated within existing VM classes using a metaclass, mixins and other meta-programming techniques.
+
+The sequence of steps described here is something like a "setup walkthrough", and might be a bit abstract.
+To see a concrete example, look at [AbstractShadow](https://github.com/HPI-SWA-Lab/RSqueak/blob/d5ff2572106d23a5246884de6f8b86f46d85f4f7/spyvm/storage.py#L73), [StrategyFactory](https://github.com/HPI-SWA-Lab/RSqueak/blob/d5ff2572106d23a5246884de6f8b86f46d85f4f7/spyvm/storage.py#L126) and [W_PointersObject](https://github.com/HPI-SWA-Lab/RSqueak/blob/d5ff2572106d23a5246884de6f8b86f46d85f4f7/spyvm/model.py#L565) from the [RSqueak VM](https://github.com/HPI-SWA-Lab/RSqueak).
+The code is also well commented.
+
+#### Basics
+
+Currently the rstrategies library supports fixed-size and variable-size collections.
+This can be used to optimize a wide range of primitive data structures like arrays, lists or regular objects.
+All of these are called 'collections' in this context.
+The VM should have a central class or class hierarchy for collections.
+In order to extend these classes and use strategies, the library needs accessor methods for two attributes of collection objects: strategy and storage.
+The easiest way is adding the following line to the body of the root collection class:
+```
+rstrategies.make_accessors(strategy='strategy', storage='storage')
+```
+This will generate the 4 accessor methods ```_[get/set]_[storage/strategy]()``` for the respective attributes.
+Alternatively, implement these methods manually or override the getters/setters in ```StrategyFactory```.
+
+Next, the strategy classes must be defined. This requires a small class hierarchy with a dedicated root class.
+In the definition of this root class, include the following lines:
+```
+    __metaclass__ = rstrategies.StrategyMetaclass
+    import_from_mixin(rstrategies.AbstractStrategy)
+    import_from_mixin(rstrategies.SafeIndexingMixin)
+```
+
+```import_from_mixin``` can be found in ```rpython.rlib.objectmodel```.
+If index-checking is performed safely at other places in the VM, you can use ```rstrategies.UnsafeIndexingMixin``` instead.
+If you need your own metaclass, you can combine yours with the rstrategies one using multiple inheritance [like here](https://github.com/HPI-SWA-Lab/RSqueak/blob/d5ff2572106d23a5246884de6f8b86f46d85f4f7/spyvm/storage_contexts.py#L24).
+Also implement a ```storage_factory()``` method that returns an instance of ```rstrategies.StrategyFactory```, which is described below.
+
+#### Strategy classes
+
+Now you can create the actual strategy classes, subclassing them from the single root class.
+The following list summarizes the basic strategies available.
+* ```EmptyStrategy```
+    A strategy for empty collections; very efficient, but limited. Does not allocate anything.
+* ```SingleValueStrategy```
+    A strategy for collections containing the same object ```n``` times. Only allocates memory to store the size of the collection.
+* ```GenericStrategy```
+    A non-optimized strategy backed by a generic Python list. This is the fallback strategy, since it can store everything, but is not optimized.
+* ```WeakGenericStrategy```
+    Like ```GenericStrategy```, but uses ```weakref``` to hold on weakly to its elements.
+* ```SingleTypeStrategy```
+    Can store a single unboxed type like int or float. This is the main 
+* ```TaggingStrategy```
+    Extension of SingleTypeStrategy. Uses a specific value in the value range of the unboxed type to represent
+    one additional, arbitrary object.
+
+There are also intermediate classes, which allow creating new, more customized strategies. For this, you should get familiar with the code.
+
+Include one of these mixin classes using ```import_from_mixin```.
+The mixin classes contain comments describing methods or fields which are also required in the strategy class in order to use them.
+Additionally, add the ```@rstrategies.strategy(generalize=alist)``` decorator to all strategy classes.
+The ```alist``` parameter must contain all strategies that the decorated strategy can switch to when it can no longer represent a new element.
+[Example](https://github.com/HPI-SWA-Lab/RSqueak/blob/d5ff2572106d23a5246884de6f8b86f46d85f4f7/spyvm/storage.py#L87) for an implemented strategy.
+See the other strategy classes behind this link for more examples.
+
+#### Strategy Factory
+
+The last part is subclassing ```rstrategies.StrategyFactory```, overriding the method ```instantiate_strategy``` if necessary and passing the strategies root class to the constructor.
+The factory provides the methods ```switch_strategy```, ```set_initial_strategy``` and ```strategy_type_for```, which the VM code can use to drive the mechanism behind strategies.
+See the comments in the source code.
+
+The strategy mixins offer the following methods to manipulate the contents of the collection:
+* basic API
+    * ```size```
+* fixed size API
+    * ```store```, ```fetch```, ```slice```, ```store_all```, ```fetch_all```
+* variable size API
+    * ```insert```, ```delete```, ```append```, ```pop```
+
+If the collection has a fixed size, simply never use any of the variable size methods in the VM code.
+Since the strategies are singletons, these methods need the collection object as first parameter.
+For convenience, more fitting accessor methods should be implemented on the collection class itself.
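
The concept described in the README (a collection holding one reference to a singleton strategy and one to its storage, with every operation delegated to the strategy) can be sketched in plain Python. This is only an illustration and does not use the rstrategies API: the names `IntStrategy`, `GenericStrategy`, `Collection`, `INT` and `GENERIC`, and the use of `array` for unboxed storage, are assumptions made for this example.

```python
from array import array


class GenericStrategy(object):
    """Fallback strategy: stores arbitrary objects in a plain list."""

    def fetch(self, coll, index):
        return coll.storage[index]

    def store(self, coll, index, value):
        coll.storage[index] = value


class IntStrategy(object):
    """Keeps unboxed machine integers in consecutive memory."""

    def initial_storage(self, size):
        return array('l', [0] * size)

    def fetch(self, coll, index):
        return coll.storage[index]

    def store(self, coll, index, value):
        if type(value) is int:
            try:
                coll.storage[index] = value
                return
            except OverflowError:
                pass  # too large for a machine word, fall through
        # This strategy cannot represent the value: generalize, then retry.
        coll.switch_strategy(GENERIC)
        coll.strategy.store(coll, index, value)


# Strategies are singletons, so they take the collection as first parameter.
GENERIC = GenericStrategy()
INT = IntStrategy()


class Collection(object):
    """Holds two separate references: one to its strategy, one to its storage."""

    def __init__(self, size):
        self.strategy = INT
        self.storage = INT.initial_storage(size)

    def switch_strategy(self, new_strategy):
        # Convert the unboxed array into a list of boxed objects.
        # (Sufficient here because GENERIC is the only switch target.)
        self.storage = list(self.storage)
        self.strategy = new_strategy

    def fetch(self, index):
        return self.strategy.fetch(self, index)

    def store(self, index, value):
        self.strategy.store(self, index, value)


c = Collection(3)
c.store(0, 42)     # still handled by the optimized strategy
c.store(1, "x")    # forces the switch to the generic strategy
```

After the second `store`, `c.strategy` is `GENERIC` and the previously stored integer is still readable through `c.fetch(0)`. In the real library the switch targets are declared per strategy rather than hard-coded as they are here.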
diff --git a/rpython/rlib/rstrategies/__init__.py b/rpython/rlib/rstrategies/__init__.py
new file mode 100644
--- /dev/null
+++ b/rpython/rlib/rstrategies/__init__.py
@@ -0,0 +1,1 @@
+# Empty
diff --git a/rpython/rlib/rstrategies/logger.py b/rpython/rlib/rstrategies/logger.py
new file mode 100644
--- /dev/null
+++ b/rpython/rlib/rstrategies/logger.py
@@ -0,0 +1,54 @@
+
+class LogEntry(object):
+    def __init__(self):
+        self.slots = 0
+        self.objects = 0
+        self.element_typenames = {}
+        
+    def add(self, size, element_typename):
+        self.slots += size
+        self.objects += 1
+        if element_typename:
+            self.element_typenames[element_typename] = None
+    
+    def classnames(self):
+        return self.element_typenames.keys()
+
+class Logger(object):
+    _attrs_ = ["active", "aggregate", "logs"]
+    _immutable_fields_ = ["active?", "aggregate?", "logs"]
+    
+    def __init__(self):
+        self.active = False
+        self.aggregate = False
+        self.logs = {}
+    
+    def activate(self, aggregate=False):
+        self.active = True
+        self.aggregate = self.aggregate or aggregate
+    
+    def log(self, new_strategy, size, cause="", old_strategy="", typename="", element_typename=""):
+        if self.aggregate:
+            key = (cause, old_strategy, new_strategy, typename)
+            if key not in self.logs:
+                self.logs[key] = LogEntry()
+            entry = self.logs[key]
+            entry.add(size, element_typename)
+        else:
+            element_typenames = [ element_typename ] if element_typename else []
+            self.output(cause, old_strategy, new_strategy, typename, size, 1, element_typenames)
+    
+    def print_aggregated_log(self):
+        if not self.aggregate:
+            return
+        for key, entry in self.logs.items():
+            cause, old_strategy, new_strategy, typename = key
+            slots, objects, element_typenames = entry.slots, entry.objects, entry.classnames()
+            self.output(cause, old_strategy, new_strategy, typename, slots, objects, element_typenames)
+    
+    def output(self, cause, old_strategy, new_strategy, typename, slots, objects, element_typenames):
+        old_strategy_string = "%s -> " % old_strategy if old_strategy else ""
+        classname_string = " of %s" % typename if typename else ""
+        element_string = (" elements: " + " ".join(element_typenames)) if element_typenames else ""
+        format = (cause, old_strategy_string, new_strategy, classname_string, slots, objects, element_string)
+        print "%s (%s%s)%s size %d objects %d%s" % format
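
The line format written by `Logger.output` is what logparser.py (later in this changeset) reads back with its `line_pattern` expression. The following is a minimal sketch of that round trip. `format_log_line` is a hypothetical helper that mirrors the string building in `output` but returns the line instead of printing it; the cause `Switched` and the type names `Array` and `String` are made-up sample values.

```python
import re

# Same expression as `line_pattern` in logparser.py, written as a raw string.
line_pattern = re.compile(r"^(?P<operation>\w+) \(((?P<old>\w+) -> )?(?P<new>\w+)\)( of (?P<classname>.+))? size (?P<size>[0-9]+)( objects (?P<objects>[0-9]+))?( elements: (?P<classnames>.+( .+)*))?$")


def format_log_line(cause, old_strategy, new_strategy, typename,
                    slots, objects, element_typenames):
    # Hypothetical helper: same formatting as Logger.output, but returns
    # the line so that it can be inspected.
    old_strategy_string = "%s -> " % old_strategy if old_strategy else ""
    classname_string = " of %s" % typename if typename else ""
    element_string = (" elements: " + " ".join(element_typenames)) if element_typenames else ""
    return "%s (%s%s)%s size %d objects %d%s" % (
        cause, old_strategy_string, new_strategy, classname_string,
        slots, objects, element_string)


# A transition between two strategies (sample values).
switched = format_log_line("Switched", "AllNilStrategy", "ListStrategy",
                           "Array", 10, 1, ["String"])
# A creation event: no old strategy, no type name, no element types.
created = format_log_line("Initialized", "", "AllNilStrategy", "", 4, 2, [])

match = line_pattern.match(switched)
fields = match.groupdict()
```

Here `switched` is `Switched (AllNilStrategy -> ListStrategy) of Array size 10 objects 1 elements: String`, and `fields['old']`, `fields['new']` and `fields['size']` are `'AllNilStrategy'`, `'ListStrategy'` and `'10'`. For `created`, the `old` group does not participate in the match, which is how the parser recognizes a storage source.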
diff --git a/rpython/rlib/rstrategies/logparser.py b/rpython/rlib/rstrategies/logparser.py
new file mode 100644
--- /dev/null
+++ b/rpython/rlib/rstrategies/logparser.py
@@ -0,0 +1,685 @@
+
+import re, os, sys, operator
+
+"""
+This script parses a log produced by logger.py into a graph and converts it to various outputs.
+The most useful outputs are the dot* commands producing a visualization of the log using the dot-command of graphviz.
+Every strategy is a node in the graph, and the edges are collections or objects that transition between
+two strategies at some point during the log.
+Artificial nodes are created for log entries without an explicit source node. These are the events when a
+collection is created.
+The input to this script is a logfile, a command and optional flags.
+If the name of the logfile includes one of the AVAILABLE_VMS as a substring, the first three global variables
+are automatically configured.
+The script should work without these configurations, but the output will probably not be that pretty.
+To avoid errors, use the -a flag when running without proper configuration.
+"""
+
+# This should contain a full list of storage nodes (strategies).
+# All strategies not included here will be combined into a single "Other"-node, if the -a flag is not given.
+STORAGE_NODES = []
+
+# This allows arbitrary renamings of storage strategy nodes
+NODE_RENAMINGS = {}
+
+# Artificial storage-source nodes are automatically named like the associated operation.
+# This dict allows customizing the names of these nodes.
+STORAGE_SOURCES = {}
+
+def SET_VM(vm_name):
+    global STORAGE_NODES
+    global NODE_RENAMINGS
+    global STORAGE_SOURCES
+    if vm_name == 'RSqueak':
+        STORAGE_NODES = ['List', 'WeakList', 'SmallIntegerOrNil', 'FloatOrNil', 'AllNil']
+        NODE_RENAMINGS = dict((x+'Strategy', x) for x in STORAGE_NODES)
+        STORAGE_SOURCES = {'Filledin': 'Image Loading', 'Initialized': 'Object Creation'}
+    elif vm_name == 'Pycket':
+        STORAGE_SOURCES = {'Created': 'Array Creation'}
+        # TODO
+    elif vm_name == 'Topaz':
+        # TODO
+        pass
+    else:
+        raise Exception("Unhandled vm name %s" % vm_name)
+
+AVAILABLE_VMS = ['RSqueak', 'Pycket', 'Topaz']
+
+# ====================================================================
+# ======== Logfile parsing
+# ====================================================================
+
+def percent(part, total):
+    if total == 0:
+        return 0
+    return float(part)*100 / total
+
+def parse(filename, flags, callback):
+    parsed_entries = 0
+    if filename == "-":
+        opener = lambda: sys.stdin
+    else:
+        opener = lambda: open(filename, 'r', 1)
+    with opener() as file:
+        while True:
+            line = file.readline()
+            if len(line) == 0:
+                break
+            entry = parse_line(line, flags)
+            if entry:
+                parsed_entries += 1
+                callback(entry)
+    return parsed_entries
+
+line_pattern = re.compile(r"^(?P<operation>\w+) \(((?P<old>\w+) -> )?(?P<new>\w+)\)( of (?P<classname>.+))? size (?P<size>[0-9]+)( objects (?P<objects>[0-9]+))?( elements: (?P<classnames>.+( .+)*))?$")
+
+def parse_line(line, flags):
+    result = line_pattern.match(line)
+    if result is None:
+        if flags.verbose:
+            print "Could not parse line: %s" % line[:-1]
+        return None
+    operation = str(result.group('operation'))
+    old_storage = result.group('old')
+    new_storage = str(result.group('new'))
+    classname = str(result.group('classname'))
+    size = int(result.group('size'))
+    objects = result.group('objects')
+    objects = int(objects) if objects else 1
+    classnames = result.group('classnames')
+    if classnames is not None:
+        classnames = classnames.split(' ')
+        classnames = set(classnames)
+    else:
+        classnames = set()
+    
+    is_storage_source = old_storage is None
+    if is_storage_source:
+        if operation in STORAGE_SOURCES:
+            old_storage = STORAGE_SOURCES[operation]
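+        else:
+            # No configured name: name the artificial source node after the operation.
+            print "Using operation %s as storage source." % operation
+            old_storage = operation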
+    old_storage = str(old_storage)
+    
+    if new_storage in NODE_RENAMINGS:
+        new_storage = NODE_RENAMINGS[new_storage]
+    if old_storage in NODE_RENAMINGS:
+        old_storage = NODE_RENAMINGS[old_storage]
+    
+    return LogEntry(operation, old_storage, new_storage, classname, size, objects, classnames, is_storage_source)
+
+class LogEntry(object):
+    
+    def __init__(self, operation, old_storage, new_storage, classname, size, objects, classnames, is_storage_source):
+        self.operation = operation
+        self.old_storage = old_storage
+        self.new_storage = new_storage
+        self.classname = classname
+        self.size = size
+        self.objects = objects
+        self.classnames = classnames
+        self.is_storage_source = is_storage_source
+        assert old_storage != new_storage, "old and new storage identical in log entry: %s" % self
+    
+    def full_key(self):
+        return (self.operation, self.old_storage, self.new_storage)
+    
+    def __lt__(self, other):
+        return self.classname < other.classname
+    
+    def __repr__(self):
+        return "%s(%s)" % (self.__str__(), object.__repr__(self))
+    
+    def __str__(self):
+        old_storage_string = "%s -> " % self.old_storage if self.old_storage else ""
+        classname_string = " of %s" % self.classname if self.classname else ""
+        objects_string = " objects %d" % self.objects if self.objects > 1 else ""
+        return "%s (%s%s)%s size %d%s" % (self.operation, old_storage_string, self.new_storage, classname_string, self.size, objects_string)
+
+# ====================================================================
+# ======== Graph parsing
+# ====================================================================
+
+class Operations(object):
+    
+    def __init__(self, objects=0, slots=0, element_classnames=[]):
+        self.objects = objects
+        self.slots = slots
+        self.element_classnames = set(element_classnames)
+    
+    def __str__(self, total=None):
+        if self.objects == 0:
+            avg_slots = 0
+        else:
+            avg_slots = float(self.slots) / self.objects
+        if total is not None and total.slots != 0:
+            percent_slots = " (%.1f%%)" % percent(self.slots, total.slots)
+        else:
+            percent_slots = ""
+        if total is not None and total.objects != 0:
+            percent_objects = " (%.1f%%)" % percent(self.objects, total.objects)
+        else:
+            percent_objects = ""
+        slots = format(self.slots, ",d")
+        objects = format(self.objects, ",d")
+        classnames = (" [ elements: %s ]" % ' '.join([str(x) for x in self.element_classnames])) \
+                                    if len(self.element_classnames) else ""
+        return "%s%s slots in %s%s objects (avg size: %.1f)%s" % (slots, percent_slots, objects, percent_objects, avg_slots, classnames)
+    
+    def __repr__(self):
+        return "%s(%s)" % (self.__str__(), object.__repr__(self))
+    
+    def add_log_entry(self, entry):
+        self.slots = self.slots + entry.size
+        self.objects = self.objects + entry.objects
+        self.element_classnames |= entry.classnames
+    
+    def __sub__(self, other):
+        return Operations(self.objects - other.objects, self.slots - other.slots)
+    
+    def __add__(self, other):
+        return Operations(self.objects + other.objects, self.slots + other.slots)
+    
+    def __lt__(self, other):
+        return self.slots < other.slots
+    
+    def empty(self):
+        return self.objects == 0 and self.slots == 0
+    
+    def prefixprint(self, key="", total=None):
+        if not self.empty():
+            print "%s%s" % (key, self.__str__(total))
+    
+class ClassOperations(object):
+    
+    def __init__(self):
+        self.classes = {}
+    
+    def cls(self, name):
+        if name not in self.classes:
+            self.classes[name] = Operations()
+        return self.classes[name]
+    
+    def total(self):
+        return reduce(operator.add, self.classes.values(), Operations())
+    
+    def __str__(self):
+        return "ClassOperations(%s)" % self.classes
+    
+    def __repr__(self):
+        return "%s(%s)" % (self.__str__(), object.__repr__(self))
+    
+    def __add__(self, other):
+        result = ClassOperations()
+        result.classes = dict(self.classes)
+        for classname, other_class in other.classes.items():
+            result.cls(classname) # Make sure exists.
+            result.classes[classname] += other_class
+        return result
+    
+    def __sub__(self, other):
+        result = ClassOperations()
+        result.classes = dict(self.classes)
+        for classname, other_class in other.classes.items():
+            result.cls(classname) # Make sure exists.
+            result.classes[classname] -= other_class
+        return result
+    
+class StorageEdge(object):
+    
+    def __init__(self, operation="None", origin=None, target=None):
+        self.operation = operation
+        self.classes = ClassOperations()
+        self.origin = origin
+        self.target = target
+        self.is_storage_source = False
+    
+    def full_key(self):
+        return (self.operation, self.origin.name, self.target.name)
+    
+    def cls(self, classname):
+        return self.classes.cls(classname)
+    
+    def total(self):
+        return self.classes.total()
+    
+    def notify_nodes(self):
+        self.origin.note_outgoing(self)
+        self.target.note_incoming(self)
+    
+    def add_log_entry(self, entry):
+        self.cls(entry.classname).add_log_entry(entry)
+        if entry.is_storage_source:
+            self.is_storage_source = True
+    
+    def as_log_entries(self):
+        entries = []
+        for classname, ops in self.classes.classes.items():
+            origin = None if self.is_storage_source else self.origin.name
+            entry = LogEntry(self.operation, origin, self.target.name, classname,
+                            ops.slots, ops.objects, ops.element_classnames, self.is_storage_source)
+            entries.append(entry)
+        return entries
+    
+    def __lt__(self, other):
+        return self.full_key() < other.full_key()
+    
+    def __str__(self):
+        return "[%s %s -> %s]" % (self.operation, self.origin, self.target)
+    
+    def __repr__(self):
+        return "%s(%s)" % (self.__str__(), object.__repr__(self))
+    
+    def __add__(self, other):
+        origin = self.origin if self.origin is not None else other.origin
+        target = self.target if self.target is not None else other.target
+        result = StorageEdge(self.operation, origin, target)
+        result.classes += self.classes + other.classes
+        return result
+    
+    def __sub__(self, other):
+        origin = self.origin if self.origin is not None else other.origin
+        target = self.target if self.target is not None else other.target
+        result = StorageEdge(self.operation, origin, target)
+        result.classes += self.classes - other.classes
+        return result
+    
+class StorageNode(object):
+    
+    def __init__(self, name):
+        self.name = name
+        self.incoming = set()
+        self.outgoing = set()
+    
+    def note_incoming(self, edge):
+        assert edge.target is self
+        if edge not in self.incoming:
+            self.incoming.add(edge)
+        
+    def note_outgoing(self, edge):
+        assert edge.origin is self
+        if edge not in self.outgoing:
+            self.outgoing.add(edge)
+        
+    def incoming_edges(self, operation):
+        return filter(lambda x: x.operation == operation, self.incoming)
+    
+    def outgoing_edges(self, operation):
+        return filter(lambda x: x.operation == operation, self.outgoing)
+    
+    def sum_incoming(self, operation):
+        return reduce(operator.add, self.incoming_edges(operation), StorageEdge(operation))
+        
+    def sum_outgoing(self, operation):
+        return reduce(operator.add, self.outgoing_edges(operation), StorageEdge(operation))
+    
+    def sum_all_incoming(self):
+        return reduce(operator.add, self.incoming, StorageEdge())
+    
+    def sum_all_outgoing(self):
+        return reduce(operator.add, self.outgoing, StorageEdge())
+    
+    def __str__(self):
+        return self.name
+    
+    def __repr__(self):
+        return "%s(%s)" % (self.__str__(), object.__repr__(self))
+    
+    def merge_edge_sets(self, set1, set2, key_slot):
+        getter = lambda edge: edge.__dict__[key_slot]
+        set_dict = dict([(getter(edge), edge) for edge in set1])
+        for edge in set2:
+            key = getter(edge)
+            if key not in set_dict:
+                set_dict[key] = edge
+            else:
+                set_dict[key] += edge
+        return set(set_dict.values())
+    
+    def __add__(self, other):
+        result = StorageNode("%s %s" % (self.name, other.name))
+        result.incoming = self.merge_edge_sets(self.incoming, other.incoming, "origin")
+        # TODO bad code
+        for edge in result.incoming:
+            edge.target = result
+        result.outgoing = self.merge_edge_sets(self.outgoing, other.outgoing, "target")
+        for edge in result.outgoing:
+            edge.origin = result
+        return result
+    
+    def __lt__(self, other):
+        return self.name < other.name
+    
+    def is_artificial(self):
+        for outgoing in self.outgoing:
+            if outgoing.is_storage_source:
+                return True
+        return False
+    
+    def is_storage_node(self):
+        return self.is_artificial() or self.name in STORAGE_NODES
+    
+    def dot_name(self):
+        return self.name.replace(" ", "_")
+    
+class StorageGraph(object):
+    
+    def __init__(self):
+        self.nodes = {}
+        self.edges = {}
+        self.operations = set()
+    
+    def node(self, name):
+        if str(name) == 'None':
+            import pdb; pdb.set_trace()
+        if name not in self.nodes:
+            self.nodes[name] = StorageNode(name)
+        return self.nodes[name]
+    
+    def assert_sanity(self):
+        visited_edges = set()
+        for node in self.nodes.values():
+            for edge in node.incoming:
+                assert edge in self.edges.values(), "Edge not in graph's edges: %s" % edge
+                visited_edges.add(edge)
+                if not edge.target is node:
+                    print "Wrong edge target: %s\nIncoming edge: %s\nIn node: %s" % (edge.target, edge, node)
+                    assert False
+                if not edge in edge.origin.outgoing:
+                    print "Edge not in origin's outgoing: %s\nIncoming edge: %s\nIn node: %s" % (edge.origin.outgoing, edge, node)
+                    assert False
+            for edge in node.outgoing:
+                assert edge in self.edges.values(), "Edge not in graph's edges: %s" % edge
+                visited_edges.add(edge)
+                if not edge.origin is node:
+                    print "Wrong edge origin: %s\nOutgoing edge: %s\nIn node: %s" % (edge.origin, edge, node)
+                    assert False
+                if not edge in edge.target.incoming:
+                    print "Edge not in target's incoming: %s\nOutgoing edge: %s\nIn node: %s" % (edge.target.incoming, edge, node)
+                    assert False
+        assert len(visited_edges) == len(self.edges.values()), "Not all of graph's edges visited."
+    
+    def add_log_entry(self, log_entry):
+        self.operations.add(log_entry.operation)
+        key = log_entry.full_key()
+        if key not in self.edges:
+            edge = StorageEdge(log_entry.operation, self.node(log_entry.old_storage), self.node(log_entry.new_storage))
+            self.edges[key] = edge
+            edge.notify_nodes()
+        self.edges[key].add_log_entry(log_entry)
+    
+    def collapse_nodes(self, collapsed_nodes, new_name=None):
+        if len(collapsed_nodes) == 0:
+            return
+        for node in collapsed_nodes:
+            del self.nodes[node.name]
+            for edge in node.incoming:
+                del self.edges[edge.full_key()]
+            for edge in node.outgoing:
+                del self.edges[edge.full_key()]
+        new_node = reduce(operator.add, collapsed_nodes)
+        if new_name is not None:
+            new_node.name = new_name
+        self.nodes[new_node.name] = new_node
+        # TODO bad code
+        for node in collapsed_nodes:
+            for edge in node.incoming:
+                edge.origin.outgoing.remove(edge)
+                new_edges = filter(lambda filtered: filtered.origin == edge.origin, new_node.incoming)
+                assert len(new_edges) == 1
+                edge.origin.outgoing.add(new_edges[0])
+            for edge in node.outgoing:
+                edge.target.incoming.remove(edge)
+                new_edges = filter(lambda filtered: filtered.target == edge.target, new_node.outgoing)
+                assert len(new_edges) == 1
+                edge.target.incoming.add(new_edges[0])
+        for edge in new_node.incoming:
+            self.edges[edge.full_key()] = edge
+        for edge in new_node.outgoing:
+            self.edges[edge.full_key()] = edge
+        self.assert_sanity()
+    
+    def collapse_nonstorage_nodes(self, new_name=None):
+        nodes = filter(lambda x: not x.is_storage_node(), self.nodes.values())
+        self.collapse_nodes(nodes, new_name)
+    
+    def sorted_nodes(self):
+        nodes = self.nodes.values()
+        nodes.sort()
+        return nodes
+    
+def make_graph(logfile, flags):
+    graph = StorageGraph()
+    def callback(entry):
+        graph.add_log_entry(entry)
+    parse(logfile, flags, callback)
+    graph.assert_sanity()
+    return graph
+
+# ====================================================================
+# ======== Command - Summarize log content
+# ====================================================================
+
+def command_summarize(logfile, flags):
+    graph = make_graph(logfile, flags)
+    if not flags.allstorage:
+        graph.collapse_nonstorage_nodes()
+    for node in graph.sorted_nodes():
+        node.print_summary(flags, graph.operations)
+
+def StorageNode_print_summary(self, flags, all_operations):
+    print "\n%s:" % self.name
+    sum = StorageEdge()
+    total_incoming = self.sum_all_incoming().total() if flags.percent else None
+    
+    print "\tIncoming:"
+    for operation in all_operations:
+        if flags.detailed:
+            edges = [ (edge.origin.name, edge) for edge in self.incoming_edges(operation) ]
+        else:
+            edges = [ (operation, self.sum_incoming(operation)) ]
+        for edgename, edge in edges:
+            edge.print_with_name("\t\t\t", edgename, total_incoming, flags)
+            sum += edge
+    
+    print "\tOutgoing:"
+    for operation in all_operations:
+        if flags.detailed:
+            edges = [ (edge.target.name, edge) for edge in self.outgoing_edges(operation) ]
+        else:
+            edges = [ (operation, self.sum_outgoing(operation)) ]
+        for edgename, edge in edges:
+            edge.print_with_name("\t\t\t", edgename, total_incoming, flags)
+            sum -= edge
+    
+    sum.print_with_name("\t", "Remaining", total_incoming, flags)
+
+StorageNode.print_summary = StorageNode_print_summary
+
+def StorageEdge_print_with_name(self, prefix, edgename, total_reference, flags):
+    if flags.classes:   
+        print "%s%s:" % (prefix, edgename)
+        prefix += "\t\t"
+        operations = self.classes.classes.items()
+        operations.sort(reverse=True, key=operator.itemgetter(1))
+    else:
+        operations = [ (edgename, self.total()) ]
+    for classname, classops in operations:
+        classops.prefixprint("%s%s: " % (prefix, classname), total_reference)
+    
+StorageEdge.print_with_name = StorageEdge_print_with_name
+
+# ====================================================================
+# ======== Command - DOT output
+# ====================================================================
+
+# Output is valid dot code and can be parsed by the graphviz dot utility.
+def command_print_dot(logfile, flags):
+    graph = make_graph(logfile, flags)
+    print "/*"
+    print "Storage Statistics (dot format):"
+    print "================================"
+    print "*/"
+    print dot_string(graph, flags)
+
+def run_dot(logfile, flags, output_type):
+    import subprocess
+    dot = dot_string(make_graph(logfile, flags), flags)
+    command = ["dot", "-T%s" % output_type, "-o%s.%s" % (flags.logfile, output_type)]
+    print "Running:\n%s" % " ".join(command)
+    p = subprocess.Popen(command, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
+    output = p.communicate(input=dot)[0]
+    print output
+
+def command_dot(logfile, flags):
+    run_dot(logfile, flags, "jpg")
+def command_dot_ps(logfile, flags):
+    run_dot(logfile, flags, "ps")
+def command_dot_pdf(logfile, flags):
+    run_dot(logfile, flags, "pdf")
+def command_dot_svg(logfile, flags):
+    run_dot(logfile, flags, "svg")
+
+def dot_string(graph, flags):
+    result = "digraph G {"
+    incoming_cache = {}
+    if not flags.allstorage:
+        graph.collapse_nonstorage_nodes("Other")
+    
+    def make_label(edge, prefix="", total_edge=None, slots_per_object=False):
+        object_suffix = " objects"
+        slots_suffix = " slots"
+        if not flags.objects or not flags.slots:
+            object_suffix = slots_suffix = ""
+        if total_edge and flags.percent and total_edge.objects != 0:
+            percent_objects = " (%.1f%%)" % percent(edge.objects, total_edge.objects)
+            percent_slots = " (%.1f%%)" % percent(edge.slots, total_edge.slots)
+        else:
+            percent_objects = percent_slots = ""
+        label = ""
+        if flags.objects:
+            label += "%s%s%s%s<BR/>" % (prefix, format(edge.objects, ",.0f"), object_suffix, percent_objects)
+        if flags.slots:
+            label += "%s%s%s%s<BR/>" % (prefix, format(edge.slots, ",.0f"), slots_suffix, percent_slots)
+        if slots_per_object and flags.slotsPerObject:
+            label += "%.1f slots/object<BR/>" % (float(edge.slots) / edge.objects)
+        return label
+    
+    for node in graph.nodes.values():
+        incoming = node.sum_all_incoming().total()
+        outgoing = node.sum_all_outgoing().total()
+        remaining = incoming - outgoing
+        if node.is_artificial():
+            incoming_cache[node.name] = outgoing
+            shape = ",shape=box"
+            label = make_label(outgoing)
+        else:
+            incoming_cache[node.name] = incoming
+            shape = ""
+            label = make_label(incoming, "Incoming: ")
+            if remaining.objects != incoming.objects:
+                label += make_label(remaining, "Remaining: ", incoming)
+        result += "%s [label=<<B><U>%s</U></B><BR/>%s>%s];" % (node.dot_name(), node.name, label, shape)
+    
+    for edge in graph.edges.values():
+        total = edge.total()
+        incoming = incoming_cache[edge.origin.name]
+        label = make_label(total, "", incoming, slots_per_object=True)
+        target_node = edge.target.dot_name()
+        source_node = edge.origin.dot_name()
+        result += "%s -> %s [label=<%s>];" % (source_node, target_node, label)
+    
+    result += "}"
+    return result
+
+# ====================================================================
+# ======== Other commands
+# ====================================================================
+
+def command_aggregate(logfile, flags):
+    graph = make_graph(logfile, flags)
+    edges = graph.edges.values()
+    edges.sort()
+    for edge in edges:
+        logentries = edge.as_log_entries()
+        logentries.sort()
+        for entry in logentries:
+            print entry
+
+def command_print_entries(logfile, flags):
+    def callback(entry):
+        print entry
+    parse(logfile, flags, callback)
+
+# ====================================================================
+# ======== Main
+# ====================================================================
+
+class Flags(object):
+    
+    def __init__(self, flags):
+        self.flags = {}
+        for name, short in flags:
+            self.__dict__[name] = False
+            self.flags[short] = name
+    
+    def handle(self, arg):
+        if arg in self.flags:
+            self.__dict__[self.flags[arg]] = True
+            return True
+        else:
+            return False
+    
+    def __str__(self):
+        descriptions = [ ("%s (%s)" % description) for description in self.flags.items() ]
+        return "[%s]" % " | ".join(descriptions)
+    
+def usage(flags, commands):
+    print "Arguments: logfile command %s" % flags
+    print "Available commands: %s" % commands
+    sys.exit(1)
+
+def main(argv):
+    flags = Flags([
+        # General
+        ('verbose', '-v'),
+        
+        # All outputs
+        ('percent', '-p'),
+        ('allstorage', '-a'),
+        
+        # Text outputs
+        ('detailed', '-d'),
+        ('classes', '-c'),
+        
+        # dot outputs
+        ('slots', '-s'),
+        ('objects', '-o'),
+        ('slotsPerObject', '-S'),
+    ])
+    
+    command_prefix = "command_"
+    module = sys.modules[__name__].__dict__
+    commands = [ a[len(command_prefix):] for a in module.keys() if a.startswith(command_prefix) ]
+    
+    if len(argv) < 2:
+        usage(flags, commands)
+    logfile = argv[0]
+    flags.logfile = logfile
+    for vm_name in AVAILABLE_VMS:
+        if vm_name in logfile:
+            print "Using VM configuration %s" % vm_name
+            SET_VM(vm_name)
+            break
+    command = argv[1]
+    for flag in argv[2:]:
+        if not flags.handle(flag):
+            usage(flags, commands)
+    if command not in commands:
+        usage(flags, commands)
+    
+    func = module[command_prefix + command]
+    func(logfile, flags)
+
+if __name__ == "__main__":
+    main(sys.argv[1:])
diff --git a/rpython/rlib/rstrategies/rstrategies.py b/rpython/rlib/rstrategies/rstrategies.py
new file mode 100644
--- /dev/null
+++ b/rpython/rlib/rstrategies/rstrategies.py
@@ -0,0 +1,570 @@
+
+import weakref, sys
+from rpython.rlib.rstrategies import logger
+from rpython.rlib import jit, objectmodel, rerased
+from rpython.rlib.objectmodel import specialize
+
+def make_accessors(strategy='strategy', storage='storage'):
+    """
+    Instead of using this generator, the accessor methods can be implemented manually.
+    A third way is to override the getter/setter methods in StrategyFactory.
+    """
+    def make_getter(attr):
+        def getter(self): return getattr(self, attr)
+        return getter
+    def make_setter(attr):
+        def setter(self, val): setattr(self, attr, val)
+        return setter
+    classdef = sys._getframe(1).f_locals
+    classdef['_get_strategy'] = make_getter(strategy)
+    classdef['_set_strategy'] = make_setter(strategy)
+    classdef['_get_storage'] = make_getter(storage)
+    classdef['_set_storage'] = make_setter(storage)
+
+class StrategyMetaclass(type):
+    """
+    A metaclass is required, because certain attributes must be created
+    separately for every single strategy class.
+    """
+    def __new__(self, name, bases, attrs):
+        attrs['_is_strategy'] = False
+        attrs['_is_singleton'] = False
+        attrs['_specializations'] = []
+        # Not every strategy uses rerased-pairs, but they won't hurt
+        erase, unerase = rerased.new_erasing_pair(name)
+        def get_storage(self, w_self):
+            erased = self.strategy_factory().get_storage(w_self)
+            return unerase(erased)
+        def set_storage(self, w_self, storage):
+            erased = erase(storage)
+            self.strategy_factory().set_storage(w_self, erased)
+        attrs['get_storage'] = get_storage
+        attrs['set_storage'] = set_storage
+        return type.__new__(self, name, bases, attrs)
+    
+def strategy(generalize=None, singleton=True):
+    """
+    Strategy classes must be decorated with this.
+    generalize is a list of other strategies that the decorated strategy can switch to.
+    If the singleton flag is set to False, new strategy instances will be created,
+    instead of always reusing the singleton object.
+    """
+    def decorator(strategy_class):
+        # Patch strategy class: Add generalized_strategy_for and mark as strategy class.
+        if generalize:
+            @jit.unroll_safe
+            def generalized_strategy_for(self, value):
+                # TODO - optimize this method
+                for strategy in generalize:
+                    if self.strategy_factory().strategy_singleton_instance(strategy).check_can_handle(value):
+                        return strategy
+                raise Exception("Could not find generalized strategy for %s coming from %s" % (value, self))
+            strategy_class.generalized_strategy_for = generalized_strategy_for
+            for generalized in generalize:
+                generalized._specializations.append(strategy_class)
+        strategy_class._is_strategy = True
+        strategy_class._generalizations = generalize
+        strategy_class._is_singleton = singleton
+        return strategy_class
+    return decorator
+
+class StrategyFactory(object):
+    _immutable_fields_ = ["strategies[*]", "logger", "strategy_singleton_field"]
+    factory_instance_counter = 0
+    
+    def __init__(self, root_class, all_strategy_classes=None):
+        if all_strategy_classes is None:
+            all_strategy_classes = self.collect_subclasses(root_class)
+        self.strategies = []
+        self.logger = logger.Logger()
+        
+        # This is to avoid confusion between multiple factories existing simultaneously (e.g. in tests)
+        self.strategy_singleton_field = "__singleton_%i" % StrategyFactory.factory_instance_counter
+        StrategyFactory.factory_instance_counter += 1
+        
+        self.create_strategy_instances(root_class, all_strategy_classes)
+    
+    def create_strategy_instances(self, root_class, all_strategy_classes):
+        for strategy_class in all_strategy_classes:
+            if strategy_class._is_strategy:
+                setattr(strategy_class, self.strategy_singleton_field, self.instantiate_strategy(strategy_class))
+                self.strategies.append(strategy_class)
+            self.patch_strategy_class(strategy_class, root_class)
+        self.order_strategies()
+    
+    # =============================
+    # API methods
+    # =============================
+    
+    def switch_strategy(self, w_self, new_strategy_type, new_element=None):
+        """
+        Switch the strategy of w_self to the new type.
+        new_element can be given as a hint, purely for logging purposes.
+        It should be the object that was added to w_self, causing the strategy switch.
+        """
+        old_strategy = self.get_strategy(w_self)
+        if new_strategy_type._is_singleton:
+            new_strategy = self.strategy_singleton_instance(new_strategy_type)
+        else:
+            size = old_strategy.size(w_self)
+            new_strategy = self.instantiate_strategy(new_strategy_type, w_self, size)
+        self.set_strategy(w_self, new_strategy)
+        old_strategy.convert_storage_to(w_self, new_strategy)
+        new_strategy.strategy_switched(w_self)
+        self.log(w_self, new_strategy, old_strategy, new_element)
+        return new_strategy
+    
+    def set_initial_strategy(self, w_self, strategy_type, size, elements=None):
+        """
+        Initialize the strategy and storage fields of w_self.
+        This must be called before switch_strategy or any strategy method can be used.
+        elements is an optional list of values initially stored in w_self.
+        If given, then len(elements) == size must hold.
+        """
+        assert self.get_strategy(w_self) is None, "Strategy should not be initialized yet!"
+        if strategy_type._is_singleton:
+            strategy = self.strategy_singleton_instance(strategy_type)
+        else:
+            strategy = self.instantiate_strategy(strategy_type, w_self, size)
+        self.set_strategy(w_self, strategy)
+        strategy.initialize_storage(w_self, size)
+        element = None
+        if elements:
+            strategy.store_all(w_self, elements)
+            if len(elements) > 0: element = elements[0]
+        strategy.strategy_switched(w_self)
+        self.log(w_self, strategy, None, element)
+        return strategy
+    
+    @jit.unroll_safe
+    def strategy_type_for(self, objects):
+        """
+        Return the best-fitting strategy to hold all given objects.
+        """
+        specialized_strategies = len(self.strategies)
+        can_handle = [True] * specialized_strategies
+        for obj in objects:
+            if specialized_strategies <= 1:
+                break
+            for i, strategy in enumerate(self.strategies):
+                if can_handle[i] and not self.strategy_singleton_instance(strategy).check_can_handle(obj):
+                    can_handle[i] = False
+                    specialized_strategies -= 1
+        for i, strategy_type in enumerate(self.strategies):
+            if can_handle[i]:
+                return strategy_type
+        raise Exception("Could not find strategy to handle: %s" % objects)
+    
+    def decorate_strategies(self, transitions):
+        """
+        NOT_RPYTHON
+        As an alternative to decorating all strategies with @strategy,
+        invoke this in the constructor of your StrategyFactory subclass, before
+        calling __init__. transitions is a dict mapping all strategy classes to
+        their 'generalize' list parameter (see @strategy decorator).
+        """
+        for strategy_class, generalized in transitions.items():
+            strategy(generalized)(strategy_class)
+    
+    # =============================
+    # The following methods can be overridden to customize certain aspects of the factory.
+    # =============================
+    
+    def instantiate_strategy(self, strategy_type, w_self=None, initial_size=0):
+        """
+        Return a functional instance of strategy_type.
+        Override this if you need a non-default constructor.
+        The two additional parameters should be ignored for singleton-strategies.
+        """
+        return strategy_type()
+    
+    def log(self, w_self, new_strategy, old_strategy=None, new_element=None):
+        """
+        Override this to make a more appropriate call to self.logger.log.
+        """
+        if not self.logger.active: return
+        new_strategy_str = self.log_string_for_object(new_strategy)
+        old_strategy_str = self.log_string_for_object(old_strategy)
+        element_typename = self.log_string_for_object(new_element)
+        size = new_strategy.size(w_self)
+        typename = ""
+        cause = "Switched" if old_strategy else "Created"
+        self.logger.log(new_strategy_str, size, cause, old_strategy_str, typename, element_typename)
+    
+    @specialize.call_location()
+    def log_string_for_object(self, obj):
+        """
+        This can be overridden instead of the entire log() method.
+        Keep the specialize-annotation in order to handle different kinds of objects here.
+        """
+        return obj.__class__.__name__ if obj else ""
+    
+    # These storage accessors are specialized because the storage field is 
+    # populated by erased-objects which seem to be incompatible sometimes.
+    @specialize.call_location()
+    def get_storage(self, obj):
+        return obj._get_storage()
+    @specialize.call_location()
+    def set_storage(self, obj, val):
+        return obj._set_storage(val)
+    
+    def get_strategy(self, obj):
+        return obj._get_strategy()
+    def set_strategy(self, obj, val):
+        return obj._set_strategy(val)
+    
+    # =============================
+    # Internal methods
+    # =============================
+    
+    def patch_strategy_class(self, strategy_class, root_class):
+        "NOT_RPYTHON"
+        # Patch root class: Add default handler for visitor
+        def convert_storage_from_OTHER(self, w_self, previous_strategy):
+            self.convert_storage_from(w_self, previous_strategy)
+        funcname = "convert_storage_from_" + strategy_class.__name__
+        convert_storage_from_OTHER.func_name = funcname
+        setattr(root_class, funcname, convert_storage_from_OTHER)
+        
+        # Patch strategy class: Add polymorphic visitor function
+        def convert_storage_to(self, w_self, new_strategy):
+            getattr(new_strategy, funcname)(w_self, self)
+        strategy_class.convert_storage_to = convert_storage_to
+    
+    def collect_subclasses(self, cls):
+        "NOT_RPYTHON"
+        subclasses = []
+        for subcls in cls.__subclasses__():
+            subclasses.append(subcls)
+            subclasses.extend(self.collect_subclasses(subcls))
+        return subclasses
+    
+    def order_strategies(self):
+        "NOT_RPYTHON"
+        def get_generalization_depth(strategy, visited=None):
+            if visited is None:
+                visited = set()
+            if strategy._generalizations:
+                if strategy in visited:
+                    raise Exception("Cycle in generalization-tree of %s" % strategy)
+                visited.add(strategy)
+                depth = 0
+                for generalization in strategy._generalizations:
+                    other_depth = get_generalization_depth(generalization, visited)
+                    depth = max(depth, other_depth)
+                return depth + 1
+            else:
+                return 0
+        self.strategies.sort(key=get_generalization_depth, reverse=True)
+    
+    @jit.elidable
+    def strategy_singleton_instance(self, strategy_class):
+        return getattr(strategy_class, self.strategy_singleton_field)
+    
+    def _freeze_(self):
+        # Instance will be frozen at compile time, making accesses constant.
+        # The constructor does meta stuff which is not possible after translation.
+        return True
+
+class AbstractStrategy(object):
+    """
+    == Required:
+    strategy_factory(self) - Access to StrategyFactory
+    """
+    
+    def strategy_switched(self, w_self):
+        # Override this method to get a hook that runs whenever the strategy
+        # of w_self is switched to self.
+        pass
+    
+    # Main Fixedsize API
+    
+    def store(self, w_self, index0, value):
+        raise NotImplementedError("Abstract method")
+    
+    def fetch(self, w_self, index0):
+        raise NotImplementedError("Abstract method")
+    
+    def size(self, w_self):
+        raise NotImplementedError("Abstract method")
+    
+    # Fixedsize utility methods
+    
+    def slice(self, w_self, start, end):
+        return [ self.fetch(w_self, i) for i in range(start, end)]
+    
+    def fetch_all(self, w_self):
+        return self.slice(w_self, 0, self.size(w_self))
+    
+    def store_all(self, w_self, elements):
+        for i, e in enumerate(elements):
+            self.store(w_self, i, e)
+    
+    # Main Varsize API
+    
+    def insert(self, w_self, index0, list_w):
+        raise NotImplementedError("Abstract method")
+    
+    def delete(self, w_self, start, end):
+        raise NotImplementedError("Abstract method")
+    
+    # Varsize utility methods
+    
+    def append(self, w_self, list_w):
+        self.insert(w_self, self.size(w_self), list_w)        
+    
+    def pop(self, w_self, index0):
+        e = self.fetch(w_self, index0)
+        self.delete(w_self, index0, index0+1)
+        return e
+
+    # Internal methods
+    
+    def initialize_storage(self, w_self, initial_size):
+        raise NotImplementedError("Abstract method")
+    
+    def check_can_handle(self, value):
+        raise NotImplementedError("Abstract method")
+    
+    def convert_storage_to(self, w_self, new_strategy):
+        # This will be overwritten in patch_strategy_class
+        new_strategy.convert_storage_from(w_self, self)
+    
+    @jit.unroll_safe
+    def convert_storage_from(self, w_self, previous_strategy):
+        # This is a very inefficient (but the most generic) way to do this.
+        # Subclasses should specialize.
+        storage = previous_strategy.fetch_all(w_self)
+        self.initialize_storage(w_self, previous_strategy.size(w_self))
+        for i, field in enumerate(storage):
+            self.store(w_self, i, field)
+    
+    def generalize_for_value(self, w_self, value):
+        strategy_type = self.generalized_strategy_for(value)
+        new_instance = self.strategy_factory().switch_strategy(w_self, strategy_type, new_element=value)
+        return new_instance
+        
+    def cannot_handle_store(self, w_self, index0, value):
+        new_instance = self.generalize_for_value(w_self, value)
+        new_instance.store(w_self, index0, value)
+        
+    def cannot_handle_insert(self, w_self, index0, list_w):
+        # TODO - optimize. Prevent multiple generalizations and slicing done by callers.
+        new_strategy = self.generalize_for_value(w_self, list_w[0])
+        new_strategy.insert(w_self, index0, list_w)
+
+# ============== Special Strategies with no storage array ==============
+
+class EmptyStrategy(AbstractStrategy):
+    # == Required:
+    # See AbstractStrategy
+    
+    def initialize_storage(self, w_self, initial_size):
+        assert initial_size == 0
+        self.set_storage(w_self, None)
+    def convert_storage_from(self, w_self, previous_strategy):
+        self.set_storage(w_self, None)
+    def fetch(self, w_self, index0):
+        raise IndexError
+    def store(self, w_self, index0, value):
+        self.cannot_handle_insert(w_self, index0, [value])
+    def insert(self, w_self, index0, list_w):
+        self.cannot_handle_insert(w_self, index0, list_w)
+    def delete(self, w_self, start, end):
+        self.check_index_range(w_self, start, end)
+    def size(self, w_self):
+        return 0
+    def check_can_handle(self, value):
+        return False
+
+class SingleValueStrategyStorage(object):
+    """Small container object for a size value."""
+    _attrs_ = ['size']
+    def __init__(self, size=0):
+        self.size = size
+
+class SingleValueStrategy(AbstractStrategy):
+    # == Required:
+    # See AbstractStrategy
+    # check_index_*(...) - use mixin SafeIndexingMixin or UnsafeIndexingMixin
+    # value(self) - the single value contained in this strategy. Should be constant.
+    
+    def initialize_storage(self, w_self, initial_size):
+        storage_obj = SingleValueStrategyStorage(initial_size)
+        self.set_storage(w_self, storage_obj)
+    def convert_storage_from(self, w_self, previous_strategy):
+        self.initialize_storage(w_self, previous_strategy.size(w_self))
+    
+    def fetch(self, w_self, index0):
+        self.check_index_fetch(w_self, index0)
+        return self.value()
+    def store(self, w_self, index0, value):
+        self.check_index_store(w_self, index0)
+        if self.check_can_handle(value):
+            return
+        self.cannot_handle_store(w_self, index0, value)
+    
+    @jit.unroll_safe
+    def insert(self, w_self, index0, list_w):
+        storage_obj = self.get_storage(w_self)
+        for i in range(len(list_w)):
+            if self.check_can_handle(list_w[i]):
+                storage_obj.size += 1
+            else:
+                self.cannot_handle_insert(w_self, index0 + i, list_w[i:])
+                return
+    
+    def delete(self, w_self, start, end):
+        self.check_index_range(w_self, start, end)
+        self.get_storage(w_self).size -= (end - start)
+    def size(self, w_self):
+        return self.get_storage(w_self).size
+    def check_can_handle(self, value):
+        return value is self.value()
+    
+# ============== Basic strategies with storage ==============
+
+class StrategyWithStorage(AbstractStrategy):
+    # == Required:
+    # See AbstractStrategy
+    # check_index_*(...) - use mixin SafeIndexingMixin or UnsafeIndexingMixin
+    # default_value(self) - The value to be initially contained in this strategy
+    
+    def initialize_storage(self, w_self, initial_size):
+        default = self._unwrap(self.default_value())
+        self.set_storage(w_self, [default] * initial_size)
+    
+    @jit.unroll_safe
+    def convert_storage_from(self, w_self, previous_strategy):
+        size = previous_strategy.size(w_self)
+        new_storage = [ self._unwrap(previous_strategy.fetch(w_self, i))
+                        for i in range(size) ]
+        self.set_storage(w_self, new_storage)
+    
+    def store(self, w_self, index0, wrapped_value):
+        self.check_index_store(w_self, index0)
+        if self.check_can_handle(wrapped_value):
+            unwrapped = self._unwrap(wrapped_value)
+            self.get_storage(w_self)[index0] = unwrapped
+        else:
+            self.cannot_handle_store(w_self, index0, wrapped_value)
+    
+    def fetch(self, w_self, index0):
+        self.check_index_fetch(w_self, index0)
+        unwrapped = self.get_storage(w_self)[index0]
+        return self._wrap(unwrapped)
+    
+    def _wrap(self, value):
+        raise NotImplementedError("Abstract method")
+    
+    def _unwrap(self, value):
+        raise NotImplementedError("Abstract method")
+    
+    def size(self, w_self):
+        return len(self.get_storage(w_self))
+    
+    @jit.unroll_safe
+    def insert(self, w_self, start, list_w):
+        if start > self.size(w_self):
+            start = self.size(w_self)
+        for i in range(len(list_w)):
+            if self.check_can_handle(list_w[i]):
+                self.get_storage(w_self).insert(start + i, self._unwrap(list_w[i]))
+            else:
+                self.cannot_handle_insert(w_self, start + i, list_w[i:])
+                return
+    
+    def delete(self, w_self, start, end):
+        self.check_index_range(w_self, start, end)
+        assert start >= 0 and end >= 0
+        del self.get_storage(w_self)[start : end]
+        
+class GenericStrategy(StrategyWithStorage):
+    # == Required:
+    # See StrategyWithStorage
+    
+    def _wrap(self, value):
+        return value
+    def _unwrap(self, value):
+        return value
+    def check_can_handle(self, wrapped_value):
+        return True
+    
+class WeakGenericStrategy(StrategyWithStorage):
+    # == Required:
+    # See StrategyWithStorage
+    
+    def _wrap(self, value):
+        return value() or self.default_value()
+    def _unwrap(self, value):
+        assert value is not None
+        return weakref.ref(value)
+    def check_can_handle(self, wrapped_value):
+        return True
+    
+# ============== Mixins for index checking operations ==============
+
+class SafeIndexingMixin(object):
+    def check_index_store(self, w_self, index0):
+        self.check_index(w_self, index0)
+    def check_index_fetch(self, w_self, index0):
+        self.check_index(w_self, index0)
+    def check_index_range(self, w_self, start, end):
+        if end < start:
+            raise IndexError
+        self.check_index(w_self, start)
+        self.check_index(w_self, end)
+    def check_index(self, w_self, index0):
+        if index0 < 0 or index0 >= self.size(w_self):
+            raise IndexError
+
+class UnsafeIndexingMixin(object):
+    def check_index_store(self, w_self, index0):
+        pass
+    def check_index_fetch(self, w_self, index0):
+        pass
+    def check_index_range(self, w_self, start, end):
+        pass
+
+# ============== Specialized Storage Strategies ==============
+
+class SpecializedStrategy(StrategyWithStorage):
+    # == Required:
+    # See StrategyWithStorage
+    # wrap(self, value) - Return a boxed object for the primitive value
+    # unwrap(self, value) - Return the unboxed primitive value of value
+    
+    def _unwrap(self, value):
+        return self.unwrap(value)
+    def _wrap(self, value):
+        return self.wrap(value)
+    
+class SingleTypeStrategy(SpecializedStrategy):
+    # == Required:
+    # See SpecializedStrategy
+    # contained_type - The wrapped type that can be stored in this strategy
+    
+    def check_can_handle(self, value):
+        return isinstance(value, self.contained_type)
+    
+class TaggingStrategy(SingleTypeStrategy):
+    """This strategy uses a special tag value to represent a single additional object."""
+    # == Required:
+    # See SingleTypeStrategy
+    # wrapped_tagged_value(self) - The tagged object
+    # unwrapped_tagged_value(self) - The unwrapped tag value representing the tagged object
+    
+    def check_can_handle(self, value):
+        return value is self.wrapped_tagged_value() or \
+                (isinstance(value, self.contained_type) and \
+                self.unwrap(value) != self.unwrapped_tagged_value())
+    
+    def _unwrap(self, value):
+        if value is self.wrapped_tagged_value():
+            return self.unwrapped_tagged_value()
+        return self.unwrap(value)
+    
+    def _wrap(self, value):
+        if value == self.unwrapped_tagged_value():
+            return self.wrapped_tagged_value()
+        return self.wrap(value)
diff --git a/rpython/rlib/rstrategies/test.py b/rpython/rlib/rstrategies/test.py
new file mode 100644
--- /dev/null
+++ b/rpython/rlib/rstrategies/test.py
@@ -0,0 +1,434 @@
+
+import py
+from rpython.rlib.rstrategies import rstrategies as rs
+from rpython.rlib.objectmodel import import_from_mixin
+
+# === Define small model tree
+
+class W_AbstractObject(object):
+    pass
+
+class W_Object(W_AbstractObject):
+    pass
+
+class W_Integer(W_AbstractObject):
+    def __init__(self, value):
+        self.value = value
+    def __eq__(self, other):
+        return isinstance(other, W_Integer) and self.value == other.value
+
+class W_List(W_AbstractObject):
+    rs.make_accessors()
+    def __init__(self, strategy=None, size=0, elements=None):
+        self.strategy = None
+        if strategy:
+            factory.set_initial_strategy(self, strategy, size, elements)
+    def fetch(self, i):
+        assert self.strategy
+        return self.strategy.fetch(self, i)
+    def store(self, i, value):
+        assert self.strategy
+        return self.strategy.store(self, i, value)
+    def size(self):
+        assert self.strategy
+        return self.strategy.size(self)
+    def insert(self, index0, list_w):
+        assert self.strategy
+        return self.strategy.insert(self, index0, list_w)
+    def delete(self, start, end):
+        assert self.strategy
+        return self.strategy.delete(self, start, end)
+    def append(self, list_w):
+        assert self.strategy
+        return self.strategy.append(self, list_w)
+    def pop(self, index0):
+        assert self.strategy
+        return self.strategy.pop(self, index0)
+    def slice(self, start, end):
+        assert self.strategy
+        return self.strategy.slice(self, start, end)
+    def fetch_all(self):
+        assert self.strategy
+        return self.strategy.fetch_all(self)
+    def store_all(self, elements):
+        assert self.strategy
+        return self.strategy.store_all(self, elements)
+
+w_nil = W_Object()
+
+# === Define concrete strategy classes
+
+class AbstractStrategy(object):
+    __metaclass__ = rs.StrategyMetaclass
+    import_from_mixin(rs.AbstractStrategy)
+    import_from_mixin(rs.SafeIndexingMixin)
+    def __init__(self, factory, w_self=None, size=0):
+        self.factory = factory
+    def strategy_factory(self):
+        return self.factory
+
+class Factory(rs.StrategyFactory):
+    switching_log = []
+    
+    def __init__(self, root_class):
+        self.decorate_strategies({
+            EmptyStrategy: [GenericStrategy],
+            NilStrategy: [IntegerOrNilStrategy, GenericStrategy],
+            GenericStrategy: [],
+            WeakGenericStrategy: [],
+            IntegerStrategy: [IntegerOrNilStrategy, GenericStrategy],
+            IntegerOrNilStrategy: [GenericStrategy],
+        })
+        rs.StrategyFactory.__init__(self, root_class)
+    
+    def instantiate_strategy(self, strategy_type, w_self=None, size=0):
+        return strategy_type(self, w_self, size)
+    
+    def set_strategy(self, w_list, strategy): 
+        old_strategy = self.get_strategy(w_list)
+        self.switching_log.append((old_strategy, strategy))
+        super(Factory, self).set_strategy(w_list, strategy)
+    
+    def clear_log(self):
+        del self.switching_log[:]
+
+class EmptyStrategy(AbstractStrategy):
+    import_from_mixin(rs.EmptyStrategy)
+
+class NilStrategy(AbstractStrategy):
+    import_from_mixin(rs.SingleValueStrategy)
+    def value(self): return w_nil
+
+class GenericStrategy(AbstractStrategy):
+    import_from_mixin(rs.GenericStrategy)
+    import_from_mixin(rs.UnsafeIndexingMixin)
+    def default_value(self): return w_nil
+
+class WeakGenericStrategy(AbstractStrategy):
+    import_from_mixin(rs.WeakGenericStrategy)
+    def default_value(self): return w_nil
+    
+class IntegerStrategy(AbstractStrategy):
+    import_from_mixin(rs.SingleTypeStrategy)
+    contained_type = W_Integer
+    def wrap(self, value): return W_Integer(value)
+    def unwrap(self, value): return value.value
+    def default_value(self): return W_Integer(0)
+
+class IntegerOrNilStrategy(AbstractStrategy):
+    import_from_mixin(rs.TaggingStrategy)
+    contained_type = W_Integer
+    def wrap(self, value): return W_Integer(value)
+    def unwrap(self, value): return value.value
+    def default_value(self): return w_nil
+    def wrapped_tagged_value(self): return w_nil
+    def unwrapped_tagged_value(self): import sys; return sys.maxint
+    
+@rs.strategy(generalize=[], singleton=False)
+class NonSingletonStrategy(GenericStrategy):
+    def __init__(self, factory, w_list=None, size=0):
+        super(NonSingletonStrategy, self).__init__(factory, w_list, size)
+        self.w_list = w_list
+        self.size = size
+
+class NonStrategy(NonSingletonStrategy):
+    pass
+
+factory = Factory(AbstractStrategy)
+
+def check_contents(list, expected):
+    assert list.size() == len(expected)
+    for i, val in enumerate(expected):
+        assert list.fetch(i) == val
+
+def teardown():
+    factory.clear_log()
+
+# === Test Initialization and fetch
+
+def test_setup():
+    pass
+
+def test_factory_setup():
+    expected_strategies = 7
+    assert len(factory.strategies) == expected_strategies
+    assert len(set(factory.strategies)) == len(factory.strategies)
+    for strategy in factory.strategies:
+        assert isinstance(factory.strategy_singleton_instance(strategy), strategy)
+
+def test_factory_setup_singleton_instances():
+    new_factory = Factory(AbstractStrategy)
+    s1 = factory.strategy_singleton_instance(GenericStrategy)
+    s2 = new_factory.strategy_singleton_instance(GenericStrategy)
+    assert s1 is not s2
+    assert s1.strategy_factory() is factory
+    assert s2.strategy_factory() is new_factory
+
+def test_metaclass():
+    assert NonStrategy._is_strategy == False
+    assert IntegerOrNilStrategy._is_strategy == True
+    assert IntegerOrNilStrategy._is_singleton == True
+    assert NonSingletonStrategy._is_singleton == False
+    assert NonStrategy._is_singleton == False
+    assert NonStrategy.get_storage is not NonSingletonStrategy.get_storage
+
+def test_singletons():
+    def do_test_singletons(cls, expected_true):
+        l1 = W_List(cls, 0)
+        l2 = W_List(cls, 0)
+        if expected_true:
+            assert l1.strategy is l2.strategy
+        else:
+            assert l1.strategy is not l2.strategy
+    do_test_singletons(EmptyStrategy, True)
+    do_test_singletons(NonSingletonStrategy, False)
+    do_test_singletons(NonStrategy, False)
+    do_test_singletons(GenericStrategy, True)
+
+def do_test_initialization(cls, default_value=w_nil, is_safe=True):
+    size = 10
+    l = W_List(cls, size)
+    s = l.strategy
+    assert s.size(l) == size
+    assert s.fetch(l,0) == default_value
+    assert s.fetch(l,size/2) == default_value
+    assert s.fetch(l,size-1) == default_value
+    py.test.raises(IndexError, s.fetch, l, size)
+    py.test.raises(IndexError, s.fetch, l, size+1)
+    py.test.raises(IndexError, s.fetch, l, size+5)
+    if is_safe:
+        py.test.raises(IndexError, s.fetch, l, -1)
+    else:
+        assert s.fetch(l, -1) == s.fetch(l, size - 1)
+
+def test_init_Empty():
+    l = W_List(EmptyStrategy, 0)
+    s = l.strategy
+    assert s.size(l) == 0
+    py.test.raises(IndexError, s.fetch, l, 0)
+    py.test.raises(IndexError, s.fetch, l, 10)
+    
+def test_init_Nil():
+    do_test_initialization(NilStrategy)
+
+def test_init_Generic():
+    do_test_initialization(GenericStrategy, is_safe=False)
+    
+def test_init_WeakGeneric():
+    do_test_initialization(WeakGenericStrategy)
+    
+def test_init_Integer():
+    do_test_initialization(IntegerStrategy, default_value=W_Integer(0))
+    
+def test_init_IntegerOrNil():
+    do_test_initialization(IntegerOrNilStrategy)
+    
+# === Test Simple store
+
+def do_test_store(cls, stored_value=W_Object(), is_safe=True, is_varsize=False):
+    size = 10
+    l = W_List(cls, size)
+    s = l.strategy
+    def store_test(index):
+        s.store(l, index, stored_value)
+        assert s.fetch(l, index) == stored_value
+    store_test(0)
+    store_test(size/2)
+    store_test(size-1)
+    if not is_varsize:
+        py.test.raises(IndexError, s.store, l, size, stored_value)
+        py.test.raises(IndexError, s.store, l, size+1, stored_value)
+        py.test.raises(IndexError, s.store, l, size+5, stored_value)
+    if is_safe:
+        py.test.raises(IndexError, s.store, l, -1, stored_value)
+    else:
+        store_test(-1)
+
+def test_store_Nil():
+    do_test_store(NilStrategy, stored_value=w_nil)
+
+def test_store_Generic():
+    do_test_store(GenericStrategy, is_safe=False)
+    
+def test_store_WeakGeneric():
+    do_test_store(WeakGenericStrategy, stored_value=w_nil)
+    
+def test_store_Integer():
+    do_test_store(IntegerStrategy, stored_value=W_Integer(100))
+    
+def test_store_IntegerOrNil():
+    do_test_store(IntegerOrNilStrategy, stored_value=W_Integer(100))
+    do_test_store(IntegerOrNilStrategy, stored_value=w_nil)
+
+# === Test Insert
+
+def do_test_insert(cls, values):
+    l = W_List(cls, 0)
+    assert len(values) >= 6
+    values1 = values[0:2]
+    values2 = values[2:4]
+    values3 = values[4:6]
+    l.insert(0, values1+values3)
+    check_contents(l, values1+values3)
+    l.insert(2, values2)
+    check_contents(l, values)
+
+def test_insert_Nil():
+    do_test_insert(NilStrategy, [w_nil]*6)
+
+def test_insert_Generic():
+    do_test_insert(GenericStrategy, [W_Object() for _ in range(6)])
+    
+def test_insert_WeakGeneric():
+    do_test_insert(WeakGenericStrategy, [W_Object() for _ in range(6)])
+    
+def test_insert_Integer():
+    do_test_insert(IntegerStrategy, [W_Integer(x) for x in range(6)])
+    
+def test_insert_IntegerOrNil():
+    do_test_insert(IntegerOrNilStrategy, [w_nil]+[W_Integer(x) for x in range(4)]+[w_nil])
+    do_test_insert(IntegerOrNilStrategy, [w_nil]*6)
+    
+# === Test Delete
+
+def do_test_delete(cls, values):
+    assert len(values) >= 6
+    l = W_List(cls, len(values), values)
+    l.delete(2, 4)
+    del values[2: 4]
+    check_contents(l, values)
+    l.delete(1, 2)
+    del values[1: 2]
+    check_contents(l, values)
+
+def test_delete_Nil():
+    do_test_delete(NilStrategy, [w_nil]*6)
+
+def test_delete_Generic():
+    do_test_delete(GenericStrategy, [W_Object() for _ in range(6)])
+    
+def test_delete_WeakGeneric():
+    do_test_delete(WeakGenericStrategy, [W_Object() for _ in range(6)])
+    
+def test_delete_Integer():
+    do_test_delete(IntegerStrategy, [W_Integer(x) for x in range(6)])
+    
+def test_delete_IntegerOrNil():
+    do_test_delete(IntegerOrNilStrategy, [w_nil]+[W_Integer(x) for x in range(4)]+[w_nil])
+    do_test_delete(IntegerOrNilStrategy, [w_nil]*6)
+
+# === Test Transitions
+
+def test_CheckCanHandle():
+    def assert_handles(cls, good, bad):
+        s = cls(0)
+        for val in good:
+            assert s.check_can_handle(val)
+        for val in bad:
+            assert not s.check_can_handle(val)
+    obj = W_Object()
+    i = W_Integer(0)
+    nil = w_nil
+    
+    assert_handles(EmptyStrategy, [], [nil, obj, i])
+    assert_handles(NilStrategy, [nil], [obj, i])
+    assert_handles(GenericStrategy, [nil, obj, i], [])
+    assert_handles(WeakGenericStrategy, [nil, obj, i], [])
+    assert_handles(IntegerStrategy, [i], [nil, obj])
+    assert_handles(IntegerOrNilStrategy, [nil, i], [obj])
+
+def do_test_transition(OldStrategy, value, NewStrategy, initial_size=10):
+    w = W_List(OldStrategy, initial_size)
+    old = w.strategy
+    w.store(0, value)
+    assert isinstance(w.strategy, NewStrategy)
+    assert factory.switching_log == [(None, old), (old, w.strategy)]
+
+def test_AllNil_to_Generic():
+    do_test_transition(NilStrategy, W_Object(), GenericStrategy)
+
+def test_AllNil_to_IntegerOrNil():
+    do_test_transition(NilStrategy, W_Integer(0), IntegerOrNilStrategy)
+
+def test_IntegerOrNil_to_Generic():
+    do_test_transition(IntegerOrNilStrategy, W_Object(), GenericStrategy)
+
+def test_Integer_to_IntegerOrNil():
+    do_test_transition(IntegerStrategy, w_nil, IntegerOrNilStrategy)
+
+def test_Integer_Generic():
+    do_test_transition(IntegerStrategy, W_Object(), GenericStrategy)
+
+def test_TaggingValue_not_storable():
+    tag = IntegerOrNilStrategy(10).unwrapped_tagged_value() # sys.maxint
+    do_test_transition(IntegerOrNilStrategy, W_Integer(tag), GenericStrategy)
+
+# TODO - Test transition from varsize back to Empty
+
+# === Test helper methods
+
+def generic_list():
+    values = [W_Object() for _ in range(6)]
+    return W_List(GenericStrategy, len(values), values), values
+
+def test_slice():
+    l, v = generic_list()
+    assert l.slice(2, 4) == v[2:4]
+
+def test_fetch_all():
+    l, v = generic_list()
+    assert l.fetch_all() == v
+
+def test_append():
+    l, v = generic_list()
+    o1 = W_Object()
+    o2 = W_Object()
+    l.append([o1])
+    assert l.fetch_all() == v + [o1]
+    l.append([o1, o2])
+    assert l.fetch_all() == v + [o1, o1, o2]
+
+def test_pop():
+    l, v = generic_list()
+    o = l.pop(3)
+    del v[3]
+    assert l.fetch_all() == v
+    o = l.pop(3)
+    del v[3]
+    assert l.fetch_all() == v
+
+def test_store_all():
+    l, v = generic_list()
+    v2 = [W_Object() for _ in range(4) ]
+    v3 = [W_Object() for _ in range(l.size()) ]
+    assert v2 != v
+    assert v3 != v
+    
+    l.store_all(v2)
+    assert l.fetch_all() == v2+v[4:]
+    l.store_all(v3)
+    assert l.fetch_all() == v3
+    
+    py.test.raises(IndexError, l.store_all, [W_Object() for _ in range(8) ])
+
+# === Test Weak Strategy
+# TODO
+
+# === Other tests
+
+def test_optimized_strategy_switch(monkeypatch):
+    l = W_List(NilStrategy, 5)
+    s = l.strategy
+    s.copied = 0
+    def convert_storage_from_default(self, w_self, other):
+        assert False, "The default convert_storage_from() should not be called!"
+    def convert_storage_from_special(self, w_self, other):
+        s.copied += 1
+    
+    monkeypatch.setattr(AbstractStrategy, "convert_storage_from_NilStrategy", convert_storage_from_special)
+    monkeypatch.setattr(AbstractStrategy, "convert_storage_from", convert_storage_from_default)
+    try:
+        factory.switch_strategy(l, IntegerOrNilStrategy)
+    finally:
+        monkeypatch.undo()
+    assert s.copied == 1, "Optimized switching routine not called exactly one time."
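
The transition tests above (test_AllNil_to_IntegerOrNil, test_TaggingValue_not_storable and friends) exercise one pattern: a store that the current strategy cannot handle makes the list move to the first more general strategy that can, converting the storage on the way. What follows is a standalone sketch of that pattern, not the rstrategies API. The names `can_handle`, `GENERALIZE`, `TAG`, `_switch` and the per-list `log` are hypothetical and chosen only for illustration, and `sys.maxsize` stands in for the `sys.maxint` used in the test file so that the sketch also runs on Python 3.

```python
import sys

class W_Object(object):
    pass

class W_Integer(W_Object):
    def __init__(self, value):
        self.value = value
    def __eq__(self, other):
        return isinstance(other, W_Integer) and self.value == other.value
    def __ne__(self, other):
        return not self == other

w_nil = W_Object()
TAG = sys.maxsize  # unboxed stand-in for nil (assumption, mirrors sys.maxint)

class NilStrategy(object):
    # Every slot is nil, so the raw storage content is irrelevant.
    def can_handle(self, w_value): return w_value is w_nil
    def wrap(self, raw): return w_nil
    def unwrap(self, w_value): return None

class IntegerOrNilStrategy(object):
    # Stores unboxed ints; TAG marks nil, so the integer TAG itself is rejected.
    def can_handle(self, w_value):
        if w_value is w_nil:
            return True
        return isinstance(w_value, W_Integer) and w_value.value != TAG
    def wrap(self, raw):
        return w_nil if raw == TAG else W_Integer(raw)
    def unwrap(self, w_value):
        return TAG if w_value is w_nil else w_value.value

class GenericStrategy(object):
    # Stores the wrapped objects unchanged.
    def can_handle(self, w_value): return True
    def wrap(self, raw): return raw
    def unwrap(self, w_value): return w_value

NIL, INT_OR_NIL, GENERIC = NilStrategy(), IntegerOrNilStrategy(), GenericStrategy()

# Ordered generalization candidates, analogous in spirit to the dict that
# the test Factory passes to decorate_strategies().
GENERALIZE = {NIL: [INT_OR_NIL, GENERIC], INT_OR_NIL: [GENERIC], GENERIC: []}

class W_List(object):
    def __init__(self, size):
        self.strategy = NIL
        self.storage = [None] * size
        self.log = []  # (old, new) pairs, like switching_log in the tests

    def fetch(self, index):
        return self.strategy.wrap(self.storage[index])

    def store(self, index, w_value):
        if not self.strategy.can_handle(w_value):
            self._switch(w_value)
        self.storage[index] = self.strategy.unwrap(w_value)

    def _switch(self, w_value):
        old = self.strategy
        for candidate in GENERALIZE[old]:
            if candidate.can_handle(w_value):
                # Convert by re-wrapping with the old strategy and
                # unwrapping with the new one.
                wrapped = [old.wrap(raw) for raw in self.storage]
                self.storage = [candidate.unwrap(w) for w in wrapped]
                self.strategy = candidate
                self.log.append((old, candidate))
                return
        raise TypeError("no strategy can handle the value")
```

Storing `W_Integer(7)` into a fresh `W_List(3)` moves it from `NIL` to `INT_OR_NIL`; storing `W_Integer(TAG)` afterwards forces the move to `GENERIC`, which is the situation test_TaggingValue_not_storable checks. The real library also supports specialized conversion hooks such as `convert_storage_from_NilStrategy`, which this sketch omits.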

