[pypy-commit] pypy default: Added rstrategies (from https://github.com/antongulenko/rstrategies).
anton_gulenko
noreply at buildbot.pypy.org
Wed Apr 22 12:20:40 CEST 2015
Author: Anton Gulenko <anton.gulenko at googlemail.com>
Branch:
Changeset: r76867:25ca52e41849
Date: 2015-02-15 19:15 +0100
http://bitbucket.org/pypy/pypy/changeset/25ca52e41849/
Log: Added rstrategies (from
https://github.com/antongulenko/rstrategies).
diff --git a/rpython/rlib/rstrategies/README.md b/rpython/rlib/rstrategies/README.md
new file mode 100644
--- /dev/null
+++ b/rpython/rlib/rstrategies/README.md
@@ -0,0 +1,101 @@
+# rstrategies
+
+A library to implement storage strategies in VMs based on the RPython toolchain.
+rstrategies can be used in VMs for any language or language family.
+
+This library was developed as part of a master's thesis by [Anton Gulenko](https://github.com/antongulenko).
+
+The original paper describing the optimization, "Storage Strategies for collections in dynamically typed languages" by C.F. Bolz, L. Diekmann and L. Tratt, can be found [here](http://stups.hhu.de/mediawiki/images/3/3b/Pub-BoDiTr13_246.pdf).
+
+So far, this library has been adopted by three VMs: [RSqueak](https://github.com/HPI-SWA-Lab/RSqueak), [Topaz](https://github.com/topazproject/topaz) ([Forked here](https://github.com/antongulenko/topaz/tree/rstrategies)) and [Pycket](https://github.com/samth/pycket) ([Forked here](https://github.com/antongulenko/pycket/tree/rstrategies)).
+
+#### Concept
+
+Collections are often used homogeneously, i.e. they contain only objects of the same type.
+Primitive numeric types like ints or floats are especially interesting for optimization.
+These cases can be optimized by storing the unboxed data of these objects in consecutive memory.
+This is done by letting a special "strategy" object handle the entire storage of a collection.
+The collection object holds two separate references: one to its strategy and one to its storage.
+Every operation on the collection is delegated to the strategy, which accesses the storage when needed.
+The strategy can be switched to a more suitable one, which might require converting the storage array.
+
+## Usage
+
+The following are the steps needed to integrate rstrategies into an RPython VM.
+Because of the special nature of this library it is not enough to simply call some API methods; the library must be integrated within existing VM classes using a metaclass, mixins and other meta-programming techniques.
+
+The sequence of steps described here is something like a "setup walkthrough", and might be a bit abstract.
+To see a concrete example, look at [AbstractShadow](https://github.com/HPI-SWA-Lab/RSqueak/blob/d5ff2572106d23a5246884de6f8b86f46d85f4f7/spyvm/storage.py#L73), [StrategyFactory](https://github.com/HPI-SWA-Lab/RSqueak/blob/d5ff2572106d23a5246884de6f8b86f46d85f4f7/spyvm/storage.py#L126) and [W_PointersObject](https://github.com/HPI-SWA-Lab/RSqueak/blob/d5ff2572106d23a5246884de6f8b86f46d85f4f7/spyvm/model.py#L565) from the [RSqueak VM](https://github.com/HPI-SWA-Lab/RSqueak).
+The code is also well commented.
+
+#### Basics
+
+Currently the rstrategies library supports fixed-size and variable-size collections.
+This can be used to optimize a wide range of primitive data structures like arrays, lists or regular objects.
+Any of these are called 'collections' in this context.
+The VM should have a central class or class hierarchy for collections.
+In order to extend these classes and use strategies, the library needs accessor methods for two attributes of collection objects: strategy and storage.
+The easiest way is adding the following line to the body of the root collection class:
+```
+rstrategies.make_accessors(strategy='strategy', storage='storage')
+```
+This will generate the 4 accessor methods ```_[get/set]_[storage/strategy]()``` for the respective attributes.
+Alternatively, implement these methods manually or override the getters/setters in ```StrategyFactory```.
+
+Next, the strategy classes must be defined. This requires a small class hierarchy with a dedicated root class.
+In the definition of this root class, include the following lines:
+```
+ __metaclass__ = rstrategies.StrategyMetaclass
+ import_from_mixin(rstrategies.AbstractStrategy)
+ import_from_mixin(rstrategies.SafeIndexingMixin)
+```
+
+```import_from_mixin``` can be found in ```rpython.rlib.objectmodel```.
+If index-checking is performed safely at other places in the VM, you can use ```rstrategies.UnsafeIndexingMixin``` instead.
+If you need your own metaclass, you can combine yours with the rstrategies one using multiple inheritance [like here](https://github.com/HPI-SWA-Lab/RSqueak/blob/d5ff2572106d23a5246884de6f8b86f46d85f4f7/spyvm/storage_contexts.py#L24).
+Also implement a ```storage_factory()``` method, which returns an instance of ```rstrategies.StorageFactory```, which is described below.
+
+#### Strategy classes
+
+Now you can create the actual strategy classes, subclassing them from the single root class.
+The following list summarizes the basic strategies available.
+* ```EmptyStrategy```
+ A strategy for empty collections; very efficient, but limited. Does not allocate anything.
+* ```SingleValueStrategy```
+ A strategy for collections containing the same object ```n``` times. Only allocates memory to store the size of the collection.
+* ```GenericStrategy```
+ A non-optimized strategy backed by a generic Python list. This is the fallback strategy, since it can store everything, but is not optimized.
+* ```WeakGenericStrategy```
+ Like ```GenericStrategy```, but uses ```weakref``` to hold on weakly to its elements.
+* ```SingleTypeStrategy```
+ Can store a single unboxed type like int or float. This is the main
+* ```TaggingStrategy```
+ Extension of SingleTypeStrategy. Uses a specific value in the value range of the unboxed type to represent
+ one additional, arbitrary object.
+
+There are also intermediate classes, which allow creating new, more customized strategies. For this, you should get familiar with the code.
+
+Include one of these mixin classes using ```import_from_mixin```.
+The mixin classes contain comments describing methods or fields which are also required in the strategy class in order to use them.
+Additionally, add the ```@rstrategies.strategy(generalize=alist)``` decorator to all strategy classes.
+The ```alist``` parameter must contain all strategies that the decorated strategy can switch to when it can no longer represent a new element.
+[Example](https://github.com/HPI-SWA-Lab/RSqueak/blob/d5ff2572106d23a5246884de6f8b86f46d85f4f7/spyvm/storage.py#L87) for an implemented strategy.
+See the other strategy classes behind this link for more examples.
+
+#### Strategy Factory
+
+The last part is subclassing ```rstrategies.StrategyFactory```, overriding the method ```instantiate_strategy``` if necessary and passing the strategies root class to the constructor.
+The factory provides the methods ```switch_strategy```, ```set_initial_strategy``` and ```strategy_type_for```, which the VM code can call to drive the strategy mechanism.
+See the comments in the source code.
+
+The strategy mixins offer the following methods to manipulate the contents of the collection:
+* basic API
+ * ```size```
+* fixed size API
+ * ```store```, ```fetch```, ```slice```, ```store_all```, ```fetch_all```
+* variable size API
+ * ```insert```, ```delete```, ```append```, ```pop```
+
+If the collection has a fixed size, simply never use any of the variable size methods in the VM code.
+Since the strategies are singletons, these methods need the collection object as their first parameter.
+For convenience, more fitting accessor methods should be implemented on the collection class itself.
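The "Concept" section of the README above describes a collection that holds one reference to its strategy and one to its storage, delegates every operation to the strategy, and switches to a more general strategy when a value no longer fits. The following is a minimal sketch of that idea in plain Python. It does not use the rstrategies API: `Collection`, `IntStrategy`, `GenericStrategy`, `INTS` and `GENERIC` are hypothetical names chosen for illustration, and a Python list stands in for the unboxed storage an RPython VM would use.

```python
# Illustrative sketch only: hypothetical classes, NOT the rstrategies API.

class GenericStrategy(object):
    """Fallback strategy: the storage is a plain list holding any object."""
    def initial_storage(self, size):
        return [None] * size
    def fetch(self, coll, index):
        return coll.storage[index]
    def store(self, coll, index, value):
        coll.storage[index] = value

class IntStrategy(object):
    """Optimized strategy: the storage holds only ints (stand-in for unboxed data)."""
    def initial_storage(self, size):
        return [0] * size
    def fetch(self, coll, index):
        return coll.storage[index]
    def store(self, coll, index, value):
        if isinstance(value, int) and not isinstance(value, bool):
            coll.storage[index] = value
        else:
            # The value cannot be represented: generalize, then retry.
            coll.switch_strategy(GENERIC)
            coll.strategy.store(coll, index, value)

# Strategies are stateless singletons, so each method takes the collection.
GENERIC = GenericStrategy()
INTS = IntStrategy()

class Collection(object):
    def __init__(self, size):
        self.strategy = INTS
        self.storage = INTS.initial_storage(size)
    def switch_strategy(self, new_strategy):
        # A real VM would convert the storage here; in this sketch a copy is enough.
        self.storage = list(self.storage)
        self.strategy = new_strategy
    def fetch(self, index):
        return self.strategy.fetch(self, index)
    def store(self, index, value):
        self.strategy.store(self, index, value)

coll = Collection(3)
coll.store(0, 42)        # still handled by the int strategy
coll.store(1, "text")    # forces the switch to the generic strategy
```

After the second `store`, `coll.strategy` is the generic singleton and the previously stored int is still readable through `coll.fetch(0)`. Passing the collection into every strategy method is what allows a single strategy instance to serve many collections.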
diff --git a/rpython/rlib/rstrategies/__init__.py b/rpython/rlib/rstrategies/__init__.py
new file mode 100644
--- /dev/null
+++ b/rpython/rlib/rstrategies/__init__.py
@@ -0,0 +1,1 @@
+# Empty
diff --git a/rpython/rlib/rstrategies/logger.py b/rpython/rlib/rstrategies/logger.py
new file mode 100644
--- /dev/null
+++ b/rpython/rlib/rstrategies/logger.py
@@ -0,0 +1,54 @@
+
+class LogEntry(object):
+ def __init__(self):
+ self.slots = 0
+ self.objects = 0
+ self.element_typenames = {}
+
+ def add(self, size, element_typename):
+ self.slots += size
+ self.objects += 1
+ if element_typename:
+ self.element_typenames[element_typename] = None
+
+ def classnames(self):
+ return self.element_typenames.keys()
+
+class Logger(object):
+ _attrs_ = ["active", "aggregate", "logs"]
+ _immutable_fields_ = ["active?", "aggregate?", "logs"]
+
+ def __init__(self):
+ self.active = False
+ self.aggregate = False
+ self.logs = {}
+
+ def activate(self, aggregate=False):
+ self.active = True
+ self.aggregate = self.aggregate or aggregate
+
+ def log(self, new_strategy, size, cause="", old_strategy="", typename="", element_typename=""):
+ if self.aggregate:
+ key = (cause, old_strategy, new_strategy, typename)
+ if key not in self.logs:
+ self.logs[key] = LogEntry()
+ entry = self.logs[key]
+ entry.add(size, element_typename)
+ else:
+ element_typenames = [ element_typename ] if element_typename else []
+ self.output(cause, old_strategy, new_strategy, typename, size, 1, element_typenames)
+
+ def print_aggregated_log(self):
+ if not self.aggregate:
+ return
+ for key, entry in self.logs.items():
+ cause, old_strategy, new_strategy, typename = key
+ slots, objects, element_typenames = entry.slots, entry.objects, entry.classnames()
+ self.output(cause, old_strategy, new_strategy, typename, slots, objects, element_typenames)
+
+ def output(self, cause, old_strategy, new_strategy, typename, slots, objects, element_typenames):
+ old_strategy_string = "%s -> " % old_strategy if old_strategy else ""
+ classname_string = " of %s" % typename if typename else ""
+ element_string = (" elements: " + " ".join(element_typenames)) if element_typenames else ""
+ format = (cause, old_strategy_string, new_strategy, classname_string, slots, objects, element_string)
+ print "%s (%s%s)%s size %d objects %d%s" % format
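The layout of a log line written by `Logger.output` in logger.py above can be tried out in isolation with a small function that returns the string instead of printing it. This is a sketch, not part of the commit: `format_log_line` is a hypothetical helper, and the operation name `Switched` and the strategy and type names in the usage lines are made-up sample values.

```python
def format_log_line(cause, new_strategy, size, objects=1,
                    old_strategy="", typename="", element_typenames=()):
    """Build one log line in the layout used by Logger.output (returned, not printed)."""
    # Each optional part is left out entirely when its value is empty.
    old_part = "%s -> " % old_strategy if old_strategy else ""
    type_part = " of %s" % typename if typename else ""
    element_part = (" elements: " + " ".join(element_typenames)) if element_typenames else ""
    return "%s (%s%s)%s size %d objects %d%s" % (
        cause, old_part, new_strategy, type_part, size, objects, element_part)

# Hypothetical sample values, for illustration only.
switch_line = format_log_line("Switched", "GenericStrategy", 10,
                              old_strategy="IntStrategy", typename="Array",
                              element_typenames=["str"])
create_line = format_log_line("Initialized", "AllNilStrategy", 5)
```

Here `switch_line` is `Switched (IntStrategy -> GenericStrategy) of Array size 10 objects 1 elements: str`, and `create_line` has no `old ->` part because a newly created collection has no previous strategy.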
diff --git a/rpython/rlib/rstrategies/logparser.py b/rpython/rlib/rstrategies/logparser.py
new file mode 100644
--- /dev/null
+++ b/rpython/rlib/rstrategies/logparser.py
@@ -0,0 +1,685 @@
+
+import re, os, sys, operator
+
+"""
+This script parses a log produced by the rstrategies logger (logger.py) into a graph and converts it to various outputs.
+The most useful outputs are the dot* commands producing a visualization of the log using the dot-command of graphviz.
+Every strategy is a node in the graph, and the edges are collections or objects that transition between
+two strategies at some point during the log.
+Artificial nodes are created for log entries without an explicit source node. These are the events when a
+collection is created.
+The input to this script is a logfile, a command and optional flags.
+If the name of the logfile includes one of the AVAILABLE_VMS as a substring, the first three global variables
+are automatically configured.
+The script should work without these configurations, but the output will probably not be that pretty.
+To avoid errors, use the -a flag when running without proper configuration.
+"""
+
+# This should contain a full list of storage nodes (strategies).
+# All strategies not included here will be combined into a single "Other"-node, if the -a flag is not given.
+STORAGE_NODES = []
+
+# This allows arbitrary renamings of storage strategy nodes
+NODE_RENAMINGS = {}
+
+# Artificial storage-source nodes are automatically named like the associated operation.
+# This dict allows customizing the names of these nodes.
+STORAGE_SOURCES = {}
+
+def SET_VM(vm_name):
+ global STORAGE_NODES
+ global NODE_RENAMINGS
+ global STORAGE_SOURCES
+ if vm_name == 'RSqueak':
+ STORAGE_NODES = ['List', 'WeakList', 'SmallIntegerOrNil', 'FloatOrNil', 'AllNil']
+ NODE_RENAMINGS = dict((x+'Strategy', x) for x in STORAGE_NODES)
+ STORAGE_SOURCES = {'Filledin': 'Image Loading', 'Initialized': 'Object Creation'}
+ elif vm_name == 'Pycket':
+ STORAGE_SOURCES = {'Created': 'Array Creation'}
+ # TODO
+ elif vm_name == 'Topaz':
+ # TODO
+ pass
+ else:
+ raise Exception("Unhandled vm name %s" % vm_name)
+
+AVAILABLE_VMS = ['RSqueak', 'Pycket', 'Topaz']
+
+# ====================================================================
+# ======== Logfile parsing
+# ====================================================================
+
+def percent(part, total):
+ if total == 0:
+ return 0
+ return float(part)*100 / total
+
+def parse(filename, flags, callback):
+ parsed_entries = 0
+ if filename == "-":
+ opener = lambda: sys.stdin
+ else:
+ opener = lambda: open(filename, 'r', 1)
+ with opener() as file:
+ while True:
+ line = file.readline()
+ if len(line) == 0:
+ break
+ entry = parse_line(line, flags)
+ if entry:
+ parsed_entries += 1
+ callback(entry)
+ return parsed_entries
+
+line_pattern = re.compile(r"^(?P<operation>\w+) \(((?P<old>\w+) -> )?(?P<new>\w+)\)( of (?P<classname>.+))? size (?P<size>[0-9]+)( objects (?P<objects>[0-9]+))?( elements: (?P<classnames>.+( .+)*))?$")
+
+def parse_line(line, flags):
+    result = line_pattern.match(line)
+    if result is None:
+        if flags.verbose:
+            print "Could not parse line: %s" % line[:-1]
+        return None
+    operation = str(result.group('operation'))
+    old_storage = result.group('old')
+    new_storage = str(result.group('new'))
+    classname = str(result.group('classname'))
+    size = int(result.group('size'))
+    objects = result.group('objects')
+    objects = int(objects) if objects else 1
+    classnames = result.group('classnames')
+    if classnames is not None:
+        classnames = classnames.split(' ')
+        classnames = set(classnames)
+    else:
+        classnames = set()
+
+    is_storage_source = old_storage is None
+    if is_storage_source:
+        if operation in STORAGE_SOURCES:
+            old_storage = STORAGE_SOURCES[operation]
+        else:
+            print "Using operation %s as storage source." % operation
+            old_storage = operation
+
+    if new_storage in NODE_RENAMINGS:
+        new_storage = NODE_RENAMINGS[new_storage]
+    if old_storage in NODE_RENAMINGS:
+        old_storage = NODE_RENAMINGS[old_storage]
+
+    return LogEntry(operation, old_storage, new_storage, classname, size, objects, classnames, is_storage_source)
+
+class LogEntry(object):
+
+ def __init__(self, operation, old_storage, new_storage, classname, size, objects, classnames, is_storage_source):
+ self.operation = operation
+ self.old_storage = old_storage
+ self.new_storage = new_storage
+ self.classname = classname
+ self.size = size
+ self.objects = objects
+ self.classnames = classnames
+ self.is_storage_source = is_storage_source
+ assert old_storage != new_storage, "old and new storage identical in log entry: %s" % self
+
+ def full_key(self):
+ return (self.operation, self.old_storage, self.new_storage)
+
+ def __lt__(self, other):
+ return self.classname < other.classname
+
+ def __repr__(self):
+ return "%s(%s)" % (self.__str__(), object.__repr__(self))
+
+ def __str__(self):
+ old_storage_string = "%s -> " % self.old_storage if self.old_storage else ""
+ classname_string = " of %s" % self.classname if self.classname else ""
+ objects_string = " objects %d" % self.objects if self.objects > 1 else ""
+ return "%s (%s%s)%s size %d%s" % (self.operation, old_storage_string, self.new_storage, classname_string, self.size, objects_string)
+
+# ====================================================================
+# ======== Graph parsing
+# ====================================================================
+
+class Operations(object):
+
+ def __init__(self, objects=0, slots=0, element_classnames=[]):
+ self.objects = objects
+ self.slots = slots
+ self.element_classnames = set(element_classnames)
+
+ def __str__(self, total=None):
+ if self.objects == 0:
+ avg_slots = 0
+ else:
+ avg_slots = float(self.slots) / self.objects
+ if total is not None and total.slots != 0:
+ percent_slots = " (%.1f%%)" % percent(self.slots, total.slots)
+ else:
+ percent_slots = ""
+ if total is not None and total.objects != 0:
+ percent_objects = " (%.1f%%)" % percent(self.objects, total.objects)
+ else:
+ percent_objects = ""
+ slots = format(self.slots, ",d")
+ objects = format(self.objects, ",d")
+ classnames = (" [ elements: %s ]" % ' '.join([str(x) for x in self.element_classnames])) \
+ if len(self.element_classnames) else ""
+ return "%s%s slots in %s%s objects (avg size: %.1f)%s" % (slots, percent_slots, objects, percent_objects, avg_slots, classnames)
+
+ def __repr__(self):
+ return "%s(%s)" % (self.__str__(), object.__repr__(self))
+
+ def add_log_entry(self, entry):
+ self.slots = self.slots + entry.size
+ self.objects = self.objects + entry.objects
+ self.element_classnames |= entry.classnames
+
+ def __sub__(self, other):
+ return Operations(self.objects - other.objects, self.slots - other.slots)
+
+ def __add__(self, other):
+ return Operations(self.objects + other.objects, self.slots + other.slots)
+
+ def __lt__(self, other):
+ return self.slots < other.slots
+
+ def empty(self):
+ return self.objects == 0 and self.slots == 0
+
+ def prefixprint(self, key="", total=None):
+ if not self.empty():
+ print "%s%s" % (key, self.__str__(total))
+
+class ClassOperations(object):
+
+ def __init__(self):
+ self.classes = {}
+
+ def cls(self, name):
+ if name not in self.classes:
+ self.classes[name] = Operations()
+ return self.classes[name]
+
+ def total(self):
+ return reduce(operator.add, self.classes.values(), Operations())
+
+ def __str__(self):
+ return "ClassOperations(%s)" % self.classes
+
+ def __repr__(self):
+ return "%s(%s)" % (self.__str__(), object.__repr__(self))
+
+ def __add__(self, other):
+ result = ClassOperations()
+ result.classes = dict(self.classes)
+ for classname, other_class in other.classes.items():
+ result.cls(classname) # Make sure exists.
+ result.classes[classname] += other_class
+ return result
+
+ def __sub__(self, other):
+ result = ClassOperations()
+ result.classes = dict(self.classes)
+ for classname, other_class in other.classes.items():
+ result.cls(classname) # Make sure exists.
+ result.classes[classname] -= other_class
+ return result
+
+class StorageEdge(object):
+
+ def __init__(self, operation="None", origin=None, target=None):
+ self.operation = operation
+ self.classes = ClassOperations()
+ self.origin = origin
+ self.target = target
+ self.is_storage_source = False
+
+ def full_key(self):
+ return (self.operation, self.origin.name, self.target.name)
+
+ def cls(self, classname):
+ return self.classes.cls(classname)
+
+ def total(self):
+ return self.classes.total()
+
+ def notify_nodes(self):
+ self.origin.note_outgoing(self)
+ self.target.note_incoming(self)
+
+ def add_log_entry(self, entry):
+ self.cls(entry.classname).add_log_entry(entry)
+ if entry.is_storage_source:
+ self.is_storage_source = True
+
+ def as_log_entries(self):
+ entries = []
+ for classname, ops in self.classes.classes.items():
+ origin = None if self.is_storage_source else self.origin.name
+ entry = LogEntry(self.operation, origin, self.target.name, classname,
+ ops.slots, ops.objects, ops.element_classnames, self.is_storage_source)
+ entries.append(entry)
+ return entries
+
+ def __lt__(self, other):
+ return self.full_key() < other.full_key()
+
+ def __str__(self):
+ return "[%s %s -> %s]" % (self.operation, self.origin, self.target)
+
+ def __repr__(self):
+ return "%s(%s)" % (self.__str__(), object.__repr__(self))
+
+ def __add__(self, other):
+ origin = self.origin if self.origin is not None else other.origin
+ target = self.target if self.target is not None else other.target
+ result = StorageEdge(self.operation, origin, target)
+ result.classes += self.classes + other.classes
+ return result
+
+ def __sub__(self, other):
+ origin = self.origin if self.origin is not None else other.origin
+ target = self.target if self.target is not None else other.target
+ result = StorageEdge(self.operation, origin, target)
+ result.classes += self.classes - other.classes
+ return result
+
+class StorageNode(object):
+
+ def __init__(self, name):
+ self.name = name
+ self.incoming = set()
+ self.outgoing = set()
+
+ def note_incoming(self, edge):
+ assert edge.target is self
+ if edge not in self.incoming:
+ self.incoming.add(edge)
+
+ def note_outgoing(self, edge):
+ assert edge.origin is self
+ if edge not in self.outgoing:
+ self.outgoing.add(edge)
+
+ def incoming_edges(self, operation):
+ return filter(lambda x: x.operation == operation, self.incoming)
+
+ def outgoing_edges(self, operation):
+ return filter(lambda x: x.operation == operation, self.outgoing)
+
+ def sum_incoming(self, operation):
+ return reduce(operator.add, self.incoming_edges(operation), StorageEdge(operation))
+
+ def sum_outgoing(self, operation):
+ return reduce(operator.add, self.outgoing_edges(operation), StorageEdge(operation))
+
+ def sum_all_incoming(self):
+ return reduce(operator.add, self.incoming, StorageEdge())
+
+ def sum_all_outgoing(self):
+ return reduce(operator.add, self.outgoing, StorageEdge())
+
+ def __str__(self):
+ return self.name
+
+ def __repr__(self):
+ return "%s(%s)" % (self.__str__(), object.__repr__(self))
+
+ def merge_edge_sets(self, set1, set2, key_slot):
+ getter = lambda edge: edge.__dict__[key_slot]
+ set_dict = dict([(getter(edge), edge) for edge in set1])
+ for edge in set2:
+ key = getter(edge)
+ if key not in set_dict:
+ set_dict[key] = edge
+ else:
+ set_dict[key] += edge
+ return set(set_dict.values())
+
+ def __add__(self, other):
+ result = StorageNode("%s %s" % (self.name, other.name))
+ result.incoming = self.merge_edge_sets(self.incoming, other.incoming, "origin")
+ # TODO bad code
+ for edge in result.incoming:
+ edge.target = result
+ result.outgoing = self.merge_edge_sets(self.outgoing, other.outgoing, "target")
+ for edge in result.outgoing:
+ edge.origin = result
+ return result
+
+ def __lt__(self, other):
+ return self.name < other.name
+
+ def is_artificial(self):
+ for outgoing in self.outgoing:
+ if outgoing.is_storage_source:
+ return True
+ return False
+
+ def is_storage_node(self):
+ return self.is_artificial() or self.name in STORAGE_NODES
+
+ def dot_name(self):
+ return self.name.replace(" ", "_")
+
+class StorageGraph(object):
+
+ def __init__(self):
+ self.nodes = {}
+ self.edges = {}
+ self.operations = set()
+
+ def node(self, name):
+ if str(name) == 'None':
+ raise ValueError("Invalid node name: %r" % (name,))
+ if name not in self.nodes:
+ self.nodes[name] = StorageNode(name)
+ return self.nodes[name]
+
+ def assert_sanity(self):
+ visited_edges = set()
+ for node in self.nodes.values():
+ for edge in node.incoming:
+ assert edge in self.edges.values(), "Edge not in graph's edges: %s" % edge
+ visited_edges.add(edge)
+ if not edge.target is node:
+ print "Wrong edge target: %s\nIncoming edge: %s\nIn node: %s" % (edge.target, edge, node)
+ assert False
+ if not edge in edge.origin.outgoing:
+ print "Edge not in origin's outgoing: %s\nIncoming edge: %s\nIn node: %s" % (edge.origin.outgoing, edge, node)
+ assert False
+ for edge in node.outgoing:
+ assert edge in self.edges.values(), "Edge not in graph's edges: %s" % edge
+ visited_edges.add(edge)
+ if not edge.origin is node:
+ print "Wrong edge origin: %s\nOutgoing edge: %s\nIn node: %s" % (edge.origin, edge, node)
+ assert False
+ if not edge in edge.target.incoming:
+ print "Edge not in target's incoming: %s\nOutgoing edge: %s\nIn node: %s" % (edge.target.incoming, edge, node)
+ assert False
+ assert len(visited_edges) == len(self.edges.values()), "Not all of graph's edges visited."
+
+ def add_log_entry(self, log_entry):
+ self.operations.add(log_entry.operation)
+ key = log_entry.full_key()
+ if key not in self.edges:
+ edge = StorageEdge(log_entry.operation, self.node(log_entry.old_storage), self.node(log_entry.new_storage))
+ self.edges[key] = edge
+ edge.notify_nodes()
+ self.edges[key].add_log_entry(log_entry)
+
+ def collapse_nodes(self, collapsed_nodes, new_name=None):
+ if len(collapsed_nodes) == 0:
+ return
+ for node in collapsed_nodes:
+ del self.nodes[node.name]
+ for edge in node.incoming:
+ del self.edges[edge.full_key()]
+ for edge in node.outgoing:
+ del self.edges[edge.full_key()]
+ new_node = reduce(operator.add, collapsed_nodes)
+ if new_name is not None:
+ new_node.name = new_name
+ self.nodes[new_node.name] = new_node
+ # TODO bad code
+ for node in collapsed_nodes:
+ for edge in node.incoming:
+ edge.origin.outgoing.remove(edge)
+ new_edges = filter(lambda filtered: filtered.origin == edge.origin, new_node.incoming)
+ assert len(new_edges) == 1
+ edge.origin.outgoing.add(new_edges[0])
+ for edge in node.outgoing:
+ edge.target.incoming.remove(edge)
+ new_edges = filter(lambda filtered: filtered.target == edge.target, new_node.outgoing)
+ assert len(new_edges) == 1
+ edge.target.incoming.add(new_edges[0])
+ for edge in new_node.incoming:
+ self.edges[edge.full_key()] = edge
+ for edge in new_node.outgoing:
+ self.edges[edge.full_key()] = edge
+ self.assert_sanity()
+
+ def collapse_nonstorage_nodes(self, new_name=None):
+ nodes = filter(lambda x: not x.is_storage_node(), self.nodes.values())
+ self.collapse_nodes(nodes, new_name)
+
+ def sorted_nodes(self):
+ nodes = self.nodes.values()
+ nodes.sort()
+ return nodes
+
+def make_graph(logfile, flags):
+ graph = StorageGraph()
+ def callback(entry):
+ graph.add_log_entry(entry)
+ parse(logfile, flags, callback)
+ graph.assert_sanity()
+ return graph
+
+# ====================================================================
+# ======== Command - Summarize log content
+# ====================================================================
+
+def command_summarize(logfile, flags):
+ graph = make_graph(logfile, flags)
+ if not flags.allstorage:
+ graph.collapse_nonstorage_nodes()
+ for node in graph.sorted_nodes():
+ node.print_summary(flags, graph.operations)
+
+def StorageNode_print_summary(self, flags, all_operations):
+ print "\n%s:" % self.name
+ sum = StorageEdge()
+ total_incoming = self.sum_all_incoming().total() if flags.percent else None
+
+ print "\tIncoming:"
+ for operation in all_operations:
+ if flags.detailed:
+ edges = [ (edge.origin.name, edge) for edge in self.incoming_edges(operation) ]
+ else:
+ edges = [ (operation, self.sum_incoming(operation)) ]
+ for edgename, edge in edges:
+ edge.print_with_name("\t\t\t", edgename, total_incoming, flags)
+ sum += edge
+
+ print "\tOutgoing:"
+ for operation in all_operations:
+ if flags.detailed:
+ edges = [ (edge.target.name, edge) for edge in self.outgoing_edges(operation) ]
+ else:
+ edges = [ (operation, self.sum_outgoing(operation)) ]
+ for edgename, edge in edges:
+ edge.print_with_name("\t\t\t", edgename, total_incoming, flags)
+ sum -= edge
+
+ sum.print_with_name("\t", "Remaining", total_incoming, flags)
+
+StorageNode.print_summary = StorageNode_print_summary
+
+def StorageEdge_print_with_name(self, prefix, edgename, total_reference, flags):
+ if flags.classes:
+ print "%s%s:" % (prefix, edgename)
+ prefix += "\t\t"
+ operations = self.classes.classes.items()
+ operations.sort(reverse=True, key=operator.itemgetter(1))
+ else:
+ operations = [ (edgename, self.total()) ]
+ for classname, classops in operations:
+ classops.prefixprint("%s%s: " % (prefix, classname), total_reference)
+
+StorageEdge.print_with_name = StorageEdge_print_with_name
+
+# ====================================================================
+# ======== Command - DOT output
+# ====================================================================
+
+# Output is valid dot code and can be parsed by the graphviz dot utility.
+def command_print_dot(logfile, flags):
+ graph = make_graph(logfile, flags)
+ print "/*"
+ print "Storage Statistics (dot format):"
+ print "================================"
+ print "*/"
+ print dot_string(graph, flags)
+
+def run_dot(logfile, flags, output_type):
+ import subprocess
+ dot = dot_string(make_graph(logfile, flags), flags)
+ command = ["dot", "-T%s" % output_type, "-o%s.%s" % (flags.logfile, output_type)]
+ print "Running:\n%s" % " ".join(command)
+ p = subprocess.Popen(command, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
+ output = p.communicate(input=dot)[0]
+ print output
+
+def command_dot(logfile, flags):
+ run_dot(logfile, flags, "jpg")
+def command_dot_ps(logfile, flags):
+ run_dot(logfile, flags, "ps")
+def command_dot_pdf(logfile, flags):
+ run_dot(logfile, flags, "pdf")
+def command_dot_svg(logfile, flags):
+ run_dot(logfile, flags, "svg")
+
+def dot_string(graph, flags):
+ result = "digraph G {"
+ incoming_cache = {}
+ if not flags.allstorage:
+ graph.collapse_nonstorage_nodes("Other")
+
+ def make_label(edge, prefix="", total_edge=None, slots_per_object=False):
+ object_suffix = " objects"
+ slots_suffix = " slots"
+ if not flags.objects or not flags.slots:
+ object_suffix = slots_suffix = ""
+ if total_edge and flags.percent and total_edge.objects != 0:
+ percent_objects = " (%.1f%%)" % percent(edge.objects, total_edge.objects)
+ percent_slots = " (%.1f%%)" % percent(edge.slots, total_edge.slots)
+ else:
+ percent_objects = percent_slots = ""
+ label = ""
+ if flags.objects:
+ label += "%s%s%s%s<BR/>" % (prefix, format(edge.objects, ",.0f"), object_suffix, percent_objects)
+ if flags.slots:
+ label += "%s%s%s%s<BR/>" % (prefix, format(edge.slots, ",.0f"), slots_suffix, percent_slots)
+ if slots_per_object and flags.slotsPerObject and edge.objects != 0:
+ label += "%.1f slots/object<BR/>" % (float(edge.slots) / edge.objects)
+ return label
+
+ for node in graph.nodes.values():
+ incoming = node.sum_all_incoming().total()
+ outgoing = node.sum_all_outgoing().total()
+ remaining = incoming - outgoing
+ if node.is_artificial():
+ incoming_cache[node.name] = outgoing
+ shape = ",shape=box"
+ label = make_label(outgoing)
+ else:
+ incoming_cache[node.name] = incoming
+ shape = ""
+ label = make_label(incoming, "Incoming: ")
+ if remaining.objects != incoming.objects:
+ label += make_label(remaining, "Remaining: ", incoming)
+ result += "%s [label=<<B><U>%s</U></B><BR/>%s>%s];" % (node.dot_name(), node.name, label, shape)
+
+ for edge in graph.edges.values():
+ total = edge.total()
+ incoming = incoming_cache[edge.origin.name]
+ label = make_label(total, "", incoming, slots_per_object=True)
+ target_node = edge.target.dot_name()
+ source_node = edge.origin.dot_name()
+ result += "%s -> %s [label=<%s>];" % (source_node, target_node, label)
+
+ result += "}"
+ return result
+
+# ====================================================================
+# ======== Other commands
+# ====================================================================
+
+def command_aggregate(logfile, flags):
+ graph = make_graph(logfile, flags)
+ edges = graph.edges.values()
+ edges.sort()
+ for edge in edges:
+ logentries = edge.as_log_entries()
+ logentries.sort()
+ for entry in logentries:
+ print entry
+
+def command_print_entries(logfile, flags):
+ def callback(entry):
+ print entry
+ parse(logfile, flags, callback)
+
+# ====================================================================
+# ======== Main
+# ====================================================================
+
+class Flags(object):
+
+ def __init__(self, flags):
+ self.flags = {}
+ for name, short in flags:
+ self.__dict__[name] = False
+ self.flags[short] = name
+
+ def handle(self, arg):
+ if arg in self.flags:
+ self.__dict__[self.flags[arg]] = True
+ return True
+ else:
+ return False
+
+ def __str__(self):
+ descriptions = [ ("%s (%s)" % description) for description in self.flags.items() ]
+ return "[%s]" % " | ".join(descriptions)
+
+def usage(flags, commands):
+ print "Arguments: logfile command %s" % flags
+ print "Available commands: %s" % commands
+ sys.exit(1)
+
+def main(argv):
+ flags = Flags([
+ # General
+ ('verbose', '-v'),
+
+ # All outputs
+ ('percent', '-p'),
+ ('allstorage', '-a'),
+
+ # Text outputs
+ ('detailed', '-d'),
+ ('classes', '-c'),
+
+ # dot outputs
+ ('slots', '-s'),
+ ('objects', '-o'),
+ ('slotsPerObject', '-S'),
+ ])
+
+ command_prefix = "command_"
+ module = sys.modules[__name__].__dict__
+ commands = [ a[len(command_prefix):] for a in module.keys() if a.startswith(command_prefix) ]
+
+ if len(argv) < 2:
+ usage(flags, commands)
+ logfile = argv[0]
+ flags.logfile = logfile
+ for vm_name in AVAILABLE_VMS:
+ if vm_name in logfile:
+ print "Using VM configuration %s" % vm_name
+ SET_VM(vm_name)
+ break
+ command = argv[1]
+ for flag in argv[2:]:
+ if not flags.handle(flag):
+ usage(flags, commands)
+ if command not in commands:
+ usage(flags, commands)
+
+ func = module[command_prefix + command]
+ func(logfile, flags)
+
+if __name__ == "__main__":
+ main(sys.argv[1:])
diff --git a/rpython/rlib/rstrategies/rstrategies.py b/rpython/rlib/rstrategies/rstrategies.py
new file mode 100644
--- /dev/null
+++ b/rpython/rlib/rstrategies/rstrategies.py
@@ -0,0 +1,570 @@
+
+import weakref, sys
+from rpython.rlib.rstrategies import logger
+from rpython.rlib import jit, objectmodel, rerased
+from rpython.rlib.objectmodel import specialize
+
+def make_accessors(strategy='strategy', storage='storage'):
+ """
+ Instead of using this generator, the methods can be implemented manually.
+ A third way is to overwrite the getter/setter methods in StrategyFactory.
+ """
+ def make_getter(attr):
+ def getter(self): return getattr(self, attr)
+ return getter
+ def make_setter(attr):
+ def setter(self, val): setattr(self, attr, val)
+ return setter
+ classdef = sys._getframe(1).f_locals
+ classdef['_get_strategy'] = make_getter(strategy)
+ classdef['_set_strategy'] = make_setter(strategy)
+ classdef['_get_storage'] = make_getter(storage)
+ classdef['_set_storage'] = make_setter(storage)
+
+class StrategyMetaclass(type):
+ """
+ A metaclass is required, because we need certain attributes to be special
+ for every single strategy class.
+ """
+ def __new__(self, name, bases, attrs):
+ attrs['_is_strategy'] = False
+ attrs['_is_singleton'] = False
+ attrs['_specializations'] = []
+ # Not every strategy uses rerased-pairs, but they won't hurt
+ erase, unerase = rerased.new_erasing_pair(name)
+ def get_storage(self, w_self):
+ erased = self.strategy_factory().get_storage(w_self)
+ return unerase(erased)
+ def set_storage(self, w_self, storage):
+ erased = erase(storage)
+ self.strategy_factory().set_storage(w_self, erased)
+ attrs['get_storage'] = get_storage
+ attrs['set_storage'] = set_storage
+ return type.__new__(self, name, bases, attrs)
+
+def strategy(generalize=None, singleton=True):
+ """
+ Strategy classes must be decorated with this.
+ generalize is a list of other strategies that the decorated strategy can be switched to.
+ If the singleton flag is set to False, new strategy instances will be created,
+ instead of always reusing the singleton object.
+ """
+ def decorator(strategy_class):
+ # Patch strategy class: Add generalized_strategy_for and mark as strategy class.
+ if generalize:
+ @jit.unroll_safe
+ def generalized_strategy_for(self, value):
+ # TODO - optimize this method
+ for strategy in generalize:
+ if self.strategy_factory().strategy_singleton_instance(strategy).check_can_handle(value):
+ return strategy
+ raise Exception("Could not find generalized strategy for %s coming from %s" % (value, self))
+ strategy_class.generalized_strategy_for = generalized_strategy_for
+ for generalized in generalize:
+ generalized._specializations.append(strategy_class)
+ strategy_class._is_strategy = True
+ strategy_class._generalizations = generalize
+ strategy_class._is_singleton = singleton
+ return strategy_class
+ return decorator
+
+class StrategyFactory(object):
+ _immutable_fields_ = ["strategies[*]", "logger", "strategy_singleton_field"]
+ factory_instance_counter = 0
+
+ def __init__(self, root_class, all_strategy_classes=None):
+ if all_strategy_classes is None:
+ all_strategy_classes = self.collect_subclasses(root_class)
+ self.strategies = []
+ self.logger = logger.Logger()
+
+ # This is to avoid confusion between multiple factories existing simultaneously (e.g. in tests)
+ self.strategy_singleton_field = "__singleton_%i" % StrategyFactory.factory_instance_counter
+ StrategyFactory.factory_instance_counter += 1
+
+ self.create_strategy_instances(root_class, all_strategy_classes)
+
+ def create_strategy_instances(self, root_class, all_strategy_classes):
+ for strategy_class in all_strategy_classes:
+ if strategy_class._is_strategy:
+ setattr(strategy_class, self.strategy_singleton_field, self.instantiate_strategy(strategy_class))
+ self.strategies.append(strategy_class)
+ self.patch_strategy_class(strategy_class, root_class)
+ self.order_strategies()
+
+ # =============================
+ # API methods
+ # =============================
+
+ def switch_strategy(self, w_self, new_strategy_type, new_element=None):
+ """
+ Switch the strategy of w_self to the new type.
+ new_element can be given as a hint, purely for logging purposes.
+ It should be the object that was added to w_self, causing the strategy switch.
+ """
+ old_strategy = self.get_strategy(w_self)
+ if new_strategy_type._is_singleton:
+ new_strategy = self.strategy_singleton_instance(new_strategy_type)
+ else:
+ size = old_strategy.size(w_self)
+ new_strategy = self.instantiate_strategy(new_strategy_type, w_self, size)
+ self.set_strategy(w_self, new_strategy)
+ old_strategy.convert_storage_to(w_self, new_strategy)
+ new_strategy.strategy_switched(w_self)
+ self.log(w_self, new_strategy, old_strategy, new_element)
+ return new_strategy
+
+ def set_initial_strategy(self, w_self, strategy_type, size, elements=None):
+ """
+ Initialize the strategy and storage fields of w_self.
+ This must be called before switch_strategy or any strategy method can be used.
+ elements is an optional list of values initially stored in w_self.
+ If given, then len(elements) == size must hold.
+ """
+ assert self.get_strategy(w_self) is None, "Strategy should not be initialized yet!"
+ if strategy_type._is_singleton:
+ strategy = self.strategy_singleton_instance(strategy_type)
+ else:
+ strategy = self.instantiate_strategy(strategy_type, w_self, size)
+ self.set_strategy(w_self, strategy)
+ strategy.initialize_storage(w_self, size)
+ element = None
+ if elements:
+ strategy.store_all(w_self, elements)
+ if len(elements) > 0: element = elements[0]
+ strategy.strategy_switched(w_self)
+ self.log(w_self, strategy, None, element)
+ return strategy
+
+ @jit.unroll_safe
+ def strategy_type_for(self, objects):
+ """
+ Return the best-fitting strategy to hold all given objects.
+ """
+ specialized_strategies = len(self.strategies)
+ can_handle = [True] * specialized_strategies
+ for obj in objects:
+ if specialized_strategies <= 1:
+ break
+ for i, strategy in enumerate(self.strategies):
+ if can_handle[i] and not self.strategy_singleton_instance(strategy).check_can_handle(obj):
+ can_handle[i] = False
+ specialized_strategies -= 1
+ for i, strategy_type in enumerate(self.strategies):
+ if can_handle[i]:
+ return strategy_type
+ raise Exception("Could not find strategy to handle: %s" % objects)
+
+ def decorate_strategies(self, transitions):
+ """
+ As an alternative to decorating all strategies with @strategy,
+ invoke this in the constructor of your StrategyFactory subclass, before
+ calling __init__. transitions is a dict mapping all strategy classes to
+ their 'generalize' list parameter (see @strategy decorator).
+ """
+ "NOT_RPYTHON"
+ for strategy_class, generalized in transitions.items():
+ strategy(generalized)(strategy_class)
+
+ # =============================
+ # The following methods can be overwritten to customize certain aspects of the factory.
+ # =============================
+
+ def instantiate_strategy(self, strategy_type, w_self=None, initial_size=0):
+ """
+ Return a functional instance of strategy_type.
+ Overwrite this if you need a non-default constructor.
+ The two additional parameters should be ignored for singleton-strategies.
+ """
+ return strategy_type()
+
+ def log(self, w_self, new_strategy, old_strategy=None, new_element=None):
+ """
+ This can be overwritten into a more appropriate call to self.logger.log
+ """
+ if not self.logger.active: return
+ new_strategy_str = self.log_string_for_object(new_strategy)
+ old_strategy_str = self.log_string_for_object(old_strategy)
+ element_typename = self.log_string_for_object(new_element)
+ size = new_strategy.size(w_self)
+ typename = ""
+ cause = "Switched" if old_strategy else "Created"
+ self.logger.log(new_strategy_str, size, cause, old_strategy_str, typename, element_typename)
+
+ @specialize.call_location()
+ def log_string_for_object(self, obj):
+ """
+ This can be overwritten instead of the entire log() method.
+ Keep the specialize-annotation in order to handle different kinds of objects here.
+ """
+ return obj.__class__.__name__ if obj else ""
+
+ # These storage accessors are specialized because the storage field is
+ # populated by erased-objects which seem to be incompatible sometimes.
+ @specialize.call_location()
+ def get_storage(self, obj):
+ return obj._get_storage()
+ @specialize.call_location()
+ def set_storage(self, obj, val):
+ return obj._set_storage(val)
+
+ def get_strategy(self, obj):
+ return obj._get_strategy()
+ def set_strategy(self, obj, val):
+ return obj._set_strategy(val)
+
+ # =============================
+ # Internal methods
+ # =============================
+
+ def patch_strategy_class(self, strategy_class, root_class):
+ "NOT_RPYTHON"
+ # Patch root class: Add default handler for visitor
+ def convert_storage_from_OTHER(self, w_self, previous_strategy):
+ self.convert_storage_from(w_self, previous_strategy)
+ funcname = "convert_storage_from_" + strategy_class.__name__
+ convert_storage_from_OTHER.func_name = funcname
+ setattr(root_class, funcname, convert_storage_from_OTHER)
+
+ # Patch strategy class: Add polymorphic visitor function
+ def convert_storage_to(self, w_self, new_strategy):
+ getattr(new_strategy, funcname)(w_self, self)
+ strategy_class.convert_storage_to = convert_storage_to
+
+ def collect_subclasses(self, cls):
+ "NOT_RPYTHON"
+ subclasses = []
+ for subcls in cls.__subclasses__():
+ subclasses.append(subcls)
+ subclasses.extend(self.collect_subclasses(subcls))
+ return subclasses
+
+ def order_strategies(self):
+ "NOT_RPYTHON"
+ def get_generalization_depth(strategy, visited=None):
+ if visited is None:
+ visited = set()
+ if strategy._generalizations:
+ if strategy in visited:
+ raise Exception("Cycle in generalization-tree of %s" % strategy)
+ visited.add(strategy)
+ depth = 0
+ for generalization in strategy._generalizations:
+ other_depth = get_generalization_depth(generalization, visited)
+ depth = max(depth, other_depth)
+ return depth + 1
+ else:
+ return 0
+ self.strategies.sort(key=get_generalization_depth, reverse=True)
+
+ @jit.elidable
+ def strategy_singleton_instance(self, strategy_class):
+ return getattr(strategy_class, self.strategy_singleton_field)
+
+ def _freeze_(self):
+ # Instance will be frozen at compile time, making accesses constant.
+ # The constructor does meta stuff which is not possible after translation.
+ return True
+
+class AbstractStrategy(object):
+ """
+ == Required:
+ strategy_factory(self) - Access to StrategyFactory
+ """
+
+ def strategy_switched(self, w_self):
+ # Overwrite this method for a hook whenever the strategy
+ # of w_self was switched to self.
+ pass
+
+ # Main Fixedsize API
+
+ def store(self, w_self, index0, value):
+ raise NotImplementedError("Abstract method")
+
+ def fetch(self, w_self, index0):
+ raise NotImplementedError("Abstract method")
+
+ def size(self, w_self):
+ raise NotImplementedError("Abstract method")
+
+ # Fixedsize utility methods
+
+ def slice(self, w_self, start, end):
+ return [ self.fetch(w_self, i) for i in range(start, end)]
+
+ def fetch_all(self, w_self):
+ return self.slice(w_self, 0, self.size(w_self))
+
+ def store_all(self, w_self, elements):
+ for i, e in enumerate(elements):
+ self.store(w_self, i, e)
+
+ # Main Varsize API
+
+ def insert(self, w_self, index0, list_w):
+ raise NotImplementedError("Abstract method")
+
+ def delete(self, w_self, start, end):
+ raise NotImplementedError("Abstract method")
+
+ # Varsize utility methods
+
+ def append(self, w_self, list_w):
+ self.insert(w_self, self.size(w_self), list_w)
+
+ def pop(self, w_self, index0):
+ e = self.fetch(w_self, index0)
+ self.delete(w_self, index0, index0+1)
+ return e
+
+ # Internal methods
+
+ def initialize_storage(self, w_self, initial_size):
+ raise NotImplementedError("Abstract method")
+
+ def check_can_handle(self, value):
+ raise NotImplementedError("Abstract method")
+
+ def convert_storage_to(self, w_self, new_strategy):
+ # This will be overwritten in patch_strategy_class
+ new_strategy.convert_storage_from(w_self, self)
+
+ @jit.unroll_safe
+ def convert_storage_from(self, w_self, previous_strategy):
+ # This is a very inefficient (but most generic) way to do this.
+ # Subclasses should specialize.
+ storage = previous_strategy.fetch_all(w_self)
+ self.initialize_storage(w_self, previous_strategy.size(w_self))
+ for i, field in enumerate(storage):
+ self.store(w_self, i, field)
+
+ def generalize_for_value(self, w_self, value):
+ strategy_type = self.generalized_strategy_for(value)
+ new_instance = self.strategy_factory().switch_strategy(w_self, strategy_type, new_element=value)
+ return new_instance
+
+ def cannot_handle_store(self, w_self, index0, value):
+ new_instance = self.generalize_for_value(w_self, value)
+ new_instance.store(w_self, index0, value)
+
+ def cannot_handle_insert(self, w_self, index0, list_w):
+ # TODO - optimize. Prevent multiple generalizations and slicing done by callers.
+ new_strategy = self.generalize_for_value(w_self, list_w[0])
+ new_strategy.insert(w_self, index0, list_w)
+
+# ============== Special Strategies with no storage array ==============
+
+class EmptyStrategy(AbstractStrategy):
+ # == Required:
+ # See AbstractStrategy
+
+ def initialize_storage(self, w_self, initial_size):
+ assert initial_size == 0
+ self.set_storage(w_self, None)
+ def convert_storage_from(self, w_self, previous_strategy):
+ self.set_storage(w_self, None)
+ def fetch(self, w_self, index0):
+ raise IndexError
+ def store(self, w_self, index0, value):
+ self.cannot_handle_insert(w_self, index0, [value])
+ def insert(self, w_self, index0, list_w):
+ self.cannot_handle_insert(w_self, index0, list_w)
+ def delete(self, w_self, start, end):
+ self.check_index_range(w_self, start, end)
+ def size(self, w_self):
+ return 0
+ def check_can_handle(self, value):
+ return False
+
+class SingleValueStrategyStorage(object):
+ """Small container object for a size value."""
+ _attrs_ = ['size']
+ def __init__(self, size=0):
+ self.size = size
+
+class SingleValueStrategy(AbstractStrategy):
+ # == Required:
+ # See AbstractStrategy
+ # check_index_*(...) - use mixin SafeIndexingMixin or UnsafeIndexingMixin
+ # value(self) - the single value contained in this strategy. Should be constant.
+
+ def initialize_storage(self, w_self, initial_size):
+ storage_obj = SingleValueStrategyStorage(initial_size)
+ self.set_storage(w_self, storage_obj)
+ def convert_storage_from(self, w_self, previous_strategy):
+ self.initialize_storage(w_self, previous_strategy.size(w_self))
+
+ def fetch(self, w_self, index0):
+ self.check_index_fetch(w_self, index0)
+ return self.value()
+ def store(self, w_self, index0, value):
+ self.check_index_store(w_self, index0)
+ if self.check_can_handle(value):
+ return
+ self.cannot_handle_store(w_self, index0, value)
+
+ @jit.unroll_safe
+ def insert(self, w_self, index0, list_w):
+ storage_obj = self.get_storage(w_self)
+ for i in range(len(list_w)):
+ if self.check_can_handle(list_w[i]):
+ storage_obj.size += 1
+ else:
+ self.cannot_handle_insert(w_self, index0 + i, list_w[i:])
+ return
+
+ def delete(self, w_self, start, end):
+ self.check_index_range(w_self, start, end)
+ self.get_storage(w_self).size -= (end - start)
+ def size(self, w_self):
+ return self.get_storage(w_self).size
+ def check_can_handle(self, value):
+ return value is self.value()
+
+# ============== Basic strategies with storage ==============
+
+class StrategyWithStorage(AbstractStrategy):
+ # == Required:
+ # See AbstractStrategy
+ # check_index_*(...) - use mixin SafeIndexingMixin or UnsafeIndexingMixin
+ # default_value(self) - The value to be initially contained in this strategy
+
+ def initialize_storage(self, w_self, initial_size):
+ default = self._unwrap(self.default_value())
+ self.set_storage(w_self, [default] * initial_size)
+
+ @jit.unroll_safe
+ def convert_storage_from(self, w_self, previous_strategy):
+ size = previous_strategy.size(w_self)
+ new_storage = [ self._unwrap(previous_strategy.fetch(w_self, i))
+ for i in range(size) ]
+ self.set_storage(w_self, new_storage)
+
+ def store(self, w_self, index0, wrapped_value):
+ self.check_index_store(w_self, index0)
+ if self.check_can_handle(wrapped_value):
+ unwrapped = self._unwrap(wrapped_value)
+ self.get_storage(w_self)[index0] = unwrapped
+ else:
+ self.cannot_handle_store(w_self, index0, wrapped_value)
+
+ def fetch(self, w_self, index0):
+ self.check_index_fetch(w_self, index0)
+ unwrapped = self.get_storage(w_self)[index0]
+ return self._wrap(unwrapped)
+
+ def _wrap(self, value):
+ raise NotImplementedError("Abstract method")
+
+ def _unwrap(self, value):
+ raise NotImplementedError("Abstract method")
+
+ def size(self, w_self):
+ return len(self.get_storage(w_self))
+
+ @jit.unroll_safe
+ def insert(self, w_self, start, list_w):
+ if start > self.size(w_self):
+ start = self.size(w_self)
+ for i in range(len(list_w)):
+ if self.check_can_handle(list_w[i]):
+ self.get_storage(w_self).insert(start + i, self._unwrap(list_w[i]))
+ else:
+ self.cannot_handle_insert(w_self, start + i, list_w[i:])
+ return
+
+ def delete(self, w_self, start, end):
+ self.check_index_range(w_self, start, end)
+ assert start >= 0 and end >= 0
+ del self.get_storage(w_self)[start : end]
+
+class GenericStrategy(StrategyWithStorage):
+ # == Required:
+ # See StrategyWithStorage
+
+ def _wrap(self, value):
+ return value
+ def _unwrap(self, value):
+ return value
+ def check_can_handle(self, wrapped_value):
+ return True
+
+class WeakGenericStrategy(StrategyWithStorage):
+ # == Required:
+ # See StrategyWithStorage
+
+ def _wrap(self, value):
+ return value() or self.default_value()
+ def _unwrap(self, value):
+ assert value is not None
+ return weakref.ref(value)
+ def check_can_handle(self, wrapped_value):
+ return True
+
+# ============== Mixins for index checking operations ==============
+
+class SafeIndexingMixin(object):
+ def check_index_store(self, w_self, index0):
+ self.check_index(w_self, index0)
+ def check_index_fetch(self, w_self, index0):
+ self.check_index(w_self, index0)
+ def check_index_range(self, w_self, start, end):
+ if end < start:
+ raise IndexError
+ self.check_index(w_self, start)
+ self.check_index(w_self, end)
+ def check_index(self, w_self, index0):
+ if index0 < 0 or index0 >= self.size(w_self):
+ raise IndexError
+
+class UnsafeIndexingMixin(object):
+ def check_index_store(self, w_self, index0):
+ pass
+ def check_index_fetch(self, w_self, index0):
+ pass
+ def check_index_range(self, w_self, start, end):
+ pass
+
+# ============== Specialized Storage Strategies ==============
+
+class SpecializedStrategy(StrategyWithStorage):
+ # == Required:
+ # See StrategyWithStorage
+ # wrap(self, value) - Return a boxed object for the primitive value
+ # unwrap(self, value) - Return the unboxed primitive value of value
+
+ def _unwrap(self, value):
+ return self.unwrap(value)
+ def _wrap(self, value):
+ return self.wrap(value)
+
+class SingleTypeStrategy(SpecializedStrategy):
+ # == Required Functions:
+ # See SpecializedStrategy
+ # contained_type - The wrapped type that can be stored in this strategy
+
+ def check_can_handle(self, value):
+ return isinstance(value, self.contained_type)
+
+class TaggingStrategy(SingleTypeStrategy):
+ """This strategy uses a special tag value to represent a single additional object."""
+ # == Required:
+ # See SingleTypeStrategy
+ # wrapped_tagged_value(self) - The tagged object
+ # unwrapped_tagged_value(self) - The unwrapped tag value representing the tagged object
+
+ def check_can_handle(self, value):
+ return value is self.wrapped_tagged_value() or \
+ (isinstance(value, self.contained_type) and \
+ self.unwrap(value) != self.unwrapped_tagged_value())
+
+ def _unwrap(self, value):
+ if value is self.wrapped_tagged_value():
+ return self.unwrapped_tagged_value()
+ return self.unwrap(value)
+
+ def _wrap(self, value):
+ if value == self.unwrapped_tagged_value():
+ return self.wrapped_tagged_value()
+ return self.wrap(value)
diff --git a/rpython/rlib/rstrategies/test.py b/rpython/rlib/rstrategies/test.py
new file mode 100644
--- /dev/null
+++ b/rpython/rlib/rstrategies/test.py
@@ -0,0 +1,434 @@
+
+import py
+from rpython.rlib.rstrategies import rstrategies as rs
+from rpython.rlib.objectmodel import import_from_mixin
+
+# === Define small model tree
+
+class W_AbstractObject(object):
+ pass
+
+class W_Object(W_AbstractObject):
+ pass
+
+class W_Integer(W_AbstractObject):
+ def __init__(self, value):
+ self.value = value
+ def __eq__(self, other):
+ return isinstance(other, W_Integer) and self.value == other.value
+
+class W_List(W_AbstractObject):
+ rs.make_accessors()
+ def __init__(self, strategy=None, size=0, elements=None):
+ self.strategy = None
+ if strategy:
+ factory.set_initial_strategy(self, strategy, size, elements)
+ def fetch(self, i):
+ assert self.strategy
+ return self.strategy.fetch(self, i)
+ def store(self, i, value):
+ assert self.strategy
+ return self.strategy.store(self, i, value)
+ def size(self):
+ assert self.strategy
+ return self.strategy.size(self)
+ def insert(self, index0, list_w):
+ assert self.strategy
+ return self.strategy.insert(self, index0, list_w)
+ def delete(self, start, end):
+ assert self.strategy
+ return self.strategy.delete(self, start, end)
+ def append(self, list_w):
+ assert self.strategy
+ return self.strategy.append(self, list_w)
+ def pop(self, index0):
+ assert self.strategy
+ return self.strategy.pop(self, index0)
+ def slice(self, start, end):
+ assert self.strategy
+ return self.strategy.slice(self, start, end)
+ def fetch_all(self):
+ assert self.strategy
+ return self.strategy.fetch_all(self)
+ def store_all(self, elements):
+ assert self.strategy
+ return self.strategy.store_all(self, elements)
+
+w_nil = W_Object()
+
+# === Define concrete strategy classes
+
+class AbstractStrategy(object):
+ __metaclass__ = rs.StrategyMetaclass
+ import_from_mixin(rs.AbstractStrategy)
+ import_from_mixin(rs.SafeIndexingMixin)
+ def __init__(self, factory, w_self=None, size=0):
+ self.factory = factory
+ def strategy_factory(self):
+ return self.factory
+
+class Factory(rs.StrategyFactory):
+ switching_log = []
+
+ def __init__(self, root_class):
+ self.decorate_strategies({
+ EmptyStrategy: [GenericStrategy],
+ NilStrategy: [IntegerOrNilStrategy, GenericStrategy],
+ GenericStrategy: [],
+ WeakGenericStrategy: [],
+ IntegerStrategy: [IntegerOrNilStrategy, GenericStrategy],
+ IntegerOrNilStrategy: [GenericStrategy],
+ })
+ rs.StrategyFactory.__init__(self, root_class)
+
+ def instantiate_strategy(self, strategy_type, w_self=None, size=0):
+ return strategy_type(self, w_self, size)
+
+ def set_strategy(self, w_list, strategy):
+ old_strategy = self.get_strategy(w_list)
+ self.switching_log.append((old_strategy, strategy))
+ super(Factory, self).set_strategy(w_list, strategy)
+
+ def clear_log(self):
+ del self.switching_log[:]
+
+class EmptyStrategy(AbstractStrategy):
+ import_from_mixin(rs.EmptyStrategy)
+
+class NilStrategy(AbstractStrategy):
+ import_from_mixin(rs.SingleValueStrategy)
+ def value(self): return w_nil
+
+class GenericStrategy(AbstractStrategy):
+ import_from_mixin(rs.GenericStrategy)
+ import_from_mixin(rs.UnsafeIndexingMixin)
+ def default_value(self): return w_nil
+
+class WeakGenericStrategy(AbstractStrategy):
+ import_from_mixin(rs.WeakGenericStrategy)
+ def default_value(self): return w_nil
+
+class IntegerStrategy(AbstractStrategy):
+ import_from_mixin(rs.SingleTypeStrategy)
+ contained_type = W_Integer
+ def wrap(self, value): return W_Integer(value)
+ def unwrap(self, value): return value.value
+ def default_value(self): return W_Integer(0)
+
+class IntegerOrNilStrategy(AbstractStrategy):
+ import_from_mixin(rs.TaggingStrategy)
+ contained_type = W_Integer
+ def wrap(self, value): return W_Integer(value)
+ def unwrap(self, value): return value.value
+ def default_value(self): return w_nil
+ def wrapped_tagged_value(self): return w_nil
+ def unwrapped_tagged_value(self): import sys; return sys.maxint
+
+@rs.strategy(generalize=[], singleton=False)
+class NonSingletonStrategy(GenericStrategy):
+ def __init__(self, factory, w_list=None, size=0):
+ super(NonSingletonStrategy, self).__init__(factory, w_list, size)
+ self.w_list = w_list
+ self.size = size
+
+class NonStrategy(NonSingletonStrategy):
+ pass
+
+factory = Factory(AbstractStrategy)
+
+def check_contents(list, expected):
+ assert list.size() == len(expected)
+ for i, val in enumerate(expected):
+ assert list.fetch(i) == val
+
+def teardown():
+ factory.clear_log()
+
+# === Test Initialization and fetch
+
+def test_setup():
+ pass
+
+def test_factory_setup():
+ expected_strategies = 7
+ assert len(factory.strategies) == expected_strategies
+ assert len(set(factory.strategies)) == len(factory.strategies)
+ for strategy in factory.strategies:
+ assert isinstance(factory.strategy_singleton_instance(strategy), strategy)
+
+def test_factory_setup_singleton_instances():
+ new_factory = Factory(AbstractStrategy)
+ s1 = factory.strategy_singleton_instance(GenericStrategy)
+ s2 = new_factory.strategy_singleton_instance(GenericStrategy)
+ assert s1 is not s2
+ assert s1.strategy_factory() is factory
+ assert s2.strategy_factory() is new_factory
+
+def test_metaclass():
+ assert NonStrategy._is_strategy == False
+ assert IntegerOrNilStrategy._is_strategy == True
+ assert IntegerOrNilStrategy._is_singleton == True
+ assert NonSingletonStrategy._is_singleton == False
+ assert NonStrategy._is_singleton == False
+ assert NonStrategy.get_storage is not NonSingletonStrategy.get_storage
+
+def test_singletons():
+ def do_test_singletons(cls, expected_true):
+ l1 = W_List(cls, 0)
+ l2 = W_List(cls, 0)
+ if expected_true:
+ assert l1.strategy is l2.strategy
+ else:
+ assert l1.strategy is not l2.strategy
+ do_test_singletons(EmptyStrategy, True)
+ do_test_singletons(NonSingletonStrategy, False)
+ do_test_singletons(NonStrategy, False)
+ do_test_singletons(GenericStrategy, True)
+
+def do_test_initialization(cls, default_value=w_nil, is_safe=True):
+ size = 10
+ l = W_List(cls, size)
+ s = l.strategy
+ assert s.size(l) == size
+ assert s.fetch(l,0) == default_value
+ assert s.fetch(l,size/2) == default_value
+ assert s.fetch(l,size-1) == default_value
+ py.test.raises(IndexError, s.fetch, l, size)
+ py.test.raises(IndexError, s.fetch, l, size+1)
+ py.test.raises(IndexError, s.fetch, l, size+5)
+ if is_safe:
+ py.test.raises(IndexError, s.fetch, l, -1)
+ else:
+ assert s.fetch(l, -1) == s.fetch(l, size - 1)
+
+def test_init_Empty():
+ l = W_List(EmptyStrategy, 0)
+ s = l.strategy
+ assert s.size(l) == 0
+ py.test.raises(IndexError, s.fetch, l, 0)
+ py.test.raises(IndexError, s.fetch, l, 10)
+
+def test_init_Nil():
+ do_test_initialization(NilStrategy)
+
+def test_init_Generic():
+ do_test_initialization(GenericStrategy, is_safe=False)
+
+def test_init_WeakGeneric():
+ do_test_initialization(WeakGenericStrategy)
+
+def test_init_Integer():
+ do_test_initialization(IntegerStrategy, default_value=W_Integer(0))
+
+def test_init_IntegerOrNil():
+ do_test_initialization(IntegerOrNilStrategy)
+
+# === Test Simple store
+
+def do_test_store(cls, stored_value=W_Object(), is_safe=True, is_varsize=False):
+ size = 10
+ l = W_List(cls, size)
+ s = l.strategy
+ def store_test(index):
+ s.store(l, index, stored_value)
+ assert s.fetch(l, index) == stored_value
+ store_test(0)
+ store_test(size/2)
+ store_test(size-1)
+ if not is_varsize:
+ py.test.raises(IndexError, s.store, l, size, stored_value)
+ py.test.raises(IndexError, s.store, l, size+1, stored_value)
+ py.test.raises(IndexError, s.store, l, size+5, stored_value)
+ if is_safe:
+ py.test.raises(IndexError, s.store, l, -1, stored_value)
+ else:
+ store_test(-1)
+
+def test_store_Nil():
+ do_test_store(NilStrategy, stored_value=w_nil)
+
+def test_store_Generic():
+ do_test_store(GenericStrategy, is_safe=False)
+
+def test_store_WeakGeneric():
+ do_test_store(WeakGenericStrategy, stored_value=w_nil)
+
+def test_store_Integer():
+ do_test_store(IntegerStrategy, stored_value=W_Integer(100))
+
+def test_store_IntegerOrNil():
+ do_test_store(IntegerOrNilStrategy, stored_value=W_Integer(100))
+ do_test_store(IntegerOrNilStrategy, stored_value=w_nil)
+
+# === Test Insert
+
+def do_test_insert(cls, values):
+ l = W_List(cls, 0)
+ assert len(values) >= 6
+ values1 = values[0:2]
+ values2 = values[2:4]
+ values3 = values[4:6]
+ l.insert(0, values1+values3)
+ check_contents(l, values1+values3)
+ l.insert(2, values2)
+ check_contents(l, values)
+
+def test_insert_Nil():
+ do_test_insert(NilStrategy, [w_nil]*6)
+
+def test_insert_Generic():
+ do_test_insert(GenericStrategy, [W_Object() for _ in range(6)])
+
+def test_insert_WeakGeneric():
+ do_test_insert(WeakGenericStrategy, [W_Object() for _ in range(6)])
+
+def test_insert_Integer():
+ do_test_insert(IntegerStrategy, [W_Integer(x) for x in range(6)])
+
+def test_insert_IntegerOrNil():
+ do_test_insert(IntegerOrNilStrategy, [w_nil]+[W_Integer(x) for x in range(4)]+[w_nil])
+ do_test_insert(IntegerOrNilStrategy, [w_nil]*6)
+
+# === Test Delete
+
+def do_test_delete(cls, values):
+ assert len(values) >= 6
+ l = W_List(cls, len(values), values)
+ l.delete(2, 4)
+ del values[2: 4]
+ check_contents(l, values)
+ l.delete(1, 2)
+ del values[1: 2]
+ check_contents(l, values)
+
+def test_delete_Nil():
+ do_test_delete(NilStrategy, [w_nil]*6)
+
+def test_delete_Generic():
+ do_test_delete(GenericStrategy, [W_Object() for _ in range(6)])
+
+def test_delete_WeakGeneric():
+ do_test_delete(WeakGenericStrategy, [W_Object() for _ in range(6)])
+
+def test_delete_Integer():
+ do_test_delete(IntegerStrategy, [W_Integer(x) for x in range(6)])
+
+def test_delete_IntegerOrNil():
+ do_test_delete(IntegerOrNilStrategy, [w_nil]+[W_Integer(x) for x in range(4)]+[w_nil])
+ do_test_delete(IntegerOrNilStrategy, [w_nil]*6)
+
+# === Test Transitions
+
+def test_CheckCanHandle():
+ def assert_handles(cls, good, bad):
+ s = cls(0)
+ for val in good:
+ assert s.check_can_handle(val)
+ for val in bad:
+ assert not s.check_can_handle(val)
+ obj = W_Object()
+ i = W_Integer(0)
+ nil = w_nil
+
+ assert_handles(EmptyStrategy, [], [nil, obj, i])
+ assert_handles(NilStrategy, [nil], [obj, i])
+ assert_handles(GenericStrategy, [nil, obj, i], [])
+ assert_handles(WeakGenericStrategy, [nil, obj, i], [])
+ assert_handles(IntegerStrategy, [i], [nil, obj])
+ assert_handles(IntegerOrNilStrategy, [nil, i], [obj])
+
+def do_test_transition(OldStrategy, value, NewStrategy, initial_size=10):
+ w = W_List(OldStrategy, initial_size)
+ old = w.strategy
+ w.store(0, value)
+ assert isinstance(w.strategy, NewStrategy)
+ assert factory.switching_log == [(None, old), (old, w.strategy)]
+
+def test_AllNil_to_Generic():
+ do_test_transition(NilStrategy, W_Object(), GenericStrategy)
+
+def test_AllNil_to_IntegerOrNil():
+ do_test_transition(NilStrategy, W_Integer(0), IntegerOrNilStrategy)
+
+def test_IntegerOrNil_to_Generic():
+ do_test_transition(IntegerOrNilStrategy, W_Object(), GenericStrategy)
+
+def test_Integer_to_IntegerOrNil():
+ do_test_transition(IntegerStrategy, w_nil, IntegerOrNilStrategy)
+
+def test_Integer_Generic():
+ do_test_transition(IntegerStrategy, W_Object(), GenericStrategy)
+
+def test_TaggingValue_not_storable():
+ tag = IntegerOrNilStrategy(10).unwrapped_tagged_value() # sys.maxint
+ do_test_transition(IntegerOrNilStrategy, W_Integer(tag), GenericStrategy)
+
+# TODO - Test transition from varsize back to Empty
+
+# === Test helper methods
+
+def generic_list():
+ values = [W_Object() for _ in range(6)]
+ return W_List(GenericStrategy, len(values), values), values
+
+def test_slice():
+ l, v = generic_list()
+ assert l.slice(2, 4) == v[2:4]
+
+def test_fetch_all():
+ l, v = generic_list()
+ assert l.fetch_all() == v
+
+def test_append():
+ l, v = generic_list()
+ o1 = W_Object()
+ o2 = W_Object()
+ l.append([o1])
+ assert l.fetch_all() == v + [o1]
+ l.append([o1, o2])
+ assert l.fetch_all() == v + [o1, o1, o2]
+
+def test_pop():
+ l, v = generic_list()
+ o = l.pop(3)
+ del v[3]
+ assert l.fetch_all() == v
+ o = l.pop(3)
+ del v[3]
+ assert l.fetch_all() == v
+
+def test_store_all():
+ l, v = generic_list()
+ v2 = [W_Object() for _ in range(4) ]
+ v3 = [W_Object() for _ in range(l.size()) ]
+ assert v2 != v
+ assert v3 != v
+
+ l.store_all(v2)
+ assert l.fetch_all() == v2+v[4:]
+ l.store_all(v3)
+ assert l.fetch_all() == v3
+
+ py.test.raises(IndexError, l.store_all, [W_Object() for _ in range(8) ])
+
+# === Test Weak Strategy
+# TODO
+
+# === Other tests
+
+def test_optimized_strategy_switch(monkeypatch):
+ l = W_List(NilStrategy, 5)
+ s = l.strategy
+ s.copied = 0
+ def convert_storage_from_default(self, w_self, other):
+ assert False, "The default convert_storage_from() should not be called!"
+ def convert_storage_from_special(self, w_self, other):
+ s.copied += 1
+
+ monkeypatch.setattr(AbstractStrategy, "convert_storage_from_NilStrategy", convert_storage_from_special)
+ monkeypatch.setattr(AbstractStrategy, "convert_storage_from", convert_storage_from_default)
+ try:
+ factory.switch_strategy(l, IntegerOrNilStrategy)
+ finally:
+ monkeypatch.undo()
+ assert s.copied == 1, "Optimized switching routine not called exactly one time."