[Python-checkins] r56518 - in doctools/trunk: HACKING TODO sphinx/__init__.py sphinx/builder.py sphinx/console.py sphinx/json.py sphinx/search.py sphinx/smartypants.py sphinx/stemmer.py sphinx/util sphinx/util.py sphinx/util/__init__.py sphinx/util/console.py sphinx/util/json.py sphinx/util/smartypants.py sphinx/util/stemmer.py sphinx/web/wsgiutil.py sphinx/writer.py

georg.brandl python-checkins at python.org
Tue Jul 24 12:25:54 CEST 2007


Author: georg.brandl
Date: Tue Jul 24 12:25:53 2007
New Revision: 56518

Added:
   doctools/trunk/HACKING
   doctools/trunk/sphinx/util/
   doctools/trunk/sphinx/util/__init__.py
      - copied unchanged from r56508, doctools/trunk/sphinx/util.py
   doctools/trunk/sphinx/util/console.py
      - copied, changed from r56508, doctools/trunk/sphinx/console.py
   doctools/trunk/sphinx/util/json.py
      - copied, changed from r56508, doctools/trunk/sphinx/json.py
   doctools/trunk/sphinx/util/smartypants.py
      - copied unchanged from r56508, doctools/trunk/sphinx/smartypants.py
   doctools/trunk/sphinx/util/stemmer.py
      - copied, changed from r56508, doctools/trunk/sphinx/stemmer.py
Removed:
   doctools/trunk/sphinx/console.py
   doctools/trunk/sphinx/json.py
   doctools/trunk/sphinx/smartypants.py
   doctools/trunk/sphinx/stemmer.py
   doctools/trunk/sphinx/util.py
Modified:
   doctools/trunk/TODO
   doctools/trunk/sphinx/__init__.py
   doctools/trunk/sphinx/builder.py
   doctools/trunk/sphinx/search.py
   doctools/trunk/sphinx/web/wsgiutil.py
   doctools/trunk/sphinx/writer.py
Log:
Move utils to separate package, add coding document.


Added: doctools/trunk/HACKING
==============================================================================
--- (empty file)
+++ doctools/trunk/HACKING	Tue Jul 24 12:25:53 2007
@@ -0,0 +1,140 @@
+.. -*- mode: rst -*-
+
+===============
+Coding overview
+===============
+
+This document tries to give you a cursory overview of the doctools code.
+
+
+Dependencies
+------------
+
+The converter doesn't have any dependencies except Python 2.5.
+
+Sphinx needs Python 2.5, Docutils 0.4 (not SVN, because of API changes), Jinja
+>= 1.1 (which is at the moment included as an SVN external) and Pygments >= 0.8
+(which is optional and can be installed from the cheese shop).
+
+
+The converter
+-------------
+
+There's not too much to say about the converter.  It is more or less finished,
+and since it only has to work with the body of documentation found in the
+Python core, it doesn't have to be fully general.
+
+(If other projects using the LaTeX documentation toolchain want to convert their
+docs to the new format, the converter will probably have to be amended.)
+
+In ``restwriter.py``, there's some commentary about the inner workings of the
+converter as far as a single file is concerned.
+
+The ``filenamemap.py`` file tells the converter how to rearrange the converted
+files in the reST source directories.  There, for example, the tutorial is split
+up into several files, and old or unusable files are flagged as not convertible.
+Also, non-LaTeX files, such as code include files, are listed to be copied into
+corresponding directories.
+
+The directory ``newfiles`` contains a bunch of files that didn't exist in the
+old distribution, such as the documentation of Sphinx markup, that will be
+copied to the reST directory too.
+
+
+Sphinx
+------
+
+Sphinx consists of two parts:
+
+* The builder takes the reST sources and converts them into an output format.
+  (Presently, HTML, HTML Help or webapp-usable pickles.)
+
+* The web application takes the webapp-usable pickles, which mainly contain the
+  HTML bodies converted from reST and some additional information, and turns them
+  into a WSGI application, complete with commenting, navigation etc.
+  (The subpackage ``web`` is responsible for this.)
+
+An overview of the source files:
+
+addnodes.py
+  Contains docutils node classes that are not part of standard docutils.  These
+  node classes must be handled by every docutils writer that gets one of our
+  nodetrees.
+
+  (Docutils parses a reST document into a tree of "nodes".  This nodetree can
+  then be converted into an internal representation, XML, or any other format
+  for which a Writer exists.)
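As a minimal illustration of that parse step (assuming Docutils is installed, which Sphinx requires anyway), a sketch:

```python
from docutils.core import publish_doctree

# Parse a small reST snippet into a Docutils node tree.
tree = publish_doctree("Hello *world*")

# pformat() renders the node tree as indented pseudo-XML,
# showing e.g. the <emphasis> node produced by *world*.
print(tree.pformat())
```

Writers (HTML, LaTeX, ...) then walk such a tree; Sphinx's custom nodes from ``addnodes.py`` appear in it alongside the standard ones.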
+
+builder.py
+  Contains the Builder classes, which are responsible for the process of building
+  the output files from docutils node trees.
+
+  The builder is called by ``sphinx-build.py``.
+
+directives.py
+  Directive functions that transform our custom directives (like ``.. function::``)
+  into doctree nodes.
+
+environment.py
+  The "build environment", a class that holds metadata about all doctrees, and is
+  responsible for building them out of reST source files.
+
+  The environment is stored, in pickled form, in the output directory in order
+  to enable incremental builds when only a few source files change, which is
+  usually the case.
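The mechanics are plain pickling; a hand-wavy sketch with a dict standing in for the real ``BuildEnvironment`` object (the field names here are illustrative, not the actual attributes):

```python
import os
import pickle
import tempfile

# Stand-in for the build environment: per-document metadata.
# (The real BuildEnvironment also holds doctrees and cross-reference data.)
env = {
    'all_docs': {'intro': 1185272753.0},   # docname -> last read time
    'titles':   {'intro': 'Introduction'},
}

# Store the environment in the output directory after a build ...
outdir = tempfile.mkdtemp()
envfile = os.path.join(outdir, 'environment.pickle')
with open(envfile, 'wb') as f:
    pickle.dump(env, f, pickle.HIGHEST_PROTOCOL)

# ... and load it back at the start of the next (incremental) build.
with open(envfile, 'rb') as f:
    restored = pickle.load(f)
print(restored['titles']['intro'])
```

Only documents whose source changed since the recorded read time need to be re-parsed; everything else is served from the restored environment.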
+
+highlighting.py
+  Glue code for the Pygments highlighting library.  Falls back to no
+  highlighting at all if Pygments is not installed.  A stripped-down version of
+  the Pygments Python lexer and HTML formatter could probably be included.
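The glue essentially amounts to calling Pygments' ``highlight()`` with a lexer and an HTML formatter; roughly (assuming Pygments is installed):

```python
from pygments import highlight
from pygments.lexers import PythonLexer
from pygments.formatters import HtmlFormatter

# Render a code snippet to an HTML block wrapped in
# <div class="highlight">; the matching CSS comes from
# HtmlFormatter().get_style_defs().
html = highlight('print("hello")', PythonLexer(), HtmlFormatter())
print(html)
```

When the import of ``pygments`` fails, the builder simply emits the source as a plain literal block instead.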
+
+htmlhelp.py
+  HTML help builder helper methods.
+
+_jinja.py, jinja
+  The Jinja templating engine, used for all HTML-related builders.
+
+refcounting.py
+  Helper to keep track of reference count data for the C API reference,
+  which is maintained as a separate file.
+
+roles.py
+  Role functions that transform our custom roles (like ``:meth:``) into doctree
+  nodes.
+
+search.py
+  Helper to create a search index for the offline search.
+
+style
+  Directory for all static files for HTML-related builders.
+
+templates
+  Directory for Jinja templates, at the moment only for HTML.
+
+util
+  General utilities.
+
+writer.py
+  The docutils HTML writer subclass which understands our additional nodes.
+
+
+Code style
+----------
+
+PEP 8 (http://www.python.org/dev/peps/pep-0008) must be observed, with the
+following exceptions:
+
+* Line length is limited to 90 characters.
+* Relative imports are used, with the new-in-2.5 'leading dot' syntax.
+
+The file encoding is UTF-8; this should be indicated in the file's first line
+with ::
+
+   # -*- coding: utf-8 -*-
+
+
+Python 3.0 compatibility
+------------------------
+
+As it will be used for Python 3.0 too, the toolset should be kept in a state
+where it is fully usable Python 3 code after one run of the ``2to3`` utility.

Modified: doctools/trunk/TODO
==============================================================================
--- doctools/trunk/TODO	(original)
+++ doctools/trunk/TODO	Tue Jul 24 12:25:53 2007
@@ -2,6 +2,7 @@
 ===========
 
 - discuss and debug comments system
+- navigation links at the bottom too
 - write new Makefile, handle automatic version info and checkout
 - write a "printable" builder (export to latex, most probably)
 - discuss the default role

Modified: doctools/trunk/sphinx/__init__.py
==============================================================================
--- doctools/trunk/sphinx/__init__.py	(original)
+++ doctools/trunk/sphinx/__init__.py	Tue Jul 24 12:25:53 2007
@@ -14,7 +14,7 @@
 from os import path
 
 from .builder import builders
-from .console import nocolor
+from .util.console import nocolor
 
 __version__ = '$Revision: 5369 $'
 
@@ -99,6 +99,10 @@
         elif opt == '-N':
             nocolor()
 
+    if sys.platform == 'win32':
+        # Windows' cmd box doesn't understand ANSI sequences
+        nocolor()
+
     if builder is None:
         print 'No builder selected, using default: html'
         builder = 'html'

Modified: doctools/trunk/sphinx/builder.py
==============================================================================
--- doctools/trunk/sphinx/builder.py	(original)
+++ doctools/trunk/sphinx/builder.py	Tue Jul 24 12:25:53 2007
@@ -29,7 +29,7 @@
 from .util import (get_matching_files, attrdict, status_iterator,
                    ensuredir, get_category, relative_uri)
 from .writer import HTMLWriter
-from .console import bold, purple, green
+from .util.console import bold, purple, green
 from .htmlhelp import build_hhx
 from .environment import BuildEnvironment
 from .highlighting import pygments, get_stylesheet

Deleted: /doctools/trunk/sphinx/console.py
==============================================================================
--- /doctools/trunk/sphinx/console.py	Tue Jul 24 12:25:53 2007
+++ (empty file)
@@ -1,53 +0,0 @@
-# -*- coding: utf-8 -*-
-"""
-    sphinx.console
-    ~~~~~~~~~~~~~~
-
-    Format colored console output.
-
-    :copyright: 2007 by Georg Brandl.
-    :license: Python license.
-"""
-
-codes = {}
-
-def nocolor():
-    codes.clear()
-
-def colorize(name, text):
-    return codes.get(name, '') + text + codes.get('reset', '')
-
-def create_color_func(name):
-    def inner(text):
-        return colorize(name, text)
-    globals()[name] = inner
-
-_attrs = {
-    'reset':     '39;49;00m',
-    'bold':      '01m',
-    'faint':     '02m',
-    'standout':  '03m',
-    'underline': '04m',
-    'blink':     '05m',
-}
-
-for name, value in _attrs.items():
-    codes[name] = '\x1b[' + value
-
-_colors = [
-    ('black',     'darkgray'),
-    ('darkred',   'red'),
-    ('darkgreen', 'green'),
-    ('brown',     'yellow'),
-    ('darkblue',  'blue'),
-    ('purple',    'fuchsia'),
-    ('turquoise', 'teal'),
-    ('lightgray', 'white'),
-]
-
-for i, (dark, light) in enumerate(_colors):
-    codes[dark] = '\x1b[%im' % (i+30)
-    codes[light] = '\x1b[%i;01m' % (i+30)
-
-for name in codes:
-    create_color_func(name)

Deleted: /doctools/trunk/sphinx/json.py
==============================================================================
--- /doctools/trunk/sphinx/json.py	Tue Jul 24 12:25:53 2007
+++ (empty file)
@@ -1,72 +0,0 @@
-# -*- coding: utf-8 -*-
-"""
-    sphinx.json
-    ~~~~~~~~~~~
-
-    Minimal JSON module that generates small dumps.
-
-    This is not fully JSON compliant but enough for the searchindex.
-    And the generated files are smaller than the simplejson ones.
-
-    Uses the basestring encode function from simplejson.
-
-    :copyright: 2007 by Armin Ronacher, Bob Ippolito.
-    :license: Python license.
-"""
-
-import re
-
-ESCAPE = re.compile(r'[\x00-\x19\\"\b\f\n\r\t]')
-ESCAPE_ASCII = re.compile(r'([\\"]|[^\ -~])')
-ESCAPE_DICT = {
-    '\\': '\\\\',
-    '"': '\\"',
-    '\b': '\\b',
-    '\f': '\\f',
-    '\n': '\\n',
-    '\r': '\\r',
-    '\t': '\\t',
-}
-for i in range(0x20):
-    ESCAPE_DICT.setdefault(chr(i), '\\u%04x' % (i,))
-
-
-def encode_basestring_ascii(s):
-    def replace(match):
-        s = match.group(0)
-        try:
-            return ESCAPE_DICT[s]
-        except KeyError:
-            n = ord(s)
-            if n < 0x10000:
-                return '\\u%04x' % (n,)
-            else:
-                # surrogate pair
-                n -= 0x10000
-                s1 = 0xd800 | ((n >> 10) & 0x3ff)
-                s2 = 0xdc00 | (n & 0x3ff)
-                return '\\u%04x\\u%04x' % (s1, s2)
-    return '"' + str(ESCAPE_ASCII.sub(replace, s)) + '"'
-
-
-def dump_json(obj, key=False):
-    if key:
-        if not isinstance(obj, basestring):
-            obj = str(obj)
-        return encode_basestring_ascii(obj)
-    if obj is None:
-        return 'null'
-    elif obj is True or obj is False:
-        return obj and 'true' or 'false'
-    elif isinstance(obj, (int, long, float)):
-        return str(obj)
-    elif isinstance(obj, dict):
-        return '{%s}' % ','.join('%s:%s' % (
-            dump_json(key, True),
-            dump_json(value)
-        ) for key, value in obj.iteritems())
-    elif isinstance(obj, (tuple, list, set)):
-        return '[%s]' % ','.join(dump_json(x) for x in obj)
-    elif isinstance(obj, basestring):
-        return encode_basestring_ascii(obj)
-    raise TypeError(type(obj))
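The module survives as ``sphinx/util/json.py``; a compressed Python 3 sketch of the same approach (escaping is simplified here — the control-character and surrogate-pair handling of the original is omitted):

```python
def dump_json(obj):
    # Minimal encoder, not fully JSON-compliant -- just enough
    # for a search index, producing small dumps.
    if obj is None:
        return 'null'
    if obj is True or obj is False:   # check bools before ints: True is an int
        return 'true' if obj else 'false'
    if isinstance(obj, (int, float)):
        return str(obj)
    if isinstance(obj, str):
        return '"%s"' % obj.replace('\\', '\\\\').replace('"', '\\"')
    if isinstance(obj, dict):
        # JSON object keys must be strings.
        return '{%s}' % ','.join(
            '%s:%s' % (dump_json(str(k)), dump_json(v))
            for k, v in obj.items())
    if isinstance(obj, (tuple, list, set)):
        return '[%s]' % ','.join(dump_json(x) for x in obj)
    raise TypeError(type(obj))

print(dump_json({'terms': ['stem', 1], 'ok': True}))
```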

Modified: doctools/trunk/sphinx/search.py
==============================================================================
--- doctools/trunk/sphinx/search.py	(original)
+++ doctools/trunk/sphinx/search.py	Tue Jul 24 12:25:53 2007
@@ -13,8 +13,8 @@
 
 from collections import defaultdict
 from docutils.nodes import Text, NodeVisitor
-from .stemmer import PorterStemmer
-from .json import dump_json
+from .util.stemmer import PorterStemmer
+from .util.json import dump_json
 
 
 word_re = re.compile(r'\w+(?u)')

Deleted: /doctools/trunk/sphinx/smartypants.py
==============================================================================
--- /doctools/trunk/sphinx/smartypants.py	Tue Jul 24 12:25:53 2007
+++ (empty file)
@@ -1,263 +0,0 @@
-r"""
-This is based on SmartyPants.py by `Chad Miller`_.
-
-Copyright and License
-=====================
-
-SmartyPants_ license::
-
-    Copyright (c) 2003 John Gruber
-    (http://daringfireball.net/)
-    All rights reserved.
-
-    Redistribution and use in source and binary forms, with or without
-    modification, are permitted provided that the following conditions are
-    met:
-
-    *   Redistributions of source code must retain the above copyright
-        notice, this list of conditions and the following disclaimer.
-
-    *   Redistributions in binary form must reproduce the above copyright
-        notice, this list of conditions and the following disclaimer in
-        the documentation and/or other materials provided with the
-        distribution.
-
-    *   Neither the name "SmartyPants" nor the names of its contributors
-        may be used to endorse or promote products derived from this
-        software without specific prior written permission.
-
-    This software is provided by the copyright holders and contributors "as
-    is" and any express or implied warranties, including, but not limited
-    to, the implied warranties of merchantability and fitness for a
-    particular purpose are disclaimed. In no event shall the copyright
-    owner or contributors be liable for any direct, indirect, incidental,
-    special, exemplary, or consequential damages (including, but not
-    limited to, procurement of substitute goods or services; loss of use,
-    data, or profits; or business interruption) however caused and on any
-    theory of liability, whether in contract, strict liability, or tort
-    (including negligence or otherwise) arising in any way out of the use
-    of this software, even if advised of the possibility of such damage.
-
-
-smartypants.py license::
-
-    smartypants.py is a derivative work of SmartyPants.
-
-    Redistribution and use in source and binary forms, with or without
-    modification, are permitted provided that the following conditions are
-    met:
-
-    *   Redistributions of source code must retain the above copyright
-        notice, this list of conditions and the following disclaimer.
-
-    *   Redistributions in binary form must reproduce the above copyright
-        notice, this list of conditions and the following disclaimer in
-        the documentation and/or other materials provided with the
-        distribution.
-
-    This software is provided by the copyright holders and contributors "as
-    is" and any express or implied warranties, including, but not limited
-    to, the implied warranties of merchantability and fitness for a
-    particular purpose are disclaimed. In no event shall the copyright
-    owner or contributors be liable for any direct, indirect, incidental,
-    special, exemplary, or consequential damages (including, but not
-    limited to, procurement of substitute goods or services; loss of use,
-    data, or profits; or business interruption) however caused and on any
-    theory of liability, whether in contract, strict liability, or tort
-    (including negligence or otherwise) arising in any way out of the use
-    of this software, even if advised of the possibility of such damage.
-
-.. _Chad Miller: http://web.chad.org/
-"""
-
-import re
-
-
-def sphinx_smarty_pants(t):
-    t = t.replace('&quot;', '"')
-    t = educateDashesOldSchool(t)
-    t = educateQuotes(t)
-    t = t.replace('"', '&quot;')
-    return t
-
-# Constants for quote education.
-
-punct_class = r"""[!"#\$\%'()*+,-.\/:;<=>?\@\[\\\]\^_`{|}~]"""
-close_class = r"""[^\ \t\r\n\[\{\(\-]"""
-dec_dashes = r"""&#8211;|&#8212;"""
-
-# Special case if the very first character is a quote
-# followed by punctuation at a non-word-break. Close the quotes by brute force:
-single_quote_start_re = re.compile(r"""^'(?=%s\\B)""" % (punct_class,))
-double_quote_start_re = re.compile(r"""^"(?=%s\\B)""" % (punct_class,))
-
-# Special case for double sets of quotes, e.g.:
-#   <p>He said, "'Quoted' words in a larger quote."</p>
-double_quote_sets_re = re.compile(r""""'(?=\w)""")
-single_quote_sets_re = re.compile(r"""'"(?=\w)""")
-
-# Special case for decade abbreviations (the '80s):
-decade_abbr_re = re.compile(r"""\b'(?=\d{2}s)""")
-
-# Get most opening double quotes:
-opening_double_quotes_regex = re.compile(r"""
-                (
-                        \s          |   # a whitespace char, or
-                        &nbsp;      |   # a non-breaking space entity, or
-                        --          |   # dashes, or
-                        &[mn]dash;  |   # named dash entities
-                        %s          |   # or decimal entities
-                        &\#x201[34];    # or hex
-                )
-                "                 # the quote
-                (?=\w)            # followed by a word character
-                """ % (dec_dashes,), re.VERBOSE)
-
-# Double closing quotes:
-closing_double_quotes_regex = re.compile(r"""
-                #(%s)?   # character that indicates the quote should be closing
-                "
-                (?=\s)
-                """ % (close_class,), re.VERBOSE)
-
-closing_double_quotes_regex_2 = re.compile(r"""
-                (%s)   # character that indicates the quote should be closing
-                "
-                """ % (close_class,), re.VERBOSE)
-
-# Get most opening single quotes:
-opening_single_quotes_regex = re.compile(r"""
-                (
-                        \s          |   # a whitespace char, or
-                        &nbsp;      |   # a non-breaking space entity, or
-                        --          |   # dashes, or
-                        &[mn]dash;  |   # named dash entities
-                        %s          |   # or decimal entities
-                        &\#x201[34];    # or hex
-                )
-                '                 # the quote
-                (?=\w)            # followed by a word character
-                """ % (dec_dashes,), re.VERBOSE)
-
-closing_single_quotes_regex = re.compile(r"""
-                (%s)
-                '
-                (?!\s | s\b | \d)
-                """ % (close_class,), re.VERBOSE)
-
-closing_single_quotes_regex_2 = re.compile(r"""
-                (%s)
-                '
-                (\s | s\b)
-                """ % (close_class,), re.VERBOSE)
-
-def educateQuotes(str):
-    """
-    Parameter:  String.
-
-    Returns:    The string, with "educated" curly quote HTML entities.
-
-    Example input:  "Isn't this fun?"
-    Example output: &#8220;Isn&#8217;t this fun?&#8221;
-    """
-
-    # Special case if the very first character is a quote
-    # followed by punctuation at a non-word-break. Close the quotes by brute force:
-    str = single_quote_start_re.sub("&#8217;", str)
-    str = double_quote_start_re.sub("&#8221;", str)
-
-    # Special case for double sets of quotes, e.g.:
-    #   <p>He said, "'Quoted' words in a larger quote."</p>
-    str = double_quote_sets_re.sub("&#8220;&#8216;", str)
-    str = single_quote_sets_re.sub("&#8216;&#8220;", str)
-
-    # Special case for decade abbreviations (the '80s):
-    str = decade_abbr_re.sub("&#8217;", str)
-
-    str = opening_single_quotes_regex.sub(r"\1&#8216;", str)
-    str = closing_single_quotes_regex.sub(r"\1&#8217;", str)
-    str = closing_single_quotes_regex_2.sub(r"\1&#8217;\2", str)
-
-    # Any remaining single quotes should be opening ones:
-    str = str.replace("'", "&#8216;")
-
-    str = opening_double_quotes_regex.sub(r"\1&#8220;", str)
-    str = closing_double_quotes_regex.sub(r"&#8221;", str)
-    str = closing_double_quotes_regex_2.sub(r"\1&#8221;", str)
-
-    # Any remaining quotes should be opening ones.
-    str = str.replace('"', "&#8220;")
-
-    return str
-
-
-def educateBackticks(str):
-    """
-    Parameter:  String.
-    Returns:    The string, with ``backticks'' -style double quotes
-        translated into HTML curly quote entities.
-    Example input:  ``Isn't this fun?''
-    Example output: &#8220;Isn't this fun?&#8221;
-    """
-    return str.replace("``", "&#8220;").replace("''", "&#8221;")
-
-
-def educateSingleBackticks(str):
-    """
-    Parameter:  String.
-    Returns:    The string, with `backticks' -style single quotes
-        translated into HTML curly quote entities.
-
-    Example input:  `Isn't this fun?'
-    Example output: &#8216;Isn&#8217;t this fun?&#8217;
-    """
-    return str.replace('`', "&#8216;").replace("'", "&#8217;")
-
-
-def educateDashesOldSchool(str):
-    """
-    Parameter:  String.
-
-    Returns:    The string, with each instance of "--" translated to
-        an en-dash HTML entity, and each "---" translated to
-        an em-dash HTML entity.
-    """
-    return str.replace('---', "&#8212;").replace('--', "&#8211;")
-
-
-def educateDashesOldSchoolInverted(str):
-    """
-    Parameter:  String.
-
-    Returns:    The string, with each instance of "--" translated to
-        an em-dash HTML entity, and each "---" translated to
-        an en-dash HTML entity. Two reasons why: First, unlike the
-        en- and em-dash syntax supported by
-        EducateDashesOldSchool(), it's compatible with existing
-        entries written before SmartyPants 1.1, back when "--" was
-        only used for em-dashes.  Second, em-dashes are more
-        common than en-dashes, and so it sort of makes sense that
-        the shortcut should be shorter to type. (Thanks to Aaron
-        Swartz for the idea.)
-    """
-    return str.replace('---', "&#8211;").replace('--', "&#8212;")
-
-
-
-def educateEllipses(str):
-    """
-    Parameter:  String.
-    Returns:    The string, with each instance of "..." translated to
-        an ellipsis HTML entity.
-
-    Example input:  Huh...?
-    Example output: Huh&#8230;?
-    """
-    return str.replace('...', "&#8230;").replace('. . .', "&#8230;")
-
-
-__author__ = "Chad Miller <smartypantspy at chad.org>"
-__version__ = "1.5_1.5: Sat, 13 Aug 2005 15:50:24 -0400"
-__url__ = "http://wiki.chad.org/SmartyPantsPy"
-__description__ = \
-    "Smart-quotes, smart-ellipses, and smart-dashes for weblog entries in pyblosxom"
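The dash and ellipsis educators from the module above (which moved unchanged to ``sphinx/util/smartypants.py``) are simple enough to restate as a Python 3 sketch:

```python
def educate_dashes_old_school(s):
    # Order matters: '---' must be replaced before '--',
    # or the em-dash would be eaten as an en-dash plus a stray '-'.
    return s.replace('---', '&#8212;').replace('--', '&#8211;')

def educate_ellipses(s):
    # Both the compact and the spaced-out form map to one entity.
    return s.replace('...', '&#8230;').replace('. . .', '&#8230;')

print(educate_dashes_old_school('pp. 130--137 --- roughly'))
print(educate_ellipses('Huh...?'))
```

The quote education is the hard part, hence the battery of regexes above; dashes and ellipses are pure string replacement.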

Deleted: /doctools/trunk/sphinx/stemmer.py
==============================================================================
--- /doctools/trunk/sphinx/stemmer.py	Tue Jul 24 12:25:53 2007
+++ (empty file)
@@ -1,344 +0,0 @@
-#!/usr/bin/env python
-# -*- coding: utf-8 -*-
-"""
-    sphinx.stemmer
-    ~~~~~~~~~~~~~~
-
-    Porter Stemming Algorithm
-
-    This is the Porter stemming algorithm, ported to Python from the
-    version coded up in ANSI C by the author. It may be be regarded
-    as canonical, in that it follows the algorithm presented in
-
-    Porter, 1980, An algorithm for suffix stripping, Program, Vol. 14,
-    no. 3, pp 130-137,
-
-    only differing from it at the points maked --DEPARTURE-- below.
-
-    See also http://www.tartarus.org/~martin/PorterStemmer
-
-    The algorithm as described in the paper could be exactly replicated
-    by adjusting the points of DEPARTURE, but this is barely necessary,
-    because (a) the points of DEPARTURE are definitely improvements, and
-    (b) no encoding of the Porter stemmer I have seen is anything like
-    as exact as this version, even with the points of DEPARTURE!
-
-    Release 1: January 2001
-
-    :copyright: 2001 by Vivake Gupta <v at nano.com>.
-    :license: Public Domain (?).
-"""
-
-class PorterStemmer(object):
-
-    def __init__(self):
-        """The main part of the stemming algorithm starts here.
-        b is a buffer holding a word to be stemmed. The letters are in b[k0],
-        b[k0+1] ... ending at b[k]. In fact k0 = 0 in this demo program. k is
-        readjusted downwards as the stemming progresses. Zero termination is
-        not in fact used in the algorithm.
-
-        Note that only lower case sequences are stemmed. Forcing to lower case
-        should be done before stem(...) is called.
-        """
-
-        self.b = ""  # buffer for word to be stemmed
-        self.k = 0
-        self.k0 = 0
-        self.j = 0   # j is a general offset into the string
-
-    def cons(self, i):
-        """cons(i) is TRUE <=> b[i] is a consonant."""
-        if self.b[i] == 'a' or self.b[i] == 'e' or self.b[i] == 'i' \
-            or self.b[i] == 'o' or self.b[i] == 'u':
-            return 0
-        if self.b[i] == 'y':
-            if i == self.k0:
-                return 1
-            else:
-                return (not self.cons(i - 1))
-        return 1
-
-    def m(self):
-        """m() measures the number of consonant sequences between k0 and j.
-        if c is a consonant sequence and v a vowel sequence, and <..>
-        indicates arbitrary presence,
-
-           <c><v>       gives 0
-           <c>vc<v>     gives 1
-           <c>vcvc<v>   gives 2
-           <c>vcvcvc<v> gives 3
-           ....
-        """
-        n = 0
-        i = self.k0
-        while 1:
-            if i > self.j:
-                return n
-            if not self.cons(i):
-                break
-            i = i + 1
-        i = i + 1
-        while 1:
-            while 1:
-                if i > self.j:
-                    return n
-                if self.cons(i):
-                    break
-                i = i + 1
-            i = i + 1
-            n = n + 1
-            while 1:
-                if i > self.j:
-                    return n
-                if not self.cons(i):
-                    break
-                i = i + 1
-            i = i + 1
-
-    def vowelinstem(self):
-        """vowelinstem() is TRUE <=> k0,...j contains a vowel"""
-        for i in range(self.k0, self.j + 1):
-            if not self.cons(i):
-                return 1
-        return 0
-
-    def doublec(self, j):
-        """doublec(j) is TRUE <=> j,(j-1) contain a double consonant."""
-        if j < (self.k0 + 1):
-            return 0
-        if (self.b[j] != self.b[j-1]):
-            return 0
-        return self.cons(j)
-
-    def cvc(self, i):
-        """cvc(i) is TRUE <=> i-2,i-1,i has the form consonant - vowel - consonant
-        and also if the second c is not w,x or y. this is used when trying to
-        restore an e at the end of a short  e.g.
-
-           cav(e), lov(e), hop(e), crim(e), but
-           snow, box, tray.
-        """
-        if i < (self.k0 + 2) or not self.cons(i) or self.cons(i-1) or not self.cons(i-2):
-            return 0
-        ch = self.b[i]
-        if ch == 'w' or ch == 'x' or ch == 'y':
-            return 0
-        return 1
-
-    def ends(self, s):
-        """ends(s) is TRUE <=> k0,...k ends with the string s."""
-        length = len(s)
-        if s[length - 1] != self.b[self.k]: # tiny speed-up
-            return 0
-        if length > (self.k - self.k0 + 1):
-            return 0
-        if self.b[self.k-length+1:self.k+1] != s:
-            return 0
-        self.j = self.k - length
-        return 1
-
-    def setto(self, s):
-        """setto(s) sets (j+1),...k to the characters in the string s, readjusting k."""
-        length = len(s)
-        self.b = self.b[:self.j+1] + s + self.b[self.j+length+1:]
-        self.k = self.j + length
-
-    def r(self, s):
-        """r(s) is used further down."""
-        if self.m() > 0:
-            self.setto(s)
-
-    def step1ab(self):
-        """step1ab() gets rid of plurals and -ed or -ing. e.g.
-
-           caresses  ->  caress
-           ponies    ->  poni
-           ties      ->  ti
-           caress    ->  caress
-           cats      ->  cat
-
-           feed      ->  feed
-           agreed    ->  agree
-           disabled  ->  disable
-
-           matting   ->  mat
-           mating    ->  mate
-           meeting   ->  meet
-           milling   ->  mill
-           messing   ->  mess
-
-           meetings  ->  meet
-        """
-        if self.b[self.k] == 's':
-            if self.ends("sses"):
-                self.k = self.k - 2
-            elif self.ends("ies"):
-                self.setto("i")
-            elif self.b[self.k - 1] != 's':
-                self.k = self.k - 1
-        if self.ends("eed"):
-            if self.m() > 0:
-                self.k = self.k - 1
-        elif (self.ends("ed") or self.ends("ing")) and self.vowelinstem():
-            self.k = self.j
-            if self.ends("at"):   self.setto("ate")
-            elif self.ends("bl"): self.setto("ble")
-            elif self.ends("iz"): self.setto("ize")
-            elif self.doublec(self.k):
-                self.k = self.k - 1
-                ch = self.b[self.k]
-                if ch == 'l' or ch == 's' or ch == 'z':
-                    self.k = self.k + 1
-            elif (self.m() == 1 and self.cvc(self.k)):
-                self.setto("e")
-
-    def step1c(self):
-        """step1c() turns terminal y to i when there is another vowel in the stem."""
-        if (self.ends("y") and self.vowelinstem()):
-            self.b = self.b[:self.k] + 'i' + self.b[self.k+1:]
-
-    def step2(self):
-        """step2() maps double suffices to single ones.
-        so -ization ( = -ize plus -ation) maps to -ize etc. note that the
-        string before the suffix must give m() > 0.
-        """
-        if self.b[self.k - 1] == 'a':
-            if self.ends("ational"):   self.r("ate")
-            elif self.ends("tional"):  self.r("tion")
-        elif self.b[self.k - 1] == 'c':
-            if self.ends("enci"):      self.r("ence")
-            elif self.ends("anci"):    self.r("ance")
-        elif self.b[self.k - 1] == 'e':
-            if self.ends("izer"):      self.r("ize")
-        elif self.b[self.k - 1] == 'l':
-            if self.ends("bli"):       self.r("ble") # --DEPARTURE--
-            # To match the published algorithm, replace this phrase with
-            #   if self.ends("abli"):      self.r("able")
-            elif self.ends("alli"):    self.r("al")
-            elif self.ends("entli"):   self.r("ent")
-            elif self.ends("eli"):     self.r("e")
-            elif self.ends("ousli"):   self.r("ous")
-        elif self.b[self.k - 1] == 'o':
-            if self.ends("ization"):   self.r("ize")
-            elif self.ends("ation"):   self.r("ate")
-            elif self.ends("ator"):    self.r("ate")
-        elif self.b[self.k - 1] == 's':
-            if self.ends("alism"):     self.r("al")
-            elif self.ends("iveness"): self.r("ive")
-            elif self.ends("fulness"): self.r("ful")
-            elif self.ends("ousness"): self.r("ous")
-        elif self.b[self.k - 1] == 't':
-            if self.ends("aliti"):     self.r("al")
-            elif self.ends("iviti"):   self.r("ive")
-            elif self.ends("biliti"):  self.r("ble")
-        elif self.b[self.k - 1] == 'g': # --DEPARTURE--
-            if self.ends("logi"):      self.r("log")
-        # To match the published algorithm, delete this phrase
-
-    def step3(self):
-        """step3() dels with -ic-, -full, -ness etc. similar strategy to step2."""
-        if self.b[self.k] == 'e':
-            if self.ends("icate"):     self.r("ic")
-            elif self.ends("ative"):   self.r("")
-            elif self.ends("alize"):   self.r("al")
-        elif self.b[self.k] == 'i':
-            if self.ends("iciti"):     self.r("ic")
-        elif self.b[self.k] == 'l':
-            if self.ends("ical"):      self.r("ic")
-            elif self.ends("ful"):     self.r("")
-        elif self.b[self.k] == 's':
-            if self.ends("ness"):      self.r("")
-
-    def step4(self):
-        """step4() takes off -ant, -ence etc., in context <c>vcvc<v>."""
-        if self.b[self.k - 1] == 'a':
-            if self.ends("al"): pass
-            else: return
-        elif self.b[self.k - 1] == 'c':
-            if self.ends("ance"): pass
-            elif self.ends("ence"): pass
-            else: return
-        elif self.b[self.k - 1] == 'e':
-            if self.ends("er"): pass
-            else: return
-        elif self.b[self.k - 1] == 'i':
-            if self.ends("ic"): pass
-            else: return
-        elif self.b[self.k - 1] == 'l':
-            if self.ends("able"): pass
-            elif self.ends("ible"): pass
-            else: return
-        elif self.b[self.k - 1] == 'n':
-            if self.ends("ant"): pass
-            elif self.ends("ement"): pass
-            elif self.ends("ment"): pass
-            elif self.ends("ent"): pass
-            else: return
-        elif self.b[self.k - 1] == 'o':
-            if self.ends("ion") and (self.b[self.j] == 's' \
-                or self.b[self.j] == 't'): pass
-            elif self.ends("ou"): pass
-            # takes care of -ous
-            else: return
-        elif self.b[self.k - 1] == 's':
-            if self.ends("ism"): pass
-            else: return
-        elif self.b[self.k - 1] == 't':
-            if self.ends("ate"): pass
-            elif self.ends("iti"): pass
-            else: return
-        elif self.b[self.k - 1] == 'u':
-            if self.ends("ous"): pass
-            else: return
-        elif self.b[self.k - 1] == 'v':
-            if self.ends("ive"): pass
-            else: return
-        elif self.b[self.k - 1] == 'z':
-            if self.ends("ize"): pass
-            else: return
-        else:
-            return
-        if self.m() > 1:
-            self.k = self.j
-
-    def step5(self):
-        """step5() removes a final -e if m() > 1, and changes -ll to -l if
-        m() > 1.
-        """
-        self.j = self.k
-        if self.b[self.k] == 'e':
-            a = self.m()
-            if a > 1 or (a == 1 and not self.cvc(self.k-1)):
-                self.k = self.k - 1
-        if self.b[self.k] == 'l' and self.doublec(self.k) and self.m() > 1:
-            self.k = self.k -1
-
-    def stem(self, p, i, j):
-        """In stem(p,i,j), p is a char pointer, and the string to be stemmed
-        is from p[i] to p[j] inclusive. Typically i is zero and j is the
-        offset to the last character of a string, (p[j+1] == '\0'). The
-        stemmer adjusts the characters p[i] ... p[j] and returns the new
-        end-point of the string, k. Stemming never increases word length, so
-        i <= k <= j. To turn the stemmer into a module, declare 'stem' as
-        extern, and delete the remainder of this file.
-        """
-        # copy the parameters into statics
-        self.b = p
-        self.k = j
-        self.k0 = i
-        if self.k <= self.k0 + 1:
-            return self.b # --DEPARTURE--
-
-        # With this line, strings of length 1 or 2 don't go through the
-        # stemming process, although no mention is made of this in the
-        # published algorithm. Remove the line to match the published
-        # algorithm.
-
-        self.step1ab()
-        self.step1c()
-        self.step2()
-        self.step3()
-        self.step4()
-        self.step5()
-        return self.b[self.k0:self.k+1]
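The `stem()` docstring above keeps the C-style calling convention of the original algorithm, which is easy to get wrong: both `i` and `j` are *inclusive* indices. A minimal, self-contained illustration of that convention (no stemmer class is imported here; the word is made up for illustration):

```python
# The stem(p, i, j) convention documented above: i and j are inclusive
# indices into the word, so for a whole word j is len(word) - 1, not
# len(word).
word = "relational"
i, j = 0, len(word) - 1

# The slice covering positions i..j inclusive is word[i:j + 1].
assert word[i:j + 1] == word

# The stemmer then returns b[k0:k + 1] for the adjusted end-point k,
# with i <= k <= j, since stemming never lengthens a word.
print(j)  # 9
```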

Deleted: /doctools/trunk/sphinx/util.py
==============================================================================
--- /doctools/trunk/sphinx/util.py	Tue Jul 24 12:25:53 2007
+++ (empty file)
@@ -1,109 +0,0 @@
-# -*- coding: utf-8 -*-
-"""
-    sphinx.util
-    ~~~~~~~~~~~
-
-    Utility functions for Sphinx.
-
-    :copyright: 2007 by Georg Brandl.
-    :license: Python license.
-"""
-
-import os
-import sys
-import fnmatch
-from os import path
-
-
-def relative_uri(base, to):
-    """Return a relative URL from ``base`` to ``to``."""
-    b2 = base.split('/')
-    t2 = to.split('/')
-    # remove common segments
-    for x, y in zip(b2, t2):
-        if x != y:
-            break
-        b2.pop(0)
-        t2.pop(0)
-    return '../' * (len(b2)-1) + '/'.join(t2)
-
-
-def ensuredir(path):
-    """Ensure that a path exists."""
-    try:
-        os.makedirs(path)
-    except OSError, err:
-        if not err.errno == 17:
-            raise
-
-
-def status_iterator(iterable, colorfunc=lambda x: x, stream=sys.stdout):
-    """Print out each item before yielding it."""
-    for item in iterable:
-        print >>stream, colorfunc(item),
-        stream.flush()
-        yield item
-    print >>stream
-
-
-def get_matching_files(dirname, pattern, exclude=()):
-    """Get all files matching a pattern in a directory, recursively."""
-    # dirname is a normalized absolute path.
-    dirname = path.normpath(path.abspath(dirname))
-    dirlen = len(dirname) + 1    # exclude slash
-    for root, dirs, files in os.walk(dirname):
-        dirs.sort()
-        files.sort()
-        for sfile in files:
-            if not fnmatch.fnmatch(sfile, pattern):
-                continue
-            qualified_name = path.join(root[dirlen:], sfile)
-            if qualified_name in exclude:
-                continue
-            yield qualified_name
-
-
-def get_category(filename):
-    """Get the "category" part of a RST filename."""
-    parts = filename.split('/', 1)
-    if len(parts) < 2:
-        return
-    return parts[0]
-
-
-def shorten_result(text='', keywords=[], maxlen=240, fuzz=60):
-    if not text:
-        text = ''
-    text_low = text.lower()
-    beg = -1
-    for k in keywords:
-        i = text_low.find(k.lower())
-        if (i > -1 and i < beg) or beg == -1:
-            beg = i
-    excerpt_beg = 0
-    if beg > fuzz:
-        for sep in ('.', ':', ';', '='):
-            eb = text.find(sep, beg - fuzz, beg - 1)
-            if eb > -1:
-                eb += 1
-                break
-        else:
-            eb = beg - fuzz
-        excerpt_beg = eb
-    if excerpt_beg < 0:
-        excerpt_beg = 0
-    msg = text[excerpt_beg:beg+maxlen]
-    if beg > fuzz:
-        msg = '... ' + msg
-    if beg < len(text)-maxlen:
-        msg = msg + ' ...'
-    return msg
-
-
-class attrdict(dict):
-    def __getattr__(self, key):
-        return self[key]
-    def __setattr__(self, key, val):
-        self[key] = val
-    def __delattr__(self, key):
-        del self[key]
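The `relative_uri()` helper deleted above survives the move unchanged as part of `sphinx/util/__init__.py` (copied from r56508 per the header). As a sanity check, here is the same function reproduced outside the diff with a small usage example; the input paths are invented for illustration:

```python
def relative_uri(base, to):
    """Return a relative URL from ``base`` to ``to`` (as in the diff above)."""
    b2 = base.split('/')
    t2 = to.split('/')
    # Drop the leading path segments the two URIs have in common.
    for x, y in zip(b2, t2):
        if x != y:
            break
        b2.pop(0)
        t2.pop(0)
    # One '../' per remaining directory level in base, then the target path.
    return '../' * (len(b2) - 1) + '/'.join(t2)

print(relative_uri('lib/os.html', 'ref/index.html'))    # ../ref/index.html
print(relative_uri('lib/sub/os.html', 'lib/ref.html'))  # ../ref.html
```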

Copied: doctools/trunk/sphinx/util/console.py (from r56508, doctools/trunk/sphinx/console.py)
==============================================================================
--- doctools/trunk/sphinx/console.py	(original)
+++ doctools/trunk/sphinx/util/console.py	Tue Jul 24 12:25:53 2007
@@ -1,7 +1,7 @@
 # -*- coding: utf-8 -*-
 """
-    sphinx.console
-    ~~~~~~~~~~~~~~
+    sphinx.util.console
+    ~~~~~~~~~~~~~~~~~~~
 
     Format colored console output.
 

Copied: doctools/trunk/sphinx/util/json.py (from r56508, doctools/trunk/sphinx/json.py)
==============================================================================
--- doctools/trunk/sphinx/json.py	(original)
+++ doctools/trunk/sphinx/util/json.py	Tue Jul 24 12:25:53 2007
@@ -1,7 +1,7 @@
 # -*- coding: utf-8 -*-
 """
-    sphinx.json
-    ~~~~~~~~~~~
+    sphinx.util.json
+    ~~~~~~~~~~~~~~~~
 
     Minimal JSON module that generates small dumps.
 

Copied: doctools/trunk/sphinx/util/stemmer.py (from r56508, doctools/trunk/sphinx/stemmer.py)
==============================================================================
--- doctools/trunk/sphinx/stemmer.py	(original)
+++ doctools/trunk/sphinx/util/stemmer.py	Tue Jul 24 12:25:53 2007
@@ -1,8 +1,8 @@
 #!/usr/bin/env python
 # -*- coding: utf-8 -*-
 """
-    sphinx.stemmer
-    ~~~~~~~~~~~~~~
+    sphinx.util.stemmer
+    ~~~~~~~~~~~~~~~~~~~
 
     Porter Stemming Algorithm
 

Modified: doctools/trunk/sphinx/web/wsgiutil.py
==============================================================================
--- doctools/trunk/sphinx/web/wsgiutil.py	(original)
+++ doctools/trunk/sphinx/web/wsgiutil.py	Tue Jul 24 12:25:53 2007
@@ -24,7 +24,7 @@
 from cStringIO import StringIO
 
 from .util import lazy_property
-from .json import dump_json
+from ..util.json import dump_json
 
 
 HTTP_STATUS_CODES = {

Modified: doctools/trunk/sphinx/writer.py
==============================================================================
--- doctools/trunk/sphinx/writer.py	(original)
+++ doctools/trunk/sphinx/writer.py	Tue Jul 24 12:25:53 2007
@@ -12,7 +12,7 @@
 from docutils import nodes
 from docutils.writers.html4css1 import Writer, HTMLTranslator as BaseTranslator
 
-from .smartypants import sphinx_smarty_pants
+from .util.smartypants import sphinx_smarty_pants
 
 
 class HTMLWriter(Writer):

