[Python-checkins] r56518 - in doctools/trunk: HACKING TODO sphinx/__init__.py sphinx/builder.py sphinx/console.py sphinx/json.py sphinx/search.py sphinx/smartypants.py sphinx/stemmer.py sphinx/util sphinx/util.py sphinx/util/__init__.py sphinx/util/console.py sphinx/util/json.py sphinx/util/smartypants.py sphinx/util/stemmer.py sphinx/web/wsgiutil.py sphinx/writer.py
georg.brandl
python-checkins at python.org
Tue Jul 24 12:25:54 CEST 2007
Author: georg.brandl
Date: Tue Jul 24 12:25:53 2007
New Revision: 56518
Added:
doctools/trunk/HACKING
doctools/trunk/sphinx/util/
doctools/trunk/sphinx/util/__init__.py
- copied unchanged from r56508, doctools/trunk/sphinx/util.py
doctools/trunk/sphinx/util/console.py
- copied, changed from r56508, doctools/trunk/sphinx/console.py
doctools/trunk/sphinx/util/json.py
- copied, changed from r56508, doctools/trunk/sphinx/json.py
doctools/trunk/sphinx/util/smartypants.py
- copied unchanged from r56508, doctools/trunk/sphinx/smartypants.py
doctools/trunk/sphinx/util/stemmer.py
- copied, changed from r56508, doctools/trunk/sphinx/stemmer.py
Removed:
doctools/trunk/sphinx/console.py
doctools/trunk/sphinx/json.py
doctools/trunk/sphinx/smartypants.py
doctools/trunk/sphinx/stemmer.py
doctools/trunk/sphinx/util.py
Modified:
doctools/trunk/TODO
doctools/trunk/sphinx/__init__.py
doctools/trunk/sphinx/builder.py
doctools/trunk/sphinx/search.py
doctools/trunk/sphinx/web/wsgiutil.py
doctools/trunk/sphinx/writer.py
Log:
Move utils to separate package, add coding document.
Added: doctools/trunk/HACKING
==============================================================================
--- (empty file)
+++ doctools/trunk/HACKING Tue Jul 24 12:25:53 2007
@@ -0,0 +1,140 @@
+.. -*- mode: rst -*-
+
+===============
+Coding overview
+===============
+
+This document tries to give you a cursory overview of the doctools code.
+
+
+Dependencies
+------------
+
+The converter doesn't have any dependencies except Python 2.5.
+
+Sphinx needs Python 2.5, Docutils 0.4 (not SVN, because of API changes), Jinja
+>= 1.1 (which is at the moment included as an SVN external) and Pygments >= 0.8
+(which is optional and can be installed from the cheese shop).
+
+
+The converter
+-------------
+
+There's not too much to say about the converter. It is about as finished as it
+needs to be: since it only has to work with the body of documentation found in
+the Python core, it doesn't have to be fully general.
+
+(If other projects using the LaTeX documentation toolchain want to convert their
+docs to the new format, the converter will probably have to be amended.)
+
+In ``restwriter.py``, there's some commentary about the inner workings of the
+converter concerning a single file.
+
+The ``filenamemap.py`` file tells the converter how to rearrange the converted
+files in the reST source directories. There, for example, the tutorial is split
+up into several files, and old or unusable files are flagged as not convertible.
+Also, non-LaTeX files, such as code include files, are listed to be copied into
+corresponding directories.
+
+The directory ``newfiles`` contains a bunch of files that didn't exist in the
+old distribution, such as the documentation of Sphinx markup, that will be
+copied to the reST directory too.
+
+
+Sphinx
+------
+
+Sphinx consists of two parts:
+
+* The builder takes the reST sources and converts them into an output format.
+ (Presently, HTML, HTML Help or webapp-usable pickles.)
+
+* The web application takes the webapp-usable pickles, which mainly contain the
+ HTML bodies converted from reST and some additional information, and turns them
+ into a WSGI application, complete with commenting, navigation etc.
+ (The subpackage ``web`` is responsible for this.)
+
+An overview of the source files:
+
+addnodes.py
+ Contains docutils node classes that are not part of standard docutils. These
+ node classes must be handled by every docutils writer that gets one of our
+ nodetrees.
+
+ (Docutils parses a reST document into a tree of "nodes". This nodetree can
+ then be converted into an internal representation, XML or anything a Writer
+ exists for.)
+
+builder.py
+ Contains the Builder classes, which are responsible for the process of building
+ the output files from docutils node trees.
+
+ The builder is called by ``sphinx-build.py``.
+
+directives.py
+ Directive functions that transform our custom directives (like ``.. function::``)
+ into doctree nodes.
+
+environment.py
+ The "build environment", a class that holds metadata about all doctrees, and is
+ responsible for building them out of reST source files.
+
+ The environment is stored, in a pickled form, in the output directory, in
+ order to enable incremental builds if only a few source files change, which
+ usually is the case.
+
+highlighting.py
+ Glue to the Pygments highlighting library. Falls back to no highlighting at all
+ if Pygments is not installed. A stripped-down version of the Pygments Python
+ lexer and HTML formatter could probably be included instead.
+
+htmlhelp.py
+ HTML help builder helper methods.
+
+_jinja.py, jinja
+ The Jinja templating engine, used for all HTML-related builders.
+
+refcounting.py
+ Helper to keep track of reference count data for the C API reference,
+ which is maintained as a separate file.
+
+roles.py
+ Role functions that transform our custom roles (like ``:meth:``) into doctree
+ nodes.
+
+search.py
+ Helper to create a search index for the offline search.
+
+style
+ Directory for all static files for HTML-related builders.
+
+templates
+ Directory for Jinja templates, at the moment only for HTML.
+
+util
+ General utilities.
+
+writer.py
+ The docutils HTML writer subclass which understands our additional nodes.
+
+
+Code style
+----------
+
+PEP 8 (http://www.python.org/dev/peps/pep-0008) must be observed, with the
+following exceptions:
+
+* Line length is limited to 90 characters.
+* Relative imports are used, with the new-in-2.5 'leading dot' syntax.
+
+The file encoding is UTF-8; this should be indicated in the file's first line
+with ::
+
+ # -*- coding: utf-8 -*-
+
+
+Python 3.0 compatibility
+------------------------
+
+As it will be used for Python 3.0 too, the toolset should be kept in a state
+where it is fully usable Python 3 code after one run of the ``2to3`` utility.
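The ``environment.py`` entry in the HACKING overview says the build environment is pickled into the output directory so that only changed sources need rebuilding. The following is a minimal Python 3 sketch of that idea, not the actual Sphinx code: the function names, the ``{source name: mtime}`` layout and the use of modification times are all assumptions made for illustration.

```python
import pickle


def load_environment(pickle_path):
    """Return the saved {source name: mtime} mapping, or {} on a first build.

    Hypothetical helper: the real environment holds far more metadata.
    """
    try:
        with open(pickle_path, 'rb') as f:
            return pickle.load(f)
    except (OSError, EOFError):
        # No usable pickle yet: treat every source as new.
        return {}


def outdated_sources(saved, current):
    """Names whose recorded mtime is missing or differs from the current one."""
    return sorted(name for name, mtime in current.items()
                  if saved.get(name) != mtime)


def save_environment(pickle_path, current):
    """Store the mapping so that the next run can compare against it."""
    with open(pickle_path, 'wb') as f:
        pickle.dump(current, f, pickle.HIGHEST_PROTOCOL)
```

A builder following this scheme would re-read only the names returned by ``outdated_sources`` and then save the updated mapping.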
Modified: doctools/trunk/TODO
==============================================================================
--- doctools/trunk/TODO (original)
+++ doctools/trunk/TODO Tue Jul 24 12:25:53 2007
@@ -2,6 +2,7 @@
===========
- discuss and debug comments system
+- navigation links at the bottom too
- write new Makefile, handle automatic version info and checkout
- write a "printable" builder (export to latex, most probably)
- discuss the default role
Modified: doctools/trunk/sphinx/__init__.py
==============================================================================
--- doctools/trunk/sphinx/__init__.py (original)
+++ doctools/trunk/sphinx/__init__.py Tue Jul 24 12:25:53 2007
@@ -14,7 +14,7 @@
from os import path
from .builder import builders
-from .console import nocolor
+from .util.console import nocolor
__version__ = '$Revision: 5369 $'
@@ -99,6 +99,10 @@
elif opt == '-N':
nocolor()
+ if sys.platform == 'win32':
+ # Windows' cmd box doesn't understand ANSI sequences
+ nocolor()
+
if builder is None:
print 'No builder selected, using default: html'
builder = 'html'
Modified: doctools/trunk/sphinx/builder.py
==============================================================================
--- doctools/trunk/sphinx/builder.py (original)
+++ doctools/trunk/sphinx/builder.py Tue Jul 24 12:25:53 2007
@@ -29,7 +29,7 @@
from .util import (get_matching_files, attrdict, status_iterator,
ensuredir, get_category, relative_uri)
from .writer import HTMLWriter
-from .console import bold, purple, green
+from .util.console import bold, purple, green
from .htmlhelp import build_hhx
from .environment import BuildEnvironment
from .highlighting import pygments, get_stylesheet
Deleted: /doctools/trunk/sphinx/console.py
==============================================================================
--- /doctools/trunk/sphinx/console.py Tue Jul 24 12:25:53 2007
+++ (empty file)
@@ -1,53 +0,0 @@
-# -*- coding: utf-8 -*-
-"""
- sphinx.console
- ~~~~~~~~~~~~~~
-
- Format colored console output.
-
- :copyright: 2007 by Georg Brandl.
- :license: Python license.
-"""
-
-codes = {}
-
-def nocolor():
-    codes.clear()
-
-def colorize(name, text):
-    return codes.get(name, '') + text + codes.get('reset', '')
-
-def create_color_func(name):
-    def inner(text):
-        return colorize(name, text)
-    globals()[name] = inner
-
-_attrs = {
-    'reset': '39;49;00m',
-    'bold': '01m',
-    'faint': '02m',
-    'standout': '03m',
-    'underline': '04m',
-    'blink': '05m',
-}
-
-for name, value in _attrs.items():
-    codes[name] = '\x1b[' + value
-
-_colors = [
-    ('black', 'darkgray'),
-    ('darkred', 'red'),
-    ('darkgreen', 'green'),
-    ('brown', 'yellow'),
-    ('darkblue', 'blue'),
-    ('purple', 'fuchsia'),
-    ('turquoise', 'teal'),
-    ('lightgray', 'white'),
-]
-
-for i, (dark, light) in enumerate(_colors):
-    codes[dark] = '\x1b[%im' % (i+30)
-    codes[light] = '\x1b[%i;01m' % (i+30)
-
-for name in codes:
-    create_color_func(name)
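The console module works from a single table of ANSI escape codes, which is why the new ``win32`` branch in ``sphinx/__init__.py`` only has to call ``nocolor()``. The following Python 3 sketch reproduces that mechanism with a two-entry table; the reduced table is an assumption made to keep the example short.

```python
# Reduced copy of the escape-code table (only two of the original entries).
codes = {
    'reset': '\x1b[39;49;00m',
    'bold': '\x1b[01m',
}


def colorize(name, text):
    # Names missing from the table contribute an empty prefix and suffix.
    return codes.get(name, '') + text + codes.get('reset', '')


def nocolor():
    # Emptying the table makes every later colorize() call return plain text.
    codes.clear()
```

Because the generated helpers such as ``bold`` look the table up on every call, clearing it once at startup is enough to disable color everywhere.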
Deleted: /doctools/trunk/sphinx/json.py
==============================================================================
--- /doctools/trunk/sphinx/json.py Tue Jul 24 12:25:53 2007
+++ (empty file)
@@ -1,72 +0,0 @@
-# -*- coding: utf-8 -*-
-"""
- sphinx.json
- ~~~~~~~~~~~
-
- Minimal JSON module that generates small dumps.
-
- This is not fully JSON compliant but enough for the searchindex.
- And the generated files are smaller than the simplejson ones.
-
- Uses the basestring encode function from simplejson.
-
- :copyright: 2007 by Armin Ronacher, Bob Ippolito.
- :license: Python license.
-"""
-
-import re
-
-ESCAPE = re.compile(r'[\x00-\x19\\"\b\f\n\r\t]')
-ESCAPE_ASCII = re.compile(r'([\\"]|[^\ -~])')
-ESCAPE_DICT = {
-    '\\': '\\\\',
-    '"': '\\"',
-    '\b': '\\b',
-    '\f': '\\f',
-    '\n': '\\n',
-    '\r': '\\r',
-    '\t': '\\t',
-}
-for i in range(0x20):
-    ESCAPE_DICT.setdefault(chr(i), '\\u%04x' % (i,))
-
-
-def encode_basestring_ascii(s):
-    def replace(match):
-        s = match.group(0)
-        try:
-            return ESCAPE_DICT[s]
-        except KeyError:
-            n = ord(s)
-            if n < 0x10000:
-                return '\\u%04x' % (n,)
-            else:
-                # surrogate pair
-                n -= 0x10000
-                s1 = 0xd800 | ((n >> 10) & 0x3ff)
-                s2 = 0xdc00 | (n & 0x3ff)
-                return '\\u%04x\\u%04x' % (s1, s2)
-    return '"' + str(ESCAPE_ASCII.sub(replace, s)) + '"'
-
-
-def dump_json(obj, key=False):
-    if key:
-        if not isinstance(obj, basestring):
-            obj = str(obj)
-        return encode_basestring_ascii(obj)
-    if obj is None:
-        return 'null'
-    elif obj is True or obj is False:
-        return obj and 'true' or 'false'
-    elif isinstance(obj, (int, long, float)):
-        return str(obj)
-    elif isinstance(obj, dict):
-        return '{%s}' % ','.join('%s:%s' % (
-            dump_json(key, True),
-            dump_json(value)
-        ) for key, value in obj.iteritems())
-    elif isinstance(obj, (tuple, list, set)):
-        return '[%s]' % ','.join(dump_json(x) for x in obj)
-    elif isinstance(obj, basestring):
-        return encode_basestring_ascii(obj)
-    raise TypeError(type(obj))
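To see what "small dumps" means in practice, here is a Python 3 port sketch of the module above. It is an adaptation, not the committed code: ``basestring``, ``long`` and ``iteritems`` are replaced by their Python 3 counterparts, and the helper is renamed ``encode_string`` for brevity.

```python
import re

# Anything outside printable ASCII, plus backslash and double quote.
ESCAPE_ASCII = re.compile(r'([\\"]|[^ -~])')
ESCAPE_DICT = {
    '\\': '\\\\', '"': '\\"', '\b': '\\b', '\f': '\\f',
    '\n': '\\n', '\r': '\\r', '\t': '\\t',
}


def encode_string(s):
    def replace(match):
        ch = match.group(0)
        if ch in ESCAPE_DICT:
            return ESCAPE_DICT[ch]
        n = ord(ch)
        if n < 0x10000:
            return '\\u%04x' % n
        # Characters beyond the BMP are written as a surrogate pair.
        n -= 0x10000
        return '\\u%04x\\u%04x' % (0xd800 | ((n >> 10) & 0x3ff),
                                   0xdc00 | (n & 0x3ff))
    return '"' + ESCAPE_ASCII.sub(replace, s) + '"'


def dump_json(obj, key=False):
    if key:
        # Object keys are always emitted as strings.
        return encode_string(obj if isinstance(obj, str) else str(obj))
    if obj is None:
        return 'null'
    if obj is True or obj is False:
        return 'true' if obj else 'false'
    if isinstance(obj, (int, float)):
        return str(obj)
    if isinstance(obj, dict):
        return '{%s}' % ','.join('%s:%s' % (dump_json(k, True), dump_json(v))
                                 for k, v in obj.items())
    if isinstance(obj, (tuple, list, set)):
        return '[%s]' % ','.join(dump_json(x) for x in obj)
    if isinstance(obj, str):
        return encode_string(obj)
    raise TypeError(type(obj))
```

The output contains no whitespace between tokens, which is where the size saving for the search index comes from.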
Modified: doctools/trunk/sphinx/search.py
==============================================================================
--- doctools/trunk/sphinx/search.py (original)
+++ doctools/trunk/sphinx/search.py Tue Jul 24 12:25:53 2007
@@ -13,8 +13,8 @@
from collections import defaultdict
from docutils.nodes import Text, NodeVisitor
-from .stemmer import PorterStemmer
-from .json import dump_json
+from .util.stemmer import PorterStemmer
+from .util.json import dump_json
word_re = re.compile(r'\w+(?u)')
Deleted: /doctools/trunk/sphinx/smartypants.py
==============================================================================
--- /doctools/trunk/sphinx/smartypants.py Tue Jul 24 12:25:53 2007
+++ (empty file)
@@ -1,263 +0,0 @@
-r"""
-This is based on SmartyPants.py by `Chad Miller`_.
-
-Copyright and License
-=====================
-
-SmartyPants_ license::
-
- Copyright (c) 2003 John Gruber
- (http://daringfireball.net/)
- All rights reserved.
-
- Redistribution and use in source and binary forms, with or without
- modification, are permitted provided that the following conditions are
- met:
-
- * Redistributions of source code must retain the above copyright
- notice, this list of conditions and the following disclaimer.
-
- * Redistributions in binary form must reproduce the above copyright
- notice, this list of conditions and the following disclaimer in
- the documentation and/or other materials provided with the
- distribution.
-
- * Neither the name "SmartyPants" nor the names of its contributors
- may be used to endorse or promote products derived from this
- software without specific prior written permission.
-
- This software is provided by the copyright holders and contributors "as
- is" and any express or implied warranties, including, but not limited
- to, the implied warranties of merchantability and fitness for a
- particular purpose are disclaimed. In no event shall the copyright
- owner or contributors be liable for any direct, indirect, incidental,
- special, exemplary, or consequential damages (including, but not
- limited to, procurement of substitute goods or services; loss of use,
- data, or profits; or business interruption) however caused and on any
- theory of liability, whether in contract, strict liability, or tort
- (including negligence or otherwise) arising in any way out of the use
- of this software, even if advised of the possibility of such damage.
-
-
-smartypants.py license::
-
- smartypants.py is a derivative work of SmartyPants.
-
- Redistribution and use in source and binary forms, with or without
- modification, are permitted provided that the following conditions are
- met:
-
- * Redistributions of source code must retain the above copyright
- notice, this list of conditions and the following disclaimer.
-
- * Redistributions in binary form must reproduce the above copyright
- notice, this list of conditions and the following disclaimer in
- the documentation and/or other materials provided with the
- distribution.
-
- This software is provided by the copyright holders and contributors "as
- is" and any express or implied warranties, including, but not limited
- to, the implied warranties of merchantability and fitness for a
- particular purpose are disclaimed. In no event shall the copyright
- owner or contributors be liable for any direct, indirect, incidental,
- special, exemplary, or consequential damages (including, but not
- limited to, procurement of substitute goods or services; loss of use,
- data, or profits; or business interruption) however caused and on any
- theory of liability, whether in contract, strict liability, or tort
- (including negligence or otherwise) arising in any way out of the use
- of this software, even if advised of the possibility of such damage.
-
-.. _Chad Miller: http://web.chad.org/
-"""
-
-import re
-
-
-def sphinx_smarty_pants(t):
-    t = t.replace('&quot;', '"')
-    t = educateDashesOldSchool(t)
-    t = educateQuotes(t)
-    t = t.replace('"', '&quot;')
-    return t
-
-# Constants for quote education.
-
-punct_class = r"""[!"#\$\%'()*+,-.\/:;<=>?\@\[\\\]\^_`{|}~]"""
-close_class = r"""[^\ \t\r\n\[\{\(\-]"""
-dec_dashes = r"""&#8211;|&#8212;"""
-
-# Special case if the very first character is a quote
-# followed by punctuation at a non-word-break. Close the quotes by brute force:
-single_quote_start_re = re.compile(r"""^'(?=%s\\B)""" % (punct_class,))
-double_quote_start_re = re.compile(r"""^"(?=%s\\B)""" % (punct_class,))
-
-# Special case for double sets of quotes, e.g.:
-# <p>He said, "'Quoted' words in a larger quote."</p>
-double_quote_sets_re = re.compile(r""""'(?=\w)""")
-single_quote_sets_re = re.compile(r"""'"(?=\w)""")
-
-# Special case for decade abbreviations (the '80s):
-decade_abbr_re = re.compile(r"""\b'(?=\d{2}s)""")
-
-# Get most opening double quotes:
-opening_double_quotes_regex = re.compile(r"""
-    (
-        \s          |   # a whitespace char, or
-        &nbsp;      |   # a non-breaking space entity, or
-        --          |   # dashes, or
-        &[mn]dash;  |   # named dash entities
-        %s          |   # or decimal entities
-        &\#x201[34];    # or hex
-    )
-    "                   # the quote
-    (?=\w)              # followed by a word character
-    """ % (dec_dashes,), re.VERBOSE)
-
-# Double closing quotes:
-closing_double_quotes_regex = re.compile(r"""
- #(%s)? # character that indicates the quote should be closing
- "
- (?=\s)
- """ % (close_class,), re.VERBOSE)
-
-closing_double_quotes_regex_2 = re.compile(r"""
- (%s) # character that indicates the quote should be closing
- "
- """ % (close_class,), re.VERBOSE)
-
-# Get most opening single quotes:
-opening_single_quotes_regex = re.compile(r"""
-    (
-        \s          |   # a whitespace char, or
-        &nbsp;      |   # a non-breaking space entity, or
-        --          |   # dashes, or
-        &[mn]dash;  |   # named dash entities
-        %s          |   # or decimal entities
-        &\#x201[34];    # or hex
-    )
-    '                   # the quote
-    (?=\w)              # followed by a word character
-    """ % (dec_dashes,), re.VERBOSE)
-
-closing_single_quotes_regex = re.compile(r"""
- (%s)
- '
- (?!\s | s\b | \d)
- """ % (close_class,), re.VERBOSE)
-
-closing_single_quotes_regex_2 = re.compile(r"""
- (%s)
- '
- (\s | s\b)
- """ % (close_class,), re.VERBOSE)
-
-def educateQuotes(str):
-    """
-    Parameter: String.
-
-    Returns: The string, with "educated" curly quote HTML entities.
-
-    Example input: "Isn't this fun?"
-    Example output: &#8220;Isn&#8217;t this fun?&#8221;
-    """
-
-    # Special case if the very first character is a quote
-    # followed by punctuation at a non-word-break. Close the quotes by brute force:
-    str = single_quote_start_re.sub("&#8217;", str)
-    str = double_quote_start_re.sub("&#8221;", str)
-
-    # Special case for double sets of quotes, e.g.:
-    # <p>He said, "'Quoted' words in a larger quote."</p>
-    str = double_quote_sets_re.sub("&#8220;&#8216;", str)
-    str = single_quote_sets_re.sub("&#8216;&#8220;", str)
-
-    # Special case for decade abbreviations (the '80s):
-    str = decade_abbr_re.sub("&#8217;", str)
-
-    str = opening_single_quotes_regex.sub(r"\1&#8216;", str)
-    str = closing_single_quotes_regex.sub(r"\1&#8217;", str)
-    str = closing_single_quotes_regex_2.sub(r"\1&#8217;\2", str)
-
-    # Any remaining single quotes should be opening ones:
-    str = str.replace("'", "&#8216;")
-
-    str = opening_double_quotes_regex.sub(r"\1&#8220;", str)
-    str = closing_double_quotes_regex.sub(r"&#8221;", str)
-    str = closing_double_quotes_regex_2.sub(r"\1&#8221;", str)
-
-    # Any remaining quotes should be opening ones.
-    str = str.replace('"', "&#8220;")
-
-    return str
-
-
-def educateBackticks(str):
-    """
-    Parameter: String.
-    Returns: The string, with ``backticks'' -style double quotes
-    translated into HTML curly quote entities.
-    Example input: ``Isn't this fun?''
-    Example output: &#8220;Isn't this fun?&#8221;
-    """
-    return str.replace("``", "&#8220;").replace("''", "&#8221;")
-
-
-def educateSingleBackticks(str):
-    """
-    Parameter: String.
-    Returns: The string, with `backticks' -style single quotes
-    translated into HTML curly quote entities.
-
-    Example input: `Isn't this fun?'
-    Example output: &#8216;Isn&#8217;t this fun?&#8217;
-    """
-    return str.replace('`', "&#8216;").replace("'", "&#8217;")
-
-
-def educateDashesOldSchool(str):
-    """
-    Parameter: String.
-
-    Returns: The string, with each instance of "--" translated to
-    an en-dash HTML entity, and each "---" translated to
-    an em-dash HTML entity.
-    """
-    return str.replace('---', "&#8212;").replace('--', "&#8211;")
-
-
-def educateDashesOldSchoolInverted(str):
-    """
-    Parameter: String.
-
-    Returns: The string, with each instance of "--" translated to
-    an em-dash HTML entity, and each "---" translated to
-    an en-dash HTML entity. Two reasons why: First, unlike the
-    en- and em-dash syntax supported by
-    EducateDashesOldSchool(), it's compatible with existing
-    entries written before SmartyPants 1.1, back when "--" was
-    only used for em-dashes. Second, em-dashes are more
-    common than en-dashes, and so it sort of makes sense that
-    the shortcut should be shorter to type. (Thanks to Aaron
-    Swartz for the idea.)
-    """
-    return str.replace('---', "&#8211;").replace('--', "&#8212;")
-
-
-
-def educateEllipses(str):
-    """
-    Parameter: String.
-    Returns: The string, with each instance of "..." translated to
-    an ellipsis HTML entity.
-
-    Example input: Huh...?
-    Example output: Huh&#8230;?
-    """
-    return str.replace('...', "&#8230;").replace('. . .', "&#8230;")
-
-
-__author__ = "Chad Miller <smartypantspy at chad.org>"
-__version__ = "1.5_1.5: Sat, 13 Aug 2005 15:50:24 -0400"
-__url__ = "http://wiki.chad.org/SmartyPantsPy"
-__description__ = \
- "Smart-quotes, smart-ellipses, and smart-dashes for weblog entries in pyblosxom"
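The dash helpers in this module depend on replacing the longer marker first. The sketch below shows that ordering in isolation. The function name is hypothetical, and the numeric entities ``&#8211;`` (en dash) and ``&#8212;`` (em dash) are assumed from the docstring's description of "HTML entities".

```python
def educate_dashes(text):
    """Turn '---' into an em-dash entity and '--' into an en-dash entity."""
    # '---' must be handled first: replacing '--' first would consume two of
    # the three hyphens and leave a stray '-' behind.
    return text.replace('---', '&#8212;').replace('--', '&#8211;')


def educate_dashes_wrong_order(text):
    """Same replacements in the opposite order, kept only for comparison."""
    return text.replace('--', '&#8211;').replace('---', '&#8212;')
```

With the input ``a---b--c`` the first function yields one em-dash entity and one en-dash entity, while the second yields an en-dash entity followed by a leftover hyphen.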
Deleted: /doctools/trunk/sphinx/stemmer.py
==============================================================================
--- /doctools/trunk/sphinx/stemmer.py Tue Jul 24 12:25:53 2007
+++ (empty file)
@@ -1,344 +0,0 @@
-#!/usr/bin/env python
-# -*- coding: utf-8 -*-
-"""
- sphinx.stemmer
- ~~~~~~~~~~~~~~
-
- Porter Stemming Algorithm
-
- This is the Porter stemming algorithm, ported to Python from the
- version coded up in ANSI C by the author. It may be regarded
- as canonical, in that it follows the algorithm presented in
-
- Porter, 1980, An algorithm for suffix stripping, Program, Vol. 14,
- no. 3, pp 130-137,
-
- only differing from it at the points marked --DEPARTURE-- below.
-
- See also http://www.tartarus.org/~martin/PorterStemmer
-
- The algorithm as described in the paper could be exactly replicated
- by adjusting the points of DEPARTURE, but this is barely necessary,
- because (a) the points of DEPARTURE are definitely improvements, and
- (b) no encoding of the Porter stemmer I have seen is anything like
- as exact as this version, even with the points of DEPARTURE!
-
- Release 1: January 2001
-
- :copyright: 2001 by Vivake Gupta <v at nano.com>.
- :license: Public Domain (?).
-"""
-
-class PorterStemmer(object):
-
- def __init__(self):
- """The main part of the stemming algorithm starts here.
- b is a buffer holding a word to be stemmed. The letters are in b[k0],
- b[k0+1] ... ending at b[k]. In fact k0 = 0 in this demo program. k is
- readjusted downwards as the stemming progresses. Zero termination is
- not in fact used in the algorithm.
-
- Note that only lower case sequences are stemmed. Forcing to lower case
- should be done before stem(...) is called.
- """
-
- self.b = "" # buffer for word to be stemmed
- self.k = 0
- self.k0 = 0
- self.j = 0 # j is a general offset into the string
-
- def cons(self, i):
- """cons(i) is TRUE <=> b[i] is a consonant."""
- if self.b[i] == 'a' or self.b[i] == 'e' or self.b[i] == 'i' \
- or self.b[i] == 'o' or self.b[i] == 'u':
- return 0
- if self.b[i] == 'y':
- if i == self.k0:
- return 1
- else:
- return (not self.cons(i - 1))
- return 1
-
- def m(self):
- """m() measures the number of consonant sequences between k0 and j.
- if c is a consonant sequence and v a vowel sequence, and <..>
- indicates arbitrary presence,
-
- <c><v> gives 0
- <c>vc<v> gives 1
- <c>vcvc<v> gives 2
- <c>vcvcvc<v> gives 3
- ....
- """
- n = 0
- i = self.k0
- while 1:
- if i > self.j:
- return n
- if not self.cons(i):
- break
- i = i + 1
- i = i + 1
- while 1:
- while 1:
- if i > self.j:
- return n
- if self.cons(i):
- break
- i = i + 1
- i = i + 1
- n = n + 1
- while 1:
- if i > self.j:
- return n
- if not self.cons(i):
- break
- i = i + 1
- i = i + 1
-
- def vowelinstem(self):
- """vowelinstem() is TRUE <=> k0,...j contains a vowel"""
- for i in range(self.k0, self.j + 1):
- if not self.cons(i):
- return 1
- return 0
-
- def doublec(self, j):
- """doublec(j) is TRUE <=> j,(j-1) contain a double consonant."""
- if j < (self.k0 + 1):
- return 0
- if (self.b[j] != self.b[j-1]):
- return 0
- return self.cons(j)
-
- def cvc(self, i):
- """cvc(i) is TRUE <=> i-2,i-1,i has the form consonant - vowel - consonant
- and also if the second c is not w,x or y. this is used when trying to
- restore an e at the end of a short word, e.g.
-
- cav(e), lov(e), hop(e), crim(e), but
- snow, box, tray.
- """
- if i < (self.k0 + 2) or not self.cons(i) or self.cons(i-1) or not self.cons(i-2):
- return 0
- ch = self.b[i]
- if ch == 'w' or ch == 'x' or ch == 'y':
- return 0
- return 1
-
- def ends(self, s):
- """ends(s) is TRUE <=> k0,...k ends with the string s."""
- length = len(s)
- if s[length - 1] != self.b[self.k]: # tiny speed-up
- return 0
- if length > (self.k - self.k0 + 1):
- return 0
- if self.b[self.k-length+1:self.k+1] != s:
- return 0
- self.j = self.k - length
- return 1
-
- def setto(self, s):
- """setto(s) sets (j+1),...k to the characters in the string s, readjusting k."""
- length = len(s)
- self.b = self.b[:self.j+1] + s + self.b[self.j+length+1:]
- self.k = self.j + length
-
- def r(self, s):
- """r(s) is used further down."""
- if self.m() > 0:
- self.setto(s)
-
- def step1ab(self):
- """step1ab() gets rid of plurals and -ed or -ing. e.g.
-
- caresses -> caress
- ponies -> poni
- ties -> ti
- caress -> caress
- cats -> cat
-
- feed -> feed
- agreed -> agree
- disabled -> disable
-
- matting -> mat
- mating -> mate
- meeting -> meet
- milling -> mill
- messing -> mess
-
- meetings -> meet
- """
- if self.b[self.k] == 's':
- if self.ends("sses"):
- self.k = self.k - 2
- elif self.ends("ies"):
- self.setto("i")
- elif self.b[self.k - 1] != 's':
- self.k = self.k - 1
- if self.ends("eed"):
- if self.m() > 0:
- self.k = self.k - 1
- elif (self.ends("ed") or self.ends("ing")) and self.vowelinstem():
- self.k = self.j
- if self.ends("at"): self.setto("ate")
- elif self.ends("bl"): self.setto("ble")
- elif self.ends("iz"): self.setto("ize")
- elif self.doublec(self.k):
- self.k = self.k - 1
- ch = self.b[self.k]
- if ch == 'l' or ch == 's' or ch == 'z':
- self.k = self.k + 1
- elif (self.m() == 1 and self.cvc(self.k)):
- self.setto("e")
-
- def step1c(self):
- """step1c() turns terminal y to i when there is another vowel in the stem."""
- if (self.ends("y") and self.vowelinstem()):
- self.b = self.b[:self.k] + 'i' + self.b[self.k+1:]
-
- def step2(self):
- """step2() maps double suffices to single ones.
- so -ization ( = -ize plus -ation) maps to -ize etc. note that the
- string before the suffix must give m() > 0.
- """
- if self.b[self.k - 1] == 'a':
- if self.ends("ational"): self.r("ate")
- elif self.ends("tional"): self.r("tion")
- elif self.b[self.k - 1] == 'c':
- if self.ends("enci"): self.r("ence")
- elif self.ends("anci"): self.r("ance")
- elif self.b[self.k - 1] == 'e':
- if self.ends("izer"): self.r("ize")
- elif self.b[self.k - 1] == 'l':
- if self.ends("bli"): self.r("ble") # --DEPARTURE--
- # To match the published algorithm, replace this phrase with
- # if self.ends("abli"): self.r("able")
- elif self.ends("alli"): self.r("al")
- elif self.ends("entli"): self.r("ent")
- elif self.ends("eli"): self.r("e")
- elif self.ends("ousli"): self.r("ous")
- elif self.b[self.k - 1] == 'o':
- if self.ends("ization"): self.r("ize")
- elif self.ends("ation"): self.r("ate")
- elif self.ends("ator"): self.r("ate")
- elif self.b[self.k - 1] == 's':
- if self.ends("alism"): self.r("al")
- elif self.ends("iveness"): self.r("ive")
- elif self.ends("fulness"): self.r("ful")
- elif self.ends("ousness"): self.r("ous")
- elif self.b[self.k - 1] == 't':
- if self.ends("aliti"): self.r("al")
- elif self.ends("iviti"): self.r("ive")
- elif self.ends("biliti"): self.r("ble")
- elif self.b[self.k - 1] == 'g': # --DEPARTURE--
- if self.ends("logi"): self.r("log")
- # To match the published algorithm, delete this phrase
-
- def step3(self):
- """step3() deals with -ic-, -full, -ness etc. similar strategy to step2."""
- if self.b[self.k] == 'e':
- if self.ends("icate"): self.r("ic")
- elif self.ends("ative"): self.r("")
- elif self.ends("alize"): self.r("al")
- elif self.b[self.k] == 'i':
- if self.ends("iciti"): self.r("ic")
- elif self.b[self.k] == 'l':
- if self.ends("ical"): self.r("ic")
- elif self.ends("ful"): self.r("")
- elif self.b[self.k] == 's':
- if self.ends("ness"): self.r("")
-
- def step4(self):
- """step4() takes off -ant, -ence etc., in context <c>vcvc<v>."""
- if self.b[self.k - 1] == 'a':
- if self.ends("al"): pass
- else: return
- elif self.b[self.k - 1] == 'c':
- if self.ends("ance"): pass
- elif self.ends("ence"): pass
- else: return
- elif self.b[self.k - 1] == 'e':
- if self.ends("er"): pass
- else: return
- elif self.b[self.k - 1] == 'i':
- if self.ends("ic"): pass
- else: return
- elif self.b[self.k - 1] == 'l':
- if self.ends("able"): pass
- elif self.ends("ible"): pass
- else: return
- elif self.b[self.k - 1] == 'n':
- if self.ends("ant"): pass
- elif self.ends("ement"): pass
- elif self.ends("ment"): pass
- elif self.ends("ent"): pass
- else: return
- elif self.b[self.k - 1] == 'o':
- if self.ends("ion") and (self.b[self.j] == 's' \
- or self.b[self.j] == 't'): pass
- elif self.ends("ou"): pass
- # takes care of -ous
- else: return
- elif self.b[self.k - 1] == 's':
- if self.ends("ism"): pass
- else: return
- elif self.b[self.k - 1] == 't':
- if self.ends("ate"): pass
- elif self.ends("iti"): pass
- else: return
- elif self.b[self.k - 1] == 'u':
- if self.ends("ous"): pass
- else: return
- elif self.b[self.k - 1] == 'v':
- if self.ends("ive"): pass
- else: return
- elif self.b[self.k - 1] == 'z':
- if self.ends("ize"): pass
- else: return
- else:
- return
- if self.m() > 1:
- self.k = self.j
-
- def step5(self):
- """step5() removes a final -e if m() > 1, and changes -ll to -l if
- m() > 1.
- """
- self.j = self.k
- if self.b[self.k] == 'e':
- a = self.m()
- if a > 1 or (a == 1 and not self.cvc(self.k-1)):
- self.k = self.k - 1
- if self.b[self.k] == 'l' and self.doublec(self.k) and self.m() > 1:
- self.k = self.k -1
-
- def stem(self, p, i, j):
- """In stem(p,i,j), p is a char pointer, and the string to be stemmed
- is from p[i] to p[j] inclusive. Typically i is zero and j is the
- offset to the last character of a string, (p[j+1] == '\0'). The
- stemmer adjusts the characters p[i] ... p[j] and returns the new
- end-point of the string, k. Stemming never increases word length, so
- i <= k <= j. To turn the stemmer into a module, declare 'stem' as
- extern, and delete the remainder of this file.
- """
- # copy the parameters into statics
- self.b = p
- self.k = j
- self.k0 = i
- if self.k <= self.k0 + 1:
- return self.b # --DEPARTURE--
-
- # With this line, strings of length 1 or 2 don't go through the
- # stemming process, although no mention is made of this in the
- # published algorithm. Remove the line to match the published
- # algorithm.
-
- self.step1ab()
- self.step1c()
- self.step2()
- self.step3()
- self.step4()
- self.step5()
- return self.b[self.k0:self.k+1]
Deleted: /doctools/trunk/sphinx/util.py
==============================================================================
--- /doctools/trunk/sphinx/util.py Tue Jul 24 12:25:53 2007
+++ (empty file)
@@ -1,109 +0,0 @@
-# -*- coding: utf-8 -*-
-"""
- sphinx.util
- ~~~~~~~~~~~
-
- Utility functions for Sphinx.
-
- :copyright: 2007 by Georg Brandl.
- :license: Python license.
-"""
-
-import os
-import sys
-import fnmatch
-from os import path
-
-
-def relative_uri(base, to):
-    """Return a relative URL from ``base`` to ``to``."""
-    b2 = base.split('/')
-    t2 = to.split('/')
-    # remove common segments
-    for x, y in zip(b2, t2):
-        if x != y:
-            break
-        b2.pop(0)
-        t2.pop(0)
-    return '../' * (len(b2)-1) + '/'.join(t2)
-
-
-def ensuredir(path):
-    """Ensure that a path exists."""
-    try:
-        os.makedirs(path)
-    except OSError, err:
-        if not err.errno == 17:
-            raise
-
-
-def status_iterator(iterable, colorfunc=lambda x: x, stream=sys.stdout):
-    """Print out each item before yielding it."""
-    for item in iterable:
-        print >>stream, colorfunc(item),
-        stream.flush()
-        yield item
-    print >>stream
-
-
-def get_matching_files(dirname, pattern, exclude=()):
-    """Get all files matching a pattern in a directory, recursively."""
-    # dirname is a normalized absolute path.
-    dirname = path.normpath(path.abspath(dirname))
-    dirlen = len(dirname) + 1 # exclude slash
-    for root, dirs, files in os.walk(dirname):
-        dirs.sort()
-        files.sort()
-        for sfile in files:
-            if not fnmatch.fnmatch(sfile, pattern):
-                continue
-            qualified_name = path.join(root[dirlen:], sfile)
-            if qualified_name in exclude:
-                continue
-            yield qualified_name
-
-
-def get_category(filename):
-    """Get the "category" part of a RST filename."""
-    parts = filename.split('/', 1)
-    if len(parts) < 2:
-        return
-    return parts[0]
-
-
-def shorten_result(text='', keywords=[], maxlen=240, fuzz=60):
-    if not text:
-        text = ''
-    text_low = text.lower()
-    beg = -1
-    for k in keywords:
-        i = text_low.find(k.lower())
-        if (i > -1 and i < beg) or beg == -1:
-            beg = i
-    excerpt_beg = 0
-    if beg > fuzz:
-        for sep in ('.', ':', ';', '='):
-            eb = text.find(sep, beg - fuzz, beg - 1)
-            if eb > -1:
-                eb += 1
-                break
-        else:
-            eb = beg - fuzz
-        excerpt_beg = eb
-    if excerpt_beg < 0:
-        excerpt_beg = 0
-    msg = text[excerpt_beg:beg+maxlen]
-    if beg > fuzz:
-        msg = '... ' + msg
-    if beg < len(text)-maxlen:
-        msg = msg + ' ...'
-    return msg
-
-
-class attrdict(dict):
-    def __getattr__(self, key):
-        return self[key]
-    def __setattr__(self, key, val):
-        self[key] = val
-    def __delattr__(self, key):
-        del self[key]
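``relative_uri`` is the helper the builder imports to link one output page to another. The following Python 3 sketch shows its behavior on a few paths (the example file names are made up). One adaptation is needed: in Python 3 ``zip`` is lazy, so popping from the lists while iterating would skip pairs, and the sketch therefore materializes the pairs with ``list()`` first.

```python
def relative_uri(base, to):
    """Return a relative URL from ``base`` to ``to``."""
    b2 = base.split('/')
    t2 = to.split('/')
    # Snapshot the pairs before mutating the lists (Python 3 adaptation).
    for x, y in list(zip(b2, t2)):
        if x != y:
            break
        # Drop each leading path segment that both URLs share.
        b2.pop(0)
        t2.pop(0)
    # One '../' per directory left in base; its last part is the file name.
    return '../' * (len(b2) - 1) + '/'.join(t2)


if __name__ == '__main__':
    print(relative_uri('library/os.html', 'library/sys.html'))
    print(relative_uri('library/os.html', 'tutorial/intro.html'))
```

Pages in the same directory link to each other by bare file name, while pages in sibling directories get a single ``../`` prefix.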
Copied: doctools/trunk/sphinx/util/console.py (from r56508, doctools/trunk/sphinx/console.py)
==============================================================================
--- doctools/trunk/sphinx/console.py (original)
+++ doctools/trunk/sphinx/util/console.py Tue Jul 24 12:25:53 2007
@@ -1,7 +1,7 @@
# -*- coding: utf-8 -*-
"""
- sphinx.console
- ~~~~~~~~~~~~~~
+ sphinx.util.console
+ ~~~~~~~~~~~~~~~~~~~
Format colored console output.
Copied: doctools/trunk/sphinx/util/json.py (from r56508, doctools/trunk/sphinx/json.py)
==============================================================================
--- doctools/trunk/sphinx/json.py (original)
+++ doctools/trunk/sphinx/util/json.py Tue Jul 24 12:25:53 2007
@@ -1,7 +1,7 @@
# -*- coding: utf-8 -*-
"""
- sphinx.json
- ~~~~~~~~~~~
+ sphinx.util.json
+ ~~~~~~~~~~~~~~~~
Minimal JSON module that generates small dumps.
Copied: doctools/trunk/sphinx/util/stemmer.py (from r56508, doctools/trunk/sphinx/stemmer.py)
==============================================================================
--- doctools/trunk/sphinx/stemmer.py (original)
+++ doctools/trunk/sphinx/util/stemmer.py Tue Jul 24 12:25:53 2007
@@ -1,8 +1,8 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
- sphinx.stemmer
- ~~~~~~~~~~~~~~
+ sphinx.util.stemmer
+ ~~~~~~~~~~~~~~~~~~~
Porter Stemming Algorithm
Modified: doctools/trunk/sphinx/web/wsgiutil.py
==============================================================================
--- doctools/trunk/sphinx/web/wsgiutil.py (original)
+++ doctools/trunk/sphinx/web/wsgiutil.py Tue Jul 24 12:25:53 2007
@@ -24,7 +24,7 @@
from cStringIO import StringIO
from .util import lazy_property
-from .json import dump_json
+from ..util.json import dump_json
HTTP_STATUS_CODES = {
Modified: doctools/trunk/sphinx/writer.py
==============================================================================
--- doctools/trunk/sphinx/writer.py (original)
+++ doctools/trunk/sphinx/writer.py Tue Jul 24 12:25:53 2007
@@ -12,7 +12,7 @@
from docutils import nodes
from docutils.writers.html4css1 import Writer, HTMLTranslator as BaseTranslator
-from .smartypants import sphinx_smarty_pants
+from .util.smartypants import sphinx_smarty_pants
class HTMLWriter(Writer):