[Doc-SIG] Ambiguity in default output for publish_string

Jeffrey C. Jacobs docutils.org.timehorse at neverbox.com
Wed Mar 4 16:22:20 CET 2009


The two reStructuredText files:

--------
This paragraph has a very funny **indent**    after that word, right?
--------

and:

--------
This paragraph has a very funny **indent
after that word, right?**
--------

are theoretically different.  The first puts strong emphasis only on the
word **indent**, which is followed by exactly 4 spaces, whereas the
other puts strong emphasis on the entire expression "indent after that
word, right?", where there is a line feed between "indent" and "after".

However, when publish_string is called to output the tree for both of
these expressions, they both return:

<document source="<string>">
    <paragraph>
        This paragraph has a very funny 
        <strong>
            indent
            after that word, right?

which is identical for both inputs.  As far as I can tell, the internal
node structure is correct; the ambiguity arises only when that structure
is displayed in string form, which is the default behavior of
publish_string.  Since this output is a serialization of the node
structure, it seems that the output of publish_string should not be
ambiguous about what it represents.  Or is there a better way to
represent the internal doc tree unambiguously as a string?
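
One possible way to compare the two trees without the indentation
ambiguity, offered only as a sketch: parse each source with
docutils.core.publish_doctree and serialize the result through the DOM
(asdom().toxml()) rather than through the indented pseudo-XML form
(pformat()).  The names FIRST, SECOND and describe below are made up for
this illustration, and it assumes the paragraph is the first child of the
document, which should hold for a one-paragraph source.

```python
from docutils.core import publish_doctree

# The two sources from this message (with the second one's first word
# spelled "This", as in the output shown above).
FIRST = "This paragraph has a very funny **indent**    after that word, right?\n"
SECOND = "This paragraph has a very funny **indent\nafter that word, right?**\n"


def describe(source):
    """Return three views of the parsed tree (hypothetical helper)."""
    doctree = publish_doctree(source)
    paragraph = doctree[0]  # assumption: the only block is one paragraph
    # Indented pseudo-XML: every text line is prefixed with indentation.
    pseudo = doctree.pformat()
    # Real XML via xml.dom.minidom: text is emitted inline, with tags closed.
    xml = doctree.asdom().toxml()
    # Direct look at the node structure: (node class name, text content).
    children = [(type(child).__name__, child.astext())
                for child in paragraph.children]
    return pseudo, xml, children


first = describe(FIRST)
second = describe(SECOND)

print("pseudo-XML identical:", first[0] == second[0])
print("XML identical:       ", first[1] == second[1])
print(first[2])
print(second[2])
```

In the XML form the closing </strong> tag marks where the emphasis ends,
so the four spaces in the first source and the line feed in the second
stay distinguishable.  The list of (class name, text) pairs shows the
same difference directly on the node structure.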


