rst and pypandoc

Chris Angelico rosuav at gmail.com
Mon Mar 2 17:51:59 EST 2015


On Tue, Mar 3, 2015 at 9:30 AM, alb <al.basili at gmail.com> wrote:
> Hi Dave,
>
> Dave Angel <davea at davea.name> wrote:
> []
>>> Rst escapes with "\", but unfortunately python also uses "\" for escaping!
>>
>> Only when the string is in a literal.  If you've read it from a file, or
>> built it by combining other strings, or...  then the backslash is just
>> another character to Python.
>
> Holy s***t! That is enlightening. I'm not going to ask why that is so,
> but essentially this changes everything. Indeed I'm passing some strings
> as literals (as in my example), while others are simply read from a file
> (well, the file is read into a list of dictionaries and then I convert
> one of those keys into latex).

You have two different things happening here. The first is the concept
of a "string literal", and the second is how pandoc handles things.

Python's string literals come in a few different forms, but the most
common is the one that looks the same as in several other languages.
You start with a quote character, you put all your stuff in the
middle, and you finish with another quote:

"Hello, world!"

Trouble is, this makes it really hard to put quotes into your string:

"I said, "Hello, world!""

That's not going to work: Python takes the second quote as the end of
the string, and whatever follows is a syntax error. So we need to tell
Python that those interior quotes aren't the end of the string. That's
done with a backslash:

"I said, \"Hello, world!\""

And of course, that means you have to escape the backslash if you want
to have one in the text. But all of this is just for putting *string
literals* into your source code. If it's not Python source code, these
rules don't apply. You can read a line of text from the user and it'll
be unchanged:

>>> msg = input("Enter a string: ")
Enter a string: This is a string, but not a "string literal".
>>> print(msg)
This is a string, but not a "string literal".

(In Python 2, use raw_input instead of input.)

Same applies to reading from a file, or anywhere else. If it's not
Python source code, it doesn't matter what characters are in the
string, they're all just characters.
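
Here's a small sketch of the same idea with a file. It writes a line
containing a backslash to a temporary file, reads it back, and compares
the result with two literal spellings. The file name and variable names
are made up for the demo:

```python
import os
import tempfile

# In source code, "\\" is an escape sequence for ONE backslash character.
literal = "this is \\some text"
# A raw string literal turns off escape processing, so one backslash is enough.
raw_literal = r"this is \some text"

# Write the text to a throwaway file and read it back.
path = os.path.join(tempfile.mkdtemp(), "test.txt")
with open(path, "w") as f:
    f.write(literal)
with open(path) as f:
    from_file = f.read()

print(from_file)               # this is \some text
print(literal == raw_literal)  # True: two spellings, one string
print(from_file == literal)    # True: reading the file changed nothing
print(from_file.count("\\"))   # 1: exactly one backslash in the data
```

The doubled backslash only ever exists in the source code. Once the
string is in memory, or in a file, there's just the one character.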

> Unfortunately, when I pass that to pypandoc, as if it were
> reStructuredText, I get the following:
>
> In [36]: f = open('test.txt', 'r')
>
> In [37]: s = f.read()
>
> In [38]: print s
> this is \some restructured text.
>
>
> In [39]: print pypandoc.convert(s, 'latex', format='rst')
> this is some restructured text.
>
> What happened to my backslash???

That's something you'll have to figure out with pypandoc. I don't know
how it interprets the backslash, so you'll have to dig into its
documentation. At least now, though, you can print out your string and
see that it really does have its backslash in it.
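
One guess worth testing: as noted at the top of the thread,
reStructuredText uses the backslash as its own escape character, so
the converter may simply be consuming it as an escape for the "s". If
that's what is happening, doubling every backslash in the data before
handing it over should let a single one survive. This is only a sketch
under that assumption; the helper name is made up, and the pypandoc
call itself is left out because I haven't run it:

```python
def escape_rst_backslashes(text):
    """Double each backslash so reST markup sees an escaped backslash.

    Hypothetical helper: assumes the consumer follows the usual reST
    rule that a backslash followed by a backslash is a literal backslash.
    """
    return text.replace("\\", "\\\\")


s = "this is \\some restructured text."  # one real backslash in the data
escaped = escape_rst_backslashes(s)

print(s)        # this is \some restructured text.
print(escaped)  # this is \\some restructured text.
```

Note that this is escaping for the benefit of the reST parser, not for
Python. The string passed along really does contain two backslash
characters.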

ChrisA


