rst and pypandoc

Dave Angel davea at davea.name
Mon Mar 2 09:08:17 EST 2015


On 03/02/2015 08:51 AM, alb wrote:
> Hi Steven,
>
> Steven D'Aprano <steve+comp.lang.python at pearwood.info> wrote:
> []
>> Since \r is an escape character, that will give you carriage return followed
>> by "ef{fig:abc".
>>
>> The solution to that is to either escape the backslash:
>>
>> i = '\\ref{fig:abc}'
>>
>>
>> or use a raw string:
>>
>> i = r'\\ref{fig:abc}'

Actually that'd be:
    i = r'\ref{fig:abc}'
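
A quick check of the three spellings, as a minimal sketch (the variable names here are mine, not from the thread):

```python
# Normal literal: the backslash has to be doubled.
escaped = '\\ref{fig:abc}'

# Raw literal: the backslash is written once.
raw = r'\ref{fig:abc}'

# Raw literal with two backslashes: both of them end up in the string.
doubled_raw = r'\\ref{fig:abc}'

print(escaped == raw)               # True
print(len(raw), len(doubled_raw))   # 13 14
```

So the raw string with a doubled backslash is one character longer than what was intended.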


>
> ok, maybe I wasn't clear from the very beginning, but searching for a
> solution is a journey that takes time and patience.
>
> The wrongly named variable i (as noted below), contains the *i*nput of
> my text which is supposed to be restructured text. The output is what
> pypandoc spits out after conversion:
>
> i = "\\begin{tag}{%s}{%s}\n %s\n \\end{tag}" % (some, restructured, text)
> o = pypandoc.convert(i, 'latex', format='rst')
>
> Now if i contains some inline text, i.e. text I do not want to convert
> in any other format, I need my text to be formatted accordingly in order
> to inject some escape symbols in i.
>
> Rst escapes with "\", but unfortunately python also uses "\" for escaping!

Only when the string is in a literal.  If you've read it from a file, or 
built it by combining other strings, or...  then the backslash is just 
another character to Python.
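
To illustrate, a small sketch; io.StringIO stands in for a real file here, and all the names are made up for the example:

```python
import io

BACKSLASH = chr(92)  # one backslash, obtained without writing any escape

# Built by combining strings: no escape processing happens here.
built = BACKSLASH + 'ref{fig:abc}'

# Read from a file-like object: the backslash comes back untouched.
from_file = io.StringIO(built).read()

# Only a literal in source code goes through escape processing.
from_literal = '\\ref{fig:abc}'   # \\ becomes one backslash
mangled = '\ref{fig:abc}'         # \r becomes a carriage return

print(built == from_file == from_literal)  # True
print(len(built), len(mangled))            # 13 12
print(ord(mangled[0]))                     # 13
```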

>
>>
>> Oh, by the way, "i" is normally a terrible variable name for a string. Not
>> only doesn't it explain what the variable is for, but there is a very
>> strong convention in programming circles (not just Python, but hundreds of
>> languages) that "i" is a generic variable name for an integer. Not a
>> string.
>
> I'm not in the position to argue about good practices, I simply found
> more appropriate to have i for input and o for output, considering they
> are used like this:
>
> i = "some string"
> o = pypandoc.convert(i, ...)
> f.write(o)
>
> with very little risk to cause misunderstanding.

How about "inp" and "out"?  ("in" itself is a reserved word in Python, 
so it can't be used as a name.)  Or perhaps some name that indicates what 
semantics the string represents, like   "rst_string"  and "html_string" 
or whatever they actually are?

>
>> Can you show what you are doing? Escaping the backslash with another
>> backslash does work:
>>
>> py> for c in '\\ref':
>> ...     print(c, ord(c))
>> ...
>> \ 92
>> r 114
>> e 101
>> f 102
>>
>> so either you are doing something wrong, or the error lies elsewhere.
>
> As said above, the string is converted by pandoc first and then printed.
> At this point the escaping becomes tricky (at least to me).
>
> In [17]: inp = '\\ref{fig:abc}'
>
> In [18]: print pypandoc.convert(inp, 'latex', format='rst')
> ref\{fig:abc\}
>

What did you expect/desire the pypandoc output to be?  Now that you don't 
have the embedded 0x0d (carriage return), is there something else that's 
wrong?

If it's in the internals of pypandoc, I'll probably be of no help.  But 
your first question was about escaping;  I'm not sure what it's about now.
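
For what it's worth, one guess: reStructuredText itself treats a backslash as an escape character, so a single backslash in the string is consumed by the rst reader, independently of anything Python does. The sketch below only simulates that one rule with a hypothetical helper (rst_unescape is my own name, not part of docutils or pypandoc, and it ignores the special case of backslash followed by whitespace). Whether pandoc then emits the surviving backslash as a raw LaTeX command is a separate question I can't vouch for.

```python
BACKSLASH = chr(92)

def rst_unescape(text):
    """Hypothetical stand-in for the rst rule: backslash + char -> char."""
    out = []
    chars = iter(text)
    for ch in chars:
        if ch == BACKSLASH:
            # Drop the backslash and keep the character it escapes.
            out.append(next(chars, ''))
        else:
            out.append(ch)
    return ''.join(out)

one = '\\ref{fig:abc}'     # the string holds ONE backslash
two = '\\\\ref{fig:abc}'   # the string holds TWO backslashes

print(rst_unescape(one))   # ref{fig:abc}
print(rst_unescape(two))   # \ref{fig:abc}
```

The first result matches the missing backslash in the output shown above, before the LaTeX writer escapes the braces.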

-- 
DaveA


