newbie raw text question

Ian Sparks Ian.Sparks at etrials.com
Tue Feb 4 09:44:26 EST 2003


Thanks for the reply, Dennis. Your breakdown of the meaning of the RTF codes is pretty much spot on. However, I'm still not "getting it". You say:

>>
What escaped characters? The \ is a tag introducer (for lack of a 
better word) and is part of the actual data. "\rtf1" is NOT <cr>tf1. 
<<

So here's a simple command-line test:

>>> print "\rtf1"

tf1
>>> print r"\rtf1"
\rtf1
>>>

Looks to me like \rtf1 *is* <cr>tf1 unless you define the string as a raw string, in which case it can contain the "\" character.

This is all very well for strings you define at the command line, but what if a variable "x" contains "\rtf1" (NOT a raw string)? How can you deal with it then?

>>> print x

tf1
>>> print rx   #attempt to turn x into a raw string for printing.
Traceback (most recent call last):
  File "<interactive input>", line 1, in ?
NameError: name 'rx' is not defined
>>> 

How can I print x as though it were a raw string? Like I said, it's probably pretty obvious, I just don't "get it".
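
For reference, here is a minimal sketch of the distinction in question, written for current Python 3 (where print is a function, unlike in the session above). The names literal, raw_literal and from_data are illustrative only. It assumes the usual behaviour that backslash escapes are interpreted only when Python reads a string literal in source code, so a string that arrives from a file or database keeps its backslash, and repr() shows what is actually stored.

```python
# Escape processing happens when Python reads a string *literal* in source code.
literal = "\rtf1"        # "\r" becomes a single carriage-return character
raw_literal = r"\rtf1"   # raw literal: the backslash and the "r" both survive

# Stand-in for data read from a database or file (hypothetical value): the
# backslash is built with chr(92), so no literal escaping is involved at all.
from_data = chr(92) + "rtf1"

print(len(literal))              # 4
print(len(raw_literal))          # 5
print(repr(literal))             # '\rtf1'
print(repr(from_data))           # '\\rtf1'
print(from_data == raw_literal)  # True
```

If this holds for the database string too, there is nothing to convert: "raw" describes how a literal is typed, not a different kind of string object, and print repr(x) in the interpreter shows whether x really contains a carriage return or a backslash.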




-----Original Message-----
From: Dennis Lee Bieber [mailto:wlfraed at ix.netcom.com]
Sent: Monday, February 03, 2003 11:33 PM
To: python-list at python.org
Subject: Re: newbie raw text question


Ian Sparks fed this fish to the penguins on Monday 03 February 2003 
12:11 pm:

> I'm confused about this one. I'm reading some RTF formatted data from
> a database. The resulting string is :
> 
> {\rtf1\ansi\ansicpg1252\deff0\deftab720{\fonttbl{\f0\fswiss MS Sans
> {Serif;}{\f1\froman\fcharset2 Symbol;}{\f2\fswiss Arial;}{\f3\fswiss
> {Arial;}} \colortbl\red0\green0\blue0;}
> \deflang1033\pard\plain\f3\fs16 Some text
> }
> 
> obviously this is chock-full of escaped characters. I need to strip
> the RTF codes and all my regular expressions are expecting raw strings
> but I don't see a way of converting an escaped string to a raw string
> to use in the regex.
>
        What escaped characters? The \ is a tag introducer (for lack of a 
better word) and is part of the actual data. "\rtf1" is NOT <cr>tf1. 
What I see in your sample (and I've not studied RTF) is:

RTF version 1 (hypothetical this)
ANSI
Codepage 1252
define font 0 (guessing) define tab 720 decipoints (1inch)(guessing, 
might be centipoints/0.1inch)
        font table
                font 0 "swiss" font (sans serif) is MS Sans Serif
                font 1 "roman" font (serif) is character set 2 Symbol
                font 2 "swiss" font is Arial
                font 3 "swiss" font is Arial
        color table
                red 0
                green 0
                blue 0
define language 1033
????
plain (not bold or italic)
use font 3
font size 16
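
The breakdown above can be reproduced mechanically. The following is a rough sketch in current Python 3, not a real RTF parser: it assumes a control word is a backslash, some lowercase letters, and an optional numeric parameter, and it ignores groups such as the font table. The names CONTROL_WORD, control_words and strip_control_words are made up for illustration; the two input lines are taken from the sample in the quoted question.

```python
import re

# A control word: backslash, lowercase letters, optional numeric parameter.
# The pattern is written as a raw literal so the regex engine receives "\\".
CONTROL_WORD = re.compile(r"\\([a-z]+)(-?\d+)?")

# Header and body lines copied from the sample in the quoted question.
header = r"{\rtf1\ansi\ansicpg1252\deff0\deftab720"
body = r"\deflang1033\pard\plain\f3\fs16 Some text"

def control_words(rtf):
    """Return (name, parameter) pairs; parameter is '' when absent."""
    return CONTROL_WORD.findall(rtf)

def strip_control_words(rtf):
    """Drop control words plus the single space that may terminate each one."""
    return re.sub(r"\\[a-z]+-?\d* ?", "", rtf)

print(control_words(header))
print(strip_control_words(body))  # Some text
```

Only the pattern needs to be a raw literal here. The data itself is just a string containing backslashes, however it was obtained.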

 
> There must be some way out of here...
> 
>  
> 

-- 
 > ============================================================== <
 >   wlfraed at ix.netcom.com  | Wulfraed  Dennis Lee Bieber  KD6MOG <
 >      wulfraed at dm.net     |       Bestiaria Support Staff       <
 > ============================================================== <
 >        Bestiaria Home Page: http://www.beastie.dm.net/         <
 >            Home Page: http://www.dm.net/~wulfraed/             <

-- 
http://mail.python.org/mailman/listinfo/python-list




