Embedded 'C' problem?

Bengt Richter bokr at accessone.com
Sat Jun 2 21:01:16 EDT 2001


On 2 Jun 2001 00:09:22 -0700, samschul at pacbell.net (Samuel
Schulenburg) wrote:

>Given the following string:
>EB70:|30 E8 8A 17  00 00 66 89  04 9F 83 C4  08 43 3B 5C   
>0.....f......C;\
>
>and using the following 'C' wsprintf() function to build ucPyString
>causes a print format error.
>
>wsprintf(ucPyString,"print \"\"\"%s\"\"\"",ucMsgStr);
>PyRun_SimpleString(ucPyString);
>
>ERROR:
>
>  File "<string>", line 1
>print """EB70:|30 E8 8A 17  00 00 66 89  04 9F 83 C4  08 43 3B 5C
>0.....f......C;\"""
>                                                                      
>                   ^
>SyntaxError: invalid token
>
>The problem is that the final "\" in the original string is interfering
>with the final triple quote, forming a \""" which is interpreted as an
>escape sequence.
>
Looks to me like you'd have a problem with """ embedded in ucMsgStr
too. Apparently what you want is a Python statement which, when
executed, will put exactly the original characters of ucMsgStr to
stdout. So the question is how to represent the string when you don't
know what it contains. Any escape or quote characters could
potentially interfere, and so could special (non-printable) characters.
OTTOMH, [no warranty, not checked!] I'd try something like the
following to escape double quotes and backslashes, and to hex-escape
anything non-printable:

#include <ctype.h>      /* isprint */
#include <string.h>
...
#define UCPYSTRINGMAX 256 /* placeholder: whatever your buffer size, min 40 for err msg */
unsigned int i = wsprintf( ucPyString,"print \"\"\"" );

for(const char *p= ucMsgStr; *p; ++p ){
    unsigned char c = (unsigned char)*p; // negative char breaks isprint and %02x
    if( i >= UCPYSTRINGMAX-4-3-1 ){ //4hex+3"'s+term
        // no closing quote here: the final wsprintf below adds all three
        i = wsprintf( ucPyString,"print \"\"\"Insufficient Buffer" );
        break;
    }
    if( !isprint(c) ){
        i += wsprintf( ucPyString+i,"\\x%02x",c ); // hex if weird
    } else {
        if( c == '\\' || c == '"' ){  // escape dquote and backslash
            i += wsprintf( ucPyString+i,"\\%c",c );    // escaped
        } else {
            i += wsprintf( ucPyString+i,"%c",c );      // plain
        }
    }
}
i += wsprintf( ucPyString+i, "\"\"\"" );    // indent and \n up to you
...

wsprintf is kind of overkill, but if you need to change to wide chars
maybe it's easier than leaner string routines.
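
For comparison, here is roughly what a leaner version without any
printf-style calls might look like. This is only a sketch, equally
unchecked against your real setup: the name build_print_stmt and its
signature are made up for illustration, and it tests for printable
ASCII by range rather than with isprint() so the result does not depend
on the locale.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the original code): writes
 *     print """<escaped src>"""
 * into dst. Returns the length written, or -1 if dst is too small. */
int build_print_stmt(const char *src, char *dst, size_t dstsize)
{
    static const char hex[]  = "0123456789abcdef";
    static const char head[] = "print \"\"\"";
    static const char tail[] = "\"\"\"";
    size_t i;

    assert(src != NULL && dst != NULL);
    if (dstsize < sizeof head + sizeof tail - 1)    /* head + tail + NUL */
        return -1;
    memcpy(dst, head, sizeof head - 1);
    i = sizeof head - 1;

    for (; *src; ++src) {
        unsigned char c = (unsigned char)*src;
        /* worst case per char is 4 bytes (\xNN), then tail and NUL */
        if (i + 4 + (sizeof tail - 1) + 1 > dstsize)
            return -1;
        if (c < 0x20 || c > 0x7e) {         /* not printable ASCII: hex */
            dst[i++] = '\\';
            dst[i++] = 'x';
            dst[i++] = hex[c >> 4];
            dst[i++] = hex[c & 0x0f];
        } else {
            if (c == '\\' || c == '"')      /* escape backslash, dquote */
                dst[i++] = '\\';
            dst[i++] = (char)c;
        }
    }
    memcpy(dst + i, tail, sizeof tail);     /* copies the NUL as well */
    i += sizeof tail - 1;
    return (int)i;
}
```

With this, a message ending in a backslash comes out ending in \\"""
rather than \""", and an embedded """ becomes \"\"\", so neither can
close the literal early. On -1 the caller decides what to report
instead of overwriting the buffer with an error message.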

Alternatively you could brute force escape every character in hex.
Either way MAKE SURE YOU HAVE THE SPACE for worst case expansion if
you don't know what's coming.
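
A rough sketch of the brute-force variant, again untested against your
environment and with a made-up function name (build_hex_print_stmt).
Since every byte becomes \xNN, the output size is known exactly up
front, and a plain double-quoted literal is enough because no quote or
backslash from the message survives unescaped.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'd statement
 *     print "\xNN\xNN..."
 * with every byte of src hex-escaped, or NULL if allocation fails.
 * The caller frees the result. */
char *build_hex_print_stmt(const char *src)
{
    static const char hex[]  = "0123456789abcdef";
    static const char head[] = "print \"";
    size_t n, i;
    char *dst, *q;

    assert(src != NULL);
    n = strlen(src);
    /* 4 bytes per input byte, plus head, closing quote and NUL.
     * Overflow of 4 * n is not checked in this sketch. */
    dst = (char *)malloc((sizeof head - 1) + 4 * n + 1 + 1);
    if (dst == NULL)
        return NULL;
    memcpy(dst, head, sizeof head - 1);
    q = dst + (sizeof head - 1);
    for (i = 0; i < n; ++i) {
        unsigned char c = (unsigned char)src[i];
        *q++ = '\\';
        *q++ = 'x';
        *q++ = hex[c >> 4];
        *q++ = hex[c & 0x0f];
    }
    *q++ = '"';
    *q = '\0';
    return dst;
}
```

The cost is a generated statement four times the size of the message
and unreadable if you ever have to debug it by eye.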

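The raw string format would still leave you with an escaping problem
somewhere, UIAM: a raw string literal cannot end in an odd number of
backslashes, and an embedded """ would still terminate it.
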
>My question is: how can I generate a format specifier so I can have a
>ucMsgStr that contains any printable characters, and does not interfere
>with the Python print function?
>
It's not the print function per se, it's the interpreter
reading the print statement source that you've generated, and
interpreting the string literal in it to make the string constant
that becomes the argument for print when it executes. It doesn't
like the generated string syntax. If you put "s=" in place of "print"
I would expect the same problem.



