How to convert a raw string r'\xdd' to '\xdd' more gracefully?

Jach Feng jfong at ms4.hinet.net
Fri Dec 9 21:06:54 EST 2022


Weatherby,Gerard wrote on Friday, December 9, 2022 at 9:36:18 PM [UTC+8]:
> That’s actually more of a shell question than a Python question. How you pass certain control characters is going to depend on the shell, operating system, and possibly the keyboard you’re using. (e.g. https://www.alt-codes.net). 
> 
> Here’s a sample program. The dashes are to help show the boundaries of the string 
> 
> #!/usr/bin/env python3 
> import argparse 
> import logging 
> 
> 
> parser = argparse.ArgumentParser(formatter_class=argparse.ArgumentDefaultsHelpFormatter) 
> parser.add_argument('data') 
> args = parser.parse_args() 
> print(f'Input\n: -{args.data}- length {len(args.data)}') 
> for c in args.data: 
>     print(f'{ord(c)} ', end='')
> print() 
> 
> 
> Using bash on Linux: 
> 
> ./cl.py '^M 
> ' 
> Input 
> - 
> - length 3 
> 13 32 10
> From: Python-list <python-list-bounces+gweatherby=uchc... at python.org> on behalf of Jach Feng <jf... at ms4.hinet.net> 
> Date: Thursday, December 8, 2022 at 9:31 PM 
> To: pytho... at python.org <pytho... at python.org> 
> Subject: Re: How to convert a raw string r'\xdd' to '\xdd' more gracefully?
> Jach Feng wrote on Wednesday, December 7, 2022 at 10:23:20 AM [UTC+8]:
> > s0 = r'\x0a' 
> > At the moment it is done by
> >
> > import re
> >
> > def to1byte(matchobj):
> >     return chr(int('0x' + matchobj.group(1), 16))
> > s1 = re.sub(r'\\x([0-9a-fA-F]{2})', to1byte, s0)
> > 
> > But is it really that difficult to do such a simple thing?
> > 
> > --Jach 
> The whole story is:
>
> I had a script which accepts an argparse positional argument. I would like this argument to be able to contain embedded control characters when required. So I made a post, "How to enter escape character in a positional string argument from the command line?", on Dec 05. But there was no response. I assumed that there is no way of doing it and that I have to convert the string after I get it from the command line.
>
> I made this conversion using the chr(int(...)) method but was not satisfied with it. That is why this post came out.
>
> At the moment the conversion is done in almost the same way as Peter's codecs.decode() method, but without needing to import the codecs module :-)
>
> def to1byte(matchobj):
>     return matchobj.group(0).encode().decode("unicode-escape")

> That’s actually more of a shell question than a Python question. How you pass certain control characters is going to depend on the shell, operating system, and possibly the keyboard you’re using. (e.g. https://www.alt-codes.net).

You are right; that's why I later found it easier to enter the string using a preferred pattern. But there is a case, as moi mentioned in his previous post, that will cause a failure: when a Windows path containing something of the form \xdd just happens to appear in the string :-(
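
For illustration, here is a minimal sketch of the regex-plus-unicode-escape conversion quoted above, together with the Windows path case. The names HEX_ESCAPE and unescape_hex and both sample strings are made up for this example; they are not from the actual script.

```python
import re

# Same pattern as in the quoted code: a literal backslash, 'x', two hex digits.
HEX_ESCAPE = re.compile(r'\\x([0-9a-fA-F]{2})')

def to1byte(matchobj):
    # Decode just the matched 4-character escape, e.g. r'\x0a' becomes a newline.
    return matchobj.group(0).encode().decode("unicode-escape")

def unescape_hex(text):
    # Hypothetical wrapper: convert every literal hex escape found in text.
    return HEX_ESCAPE.sub(to1byte, text)

# Intended use: a control character typed on the command line as an escape.
converted = unescape_hex(r'line1\x0aline2')
print(repr(converted))

# The failure case: a Windows path that happens to contain such a sequence.
# The directory name 'x64' after a backslash is taken for an escape and
# is replaced by the single character 'd'.
path = unescape_hex(r'C:\data\x64\tool.exe')
print(repr(path))
```

Only sequences matched by the pattern are touched, so the \t in the path survives, but the \x64 does not. The conversion alone cannot tell an intended escape from a path component.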

