[Tutor] clean text

Emile van Sebille emile at fenx.com
Tue May 19 20:40:30 CEST 2009


On 5/19/2009 11:22 AM spir said...
<snip>
> I thought of this solution (having a dict for all chars). But I cannot use it because later I will extend the app to cope with unicode (~100,000 chars). So I really need to filter which chars have to be converted.

That looks like premature optimization.  Dict lookups are hash-based, so 
their cost barely depends on the number of entries -- I don't imagine 
100k+ entries will slow it down, but that's exactly what you should test 
so you'll know.
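One way to run that test is sketched below. It is written for Python 3 (the session further down is Python 2), and the names build_table and time_lookups, as well as the choice of what each char maps to, are made up for illustration rather than taken from the original app.

```python
import timeit

def build_table(size):
    # Hypothetical conversion table: each char maps to its repr() minus the quotes.
    return {chr(cp): repr(chr(cp))[1:-1] for cp in range(size)}

def time_lookups(table, key, number=100_000):
    # Total seconds spent on `number` lookups of `key` in `table`.
    return timeit.timeit(lambda: table[key], number=number)

small = build_table(256)
large = build_table(100_000)

if __name__ == "__main__":
    print("256 entries:    ", time_lookups(small, "a"))
    print("100,000 entries:", time_lookups(large, "a"))
```

If the two timings come out close, the size of the dict is not the bottleneck and the filtering step may not be needed.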

> It would help, I guess, to have a builtin func that returns the conventional char/string repr without the surrounding "'...'".

Like this?

 >>> print repr(''.join(chr(ii) for ii in range(20,40)))
'\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\''
 >>>
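If the surrounding quotes really have to go, here are two possible sketches, again in Python 3 syntax. The function names bare_repr and escaped are hypothetical. The first simply slices the quotes off repr(); the second goes through the standard unicode_escape codec, which never adds quotes in the first place.

```python
def bare_repr(text):
    # repr() always wraps the result in one quote char on each side; drop both.
    # Caveat: a quote inside the text may still come back backslash-escaped.
    return repr(text)[1:-1]

def escaped(text):
    # unicode_escape turns control and non-ASCII chars into backslash escapes
    # and leaves quote characters alone.
    return text.encode("unicode_escape").decode("ascii")

if __name__ == "__main__":
    sample = "".join(chr(ii) for ii in range(20, 40))
    print(bare_repr(sample))
    print(escaped(sample))
```

The two differ in the details: bare_repr keeps whatever quote escaping repr() chose, while escaped also escapes every non-ASCII character, so which one fits depends on what the output is for.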

Emile


