Sub-classing unicode: getting the unicode value
Torsten Bronger
bronger at physik.rwth-aachen.de
Sun Dec 30 17:45:39 EST 2007
Hello!
John Machin writes:
> On Dec 31, 8:08 am, Torsten Bronger <bron... at physik.rwth-aachen.de>
> wrote:
>
>> [...]
>>
>> But then it is not unicode but Excerpt which I don't want. The
>> idea is to buffer the unicode representation in order to gain
>> efficiency. Otherwise, a lot of unicode conversion would take
>> place.
>
> I'm confused: you are subclassing unicode, and want to use several
> unicode methods, but the object's value appears to be an 8-bit
> string!?!?
No, it is a unicode. I never said anything else ...
> Care to divulge the code for your __init__ method?
It has no __init__, only a __new__. But I doubt that it is of much
use here:
    def __new__(cls, excerpt, mode, url=None,
                pre_substitutions=None, post_substitutions=None):
        if mode == "NONE":
            instance = unicode.__new__(cls, excerpt)
        elif mode == "PRE":
            preprocessed_text, original_positions, escaped_positions = \
                cls.apply_pre_input_method(excerpt, url, pre_substitutions)
            instance = unicode.__new__(cls, preprocessed_text)
            instance.original_text = unicode(excerpt)
            instance.original_positions = original_positions
            instance.escaped_positions = escaped_positions
            instance.post_substitutions = post_substitutions
        elif mode == "POST":
            postprocessed_text, original_positions, escaped_positions = \
                cls.apply_post_input_method(excerpt)
            instance = unicode.__new__(cls, postprocessed_text)
            instance.original_positions = original_positions
            instance.escaped_positions = escaped_positions
            instance.original_text = excerpt.original_text
            instance.post_substitutions = post_substitutions
        instance.__escaped_text = None
        instance.__unicode = None
        return instance
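For readers who want to try the pattern, here is a minimal, self-contained sketch of it. It is not the real class: the mode "UPPER", the attribute `_plain`, and the method `plain()` are hypothetical names invented for illustration, and the `text_type` shim is only there so the sketch also runs on Python 3, where `str` takes the role of `unicode`. It relies on the built-in constructor returning an exact base-type object for a subclass instance, which holds as long as the subclass does not override `__unicode__` (Python 2) or `__str__` (Python 3).

```python
try:
    text_type = unicode  # Python 2, as in the post
except NameError:
    text_type = str      # Python 3: str is the unicode type


class Excerpt(text_type):
    """Hypothetical, stripped-down stand-in for the class in the post."""

    def __new__(cls, excerpt, mode="NONE"):
        # Immutable types take their value in __new__, not __init__.
        if mode == "NONE":
            instance = text_type.__new__(cls, excerpt)
        elif mode == "UPPER":
            # Invented mode standing in for the real pre/post-processing.
            instance = text_type.__new__(cls, excerpt.upper())
        else:
            raise ValueError("unknown mode: %r" % (mode,))
        # Subclass instances have a __dict__, so extra attributes work.
        instance.original_text = text_type(excerpt)
        instance._plain = None  # cache slot for the plain value
        return instance

    def plain(self):
        """Return the value as an exact text_type object, cached."""
        if self._plain is None:
            # The constructor copies the data into a true base-type
            # object because this class overrides no string conversion.
            self._plain = text_type(self)
        return self._plain


e = Excerpt(u"abc", mode="UPPER")
p = e.plain()
```

Here `e` is still an `Excerpt`, while `p` is a plain base-type string with the same contents, and repeated calls to `e.plain()` return the same cached object. One detail worth knowing when reading the original code: an assignment such as `instance.__unicode = None` written inside a class body is name-mangled by Python, so the attribute is actually stored under the class-prefixed name (`_Excerpt__unicode` if the class is called `Excerpt`).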
Bye,
Torsten.
--
Torsten Bronger, aquisgrana, europa vetus
Jabber ID: bronger at jabber.org
(See http://ime.webhop.org for further contact info.)