[Python-Dev] Decoding incomplete unicode

Walter Dörwald walter at livinglogic.de
Thu Aug 26 22:13:17 CEST 2004


Martin v. Löwis wrote:

> M.-A. Lemburg wrote:
> 
>> Martin, there are two reasons for hiding away these details:
>>
>> 1. we need to be able to change the codec state without
>>    breaking the APIs
>
> That will be possible with the currently-proposed patch.
> The _codecs methods are not public API, so changing them
> would not be an API change.

Exactly.

>> 2. we don't want the state to be altered by the user
> 
> We are all consenting adults, and we can't *really*
> prevent it, anyway. For example, the user may pass an
> old state, or a state originating from a different codec
> (instance). We need to support this gracefully (i.e. with
> a proper Python exception).

The state communicated by the UTF-7 decoder is just a bunch
of values. Their types are checked via PyArg_ParseTuple(),
which raises a proper Python exception on a mismatch.
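
To illustrate the idea of passing the state around as a plain tuple and
rejecting a bad one with an ordinary exception, here is a minimal sketch
in Python. It is not the patch under discussion: decode_stateful() is a
hypothetical helper, it uses UTF-8 rather than UTF-7 to keep the state
trivial (just the leftover bytes), and the isinstance() checks only stand
in for what PyArg_ParseTuple() would do on the C side.

```python
import codecs

def decode_stateful(data, state=(b"",)):
    """Hypothetical helper: decode UTF-8, carrying leftover bytes in a tuple."""
    # Validate the caller-supplied state, much as PyArg_ParseTuple() would in C.
    if not (isinstance(state, tuple) and len(state) == 1
            and isinstance(state[0], bytes)):
        raise TypeError("state must be a 1-tuple holding a bytes object")
    buf = state[0] + data
    # final=False: an incomplete trailing sequence is left unconsumed
    # instead of raising UnicodeDecodeError.
    text, consumed = codecs.utf_8_decode(buf, "strict", False)
    return text, (buf[consumed:],)

# The two bytes of one character are split across two chunks.
text1, state = decode_stateful(b"D\xc3")
text2, state = decode_stateful(b"\xb6rwald", state)
print(text1 + text2)
```

A state of the wrong type, or a tuple of the wrong shape, fails with
TypeError before any decoding happens.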

>> A single object serves this best and does not create
>> a whole plethora of new APIs in the _codecs module.
>> This is not over-design, but serves a reason.
> 
> It does put a burden on codec developers, who need
> to match the "official" state representation policy.
> Of course, if they are allowed to return a tuple
> representing their state, that would be fine with
> me.

Looking at the UTF-7 decoder this seems to be the
simplest option.
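
For comparison, a short sketch of how a tuple-valued decoder state looks
from the Python side. It uses the getstate()/setstate() methods of the
incremental decoders in present-day Python 3, an interface that is not
part of the patch discussed in this thread; the chunk boundaries are
chosen arbitrarily for illustration.

```python
import codecs

# "+APY-" is the UTF-7 encoding of U+00F6; feed it in two pieces.
decoder = codecs.getincrementaldecoder("utf-7")()
first = decoder.decode(b"+AP")

# The pending state comes back as an ordinary tuple.
saved = decoder.getstate()

# Hand that tuple to a fresh decoder and finish decoding there.
other = codecs.getincrementaldecoder("utf-7")()
other.setstate(saved)
rest = other.decode(b"Y-", final=True)

result = first + rest
print(result)
```

Because the state is a plain tuple, it can be stored and handed to
another decoder instance of the same codec without a dedicated state type.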

Bye,
    Walter Dörwald



