[Python-Dev] [Python-3000] Betas today - I hope

Walter Dörwald walter at livinglogic.de
Fri Jun 13 11:32:21 CEST 2008


M.-A. Lemburg wrote:
> On 2008-06-12 16:59, Walter Dörwald wrote:
>> M.-A. Lemburg wrote:
>>> .transform() and .untransform() use the codecs to apply same-type
>>> conversions. They do apply type checks to make sure that the
>>> codec does indeed return the same type.
>>>
>>> E.g. text.transform('xml-escape') or data.transform('base64').
>>
>> So what would a base64 codec do with the errors argument?
> 
> It could use it to e.g. try to recover as much data as possible
> from broken input data.
> 
> Currently (in Py2.x), it raises an exception if you pass in anything
> but "strict".
> 
>>>> I think for transformations we don't need the full codec machinery:
>>>  > ...
>>>
>>> No need to invent another wheel :-) The codecs already exist for
>>> Py2.x and can be used by the .encode()/.decode() methods in Py2.x
>>> (where no type checks occur).
>>
>> By using a new API we could get rid of old warts. For example: Why 
>> does the stateless encoder/decoder return how many input 
>> characters/bytes it has consumed? It must consume *all* bytes anyway!
> 
> No, it doesn't and that's the point in having those return values :-)
> 
> Even though the encoder/decoders are stateless, that doesn't mean
> they have to consume all input data. The caller is responsible to
> make sure that all input data was in fact consumed.
> 
> You could for example have a decoder that stops decoding after
> having seen a block end indicator, e.g. a base64 line end or
> XML closing element.

So how should the UTF-8 decoder know that it has to stop at a closing 
XML element?
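
To make the return value under discussion concrete, here is a minimal sketch, assuming Python 3's standard codecs module. The sample input is purely illustrative. It shows that the stateless UTF-8 decoder reports the number of bytes consumed and decodes straight past a closing XML element:

```python
import codecs

# Look up the stateless UTF-8 decoder; it returns (output, consumed).
decode = codecs.getdecoder("utf-8")

# Illustrative input only: some text, a closing element, then more data.
data = "Dörwald </doc> trailing".encode("utf-8")
output, consumed = decode(data)

# The decoder knows nothing about XML structure, so it consumes every
# byte of the input; "consumed" counts input bytes, not characters.
print(output, consumed, len(data))
```
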

> Just because all codecs that ship with Python always try to decode
> the complete input doesn't mean that the feature isn't being used.

I know of no other code that does. Do you have an example of this use?

> The interface was designed to allow for the above situations.

Then could we at least have a new codec method that does:

def statelessencode(self, input):
    (output, consumed) = self.encode(input)
    assert len(input) == consumed
    return output
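
A standalone version of the same idea can be sketched as follows, assuming Python 3 and the standard codecs module. The function name stateless_encode and its parameters are hypothetical, not an existing codec API. It raises an explicit exception rather than using assert, since assert statements are skipped when Python runs with -O:

```python
import codecs

def stateless_encode(input, encoding="utf-8", errors="strict"):
    """Encode all of `input` or fail (hypothetical helper, not a codecs API)."""
    # codecs.getencoder returns the codec's stateless encoder function,
    # which yields an (output, consumed) pair.
    encoder = codecs.getencoder(encoding)
    output, consumed = encoder(input, errors)
    # Check that the codec really consumed the complete input.
    if consumed != len(input):
        raise ValueError(
            "codec %r consumed %d of %d input items"
            % (encoding, consumed, len(input))
        )
    return output

print(stateless_encode("Dörwald"))
```
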

Servus,
    Walter


