[Python-Dev] Internal representation of strings and Micropython

Mark Lawrence breamoreboy at yahoo.co.uk
Fri Jun 6 15:30:18 CEST 2014


On 06/06/2014 09:53, Hrvoje Niksic wrote:
> On 06/04/2014 05:52 PM, Mark Lawrence wrote:
>> On 04/06/2014 16:32, Steve Dower wrote:
>>>
>>> If copying into a separate list is a problem (memory-wise),
>>> re.finditer('\\S+', string) also provides the same behaviour and
>>> gives me the sliced string, so there's no need to index for anything.
>>>
>>
>> Out of idle curiosity, is there anything that stops MicroPython, or any
>> other implementation for that matter, from providing views of a string
>> rather than copying every time?  IIRC memoryviews in CPython rely on the
>> buffer protocol at the C API level, so since strings don't support this
>> protocol you can't take a memoryview of them.  Could this actually be
>> implemented in the future, is the underlying C code just too
>> complicated, or what?
>>
>
> Memory views of Unicode strings are controversial for two reasons:
>
> 1. It exposes the internal representation of the string. If memoryviews
> of strings were supported in Python 3, PEP 393 would not have been
> possible (without breaking that feature).
>
> 2. Even if it were OK to expose the internal representation, it might
> not be what the users expect. For example, memoryview("Hrvoje") would
> return a view of a 6-byte buffer, while memoryview("Nikšić") would
> return a view of a 12-byte UCS-2 buffer. The user of a memory view might
> expect to get UCS-2 (or UCS-4, or even UTF-8) in all cases.
>
> An implementation that decided to export strings as memory views might
> be forced to make a decision about internal representation of strings,
> and then stick to it.
>
> Bytes objects don't have these issues, which is why in Python 2.7
> memoryview("foo") works just fine, as does memoryview(b"foo") in Python 3.
>

Thanks for the explanation :)
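
For anyone following along, here is a minimal sketch of the points above as
I understand them. It assumes Python 3.3 or later; the variable names are
mine and purely illustrative. Note that encode() makes a copy, so this only
shows how picking the encoding yourself pins down the buffer layout. It is
not a zero-copy view of the str.

```python
# Non-ASCII letters are written as escapes so the source stays ASCII-only.
plain = "Hrvoje"
accented = "Nik\u0161i\u0107"   # "Nikšić"

# bytes export the buffer protocol, so a view (and slices of it) copy nothing.
view = memoryview(b"foo")
head = view[:2]
print(bytes(head))              # b'fo'

# str exports no buffer in Python 3, so memoryview() rejects it.
try:
    memoryview(plain)
    str_view_supported = True
except TypeError:
    str_view_supported = False
print(str_view_supported)       # False

# Choosing the encoding explicitly makes the buffer layout predictable.
latin1_view = memoryview(plain.encode("latin-1"))
utf16_view = memoryview(accented.encode("utf-16-le"))
utf8_view = memoryview(accented.encode("utf-8"))
print(latin1_view.nbytes, utf16_view.nbytes, utf8_view.nbytes)  # 6 12 8
```

The 6 and 12 match the buffer sizes Hrvoje quotes for the two names, and the
8 shows what a user expecting UTF-8 would see instead.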

-- 
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.

Mark Lawrence
