[IPython-dev] Buffers

Brian Granger ellisonbg at gmail.com
Tue Jul 27 15:23:37 EDT 2010


On Tue, Jul 27, 2010 at 11:34 AM, Fernando Perez <fperez.net at gmail.com> wrote:

> On Tue, Jul 27, 2010 at 11:14 AM, Brian Granger <ellisonbg at gmail.com>
> wrote:
> >
> > Yes, I hadn't thought about the fact that unicode objects are buffers as
> > well.  But, we could raise a TypeError when a user tries to send a unicode
> > object (str in python 3).  IOW, don't treat unicode as buffers and force
> > them to encode/decode.  Does this make sense or should we allow unicode to
> > be sent as buffers?
>
> Well, the problem I explained about a possible mismatch in internal
> unicode storage format rears its ugly head if we allow
> unicode-as-buffer.  I was precisely worried about sending 3.x strings
> as buffers, since the two ends may not agree on what the buffer means.
> I may be worrying about a non-problem, but at some point it might be
> worth verifying this.  The test is a bit cumbersome to set up, because
> you have to build two versions of Python, one with ucs-2 and one with
> ucs-4, and see what happens if they try to send each other stuff.  But
> I think it's a test worth making, so we know for sure whether this is
> a problem or not, as it will dictate design decisions for 3.x on all
> string handling.
>
>
This is definitely an issue.  Also, someone could set their own custom
unicode encoding by hand and that would mess this up as well.
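
To make the mismatch concrete, here is a small sketch. It has not been run against real ucs-2/ucs-4 builds; as an assumption, it uses the little-endian utf-16 and utf-32 codecs as stand-ins for the two internal storage widths. The helper name `raw_bytes` is hypothetical. It shows that the same text yields different raw bytes, so a receiver assuming the other width would not recover the text.

```python
def raw_bytes(text, width):
    """Approximate the in-memory buffer of `text` for a code unit width.

    Assumption: utf-16-le / utf-32-le stand in for ucs-2 / ucs-4 storage.
    """
    codec = {2: "utf-16-le", 4: "utf-32-le"}[width]
    return text.encode(codec)


text = "h\u00e9llo"
narrow = raw_bytes(text, 2)  # 2 bytes per character for this text
wide = raw_bytes(text, 4)    # 4 bytes per character

# Same text, different buffers.
print(len(narrow), len(wide))

# A receiver that assumes the wrong width does not get the text back.
misread = narrow.decode("utf-32-le", errors="replace")
print(misread == text)
```

The real test would still need the two interpreter builds described above; this only illustrates why the raw buffer alone is ambiguous.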


> If it is a problem, then there are some options:
>
> - disallow communication between ucs 2/4 pythons.
>

But this doesn't account for other encoding/decoding setups.


> - detect a mismatch and encode/decode all unicode strings to utf-8 on
> send/receive, but allow raw buffer sending if there's no mismatch.
>

This will be tough though if users set their own encoding.


> - *always* encode/decode.
>
>
I think this is the option that I prefer (having users do this in their
application code).
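
A minimal sketch of what this could look like, assuming Python 3 and utf-8 as the agreed wire encoding. The names `send_buffer`, `to_wire` and `from_wire` are hypothetical and not part of any existing IPython API. The send path refuses unicode outright, and the application encodes and decodes explicitly.

```python
def send_buffer(obj):
    """Hypothetical send path: accept buffers, refuse unicode (str)."""
    if isinstance(obj, str):
        raise TypeError(
            "unicode objects must be encoded to bytes before sending"
        )
    # memoryview accepts any object that exposes the buffer interface.
    return bytes(memoryview(obj))


def to_wire(text, encoding="utf-8"):
    """Application-side step: encode text before handing it over."""
    return text.encode(encoding)


def from_wire(data, encoding="utf-8"):
    """Application-side step: decode received bytes back into text."""
    return bytes(data).decode(encoding)


payload = send_buffer(to_wire("h\u00e9llo"))
roundtrip = from_wire(payload)
```

Because both ends name the encoding, the result does not depend on how either interpreter stores unicode internally.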


> The middle option seems appealing because it avoids the overhead of
> encoding/decoding on all sends, but I'm worried it may be too brittle.
>
>
Brian


> Cheers,
>
>
> f
>



-- 
Brian E. Granger, Ph.D.
Assistant Professor of Physics
Cal Poly State University, San Luis Obispo
bgranger at calpoly.edu
ellisonbg at gmail.com