[Python-3000] iostack and sock2

tomer filiba tomerfiliba at gmail.com
Mon Jun 5 19:16:40 CEST 2006


> > well, it's too hard to design for a nonexisting module. select is all there
> > is that's platform independent.
>
> It is /relatively/ platform independent.

if it runs on windows, linux, *bsd, solaris, it's virtually platform
independent.
i don't consider the nokia N60 or whatever the name was, as well as other
esoteric environments, as "platforms", at least not such that should be taken
into consideration when designing APIs and standard modules.

> I didn't know why at first, but I've figured
> out that it is a combination of "I enjoy writing wire protocols" and "it
> would be very nice if my old socket/file software continued to work in
> py3k".
[...]
> Rather than changing what people expect with the current .read() method,
> why not offer a different method called .readexact(n), which will read
> exactly n bytes, performing buffering as necessary.

okay, i give up on read(n) returning n bytes. that being said, and taking into
account the "helpers" i suggested (a function named file/open that is
API-compliant to today's file) -- i'd assume 80% of the code would be
compatible.
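
for the record, a rough sketch of what a readexact(n) helper on top of a plain read() could look like. the Trickle class and the choice to raise EOFError on a short stream are assumptions made for illustration, not part of iostack:

```python
class Trickle:
    """Hypothetical stream that never returns more than 3 bytes per read()."""

    def __init__(self, data):
        self._data = data

    def read(self, count):
        chunk = self._data[:min(count, 3)]
        self._data = self._data[len(chunk):]
        return chunk


def readexact(stream, n):
    """Read exactly n bytes, looping over short reads.

    Raises EOFError if the stream ends first (an assumed policy).
    """
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            raise EOFError("stream ended with %d bytes missing" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    # join once at the end instead of concatenating inside the loop
    return b"".join(chunks)
```

read() keeps its current "up to n bytes" meaning here, and only the helper does the looping.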

after all, the major use-cases of IO are files and sockets. if we keep
those looking the same, at least the core APIs, most code should
work fine.

again, don't forget sock2 is separate from iostack, and can be used by
itself. it has send/recv like normal sockets, is select()able, etc... the
only adaptation needed for legacy code is converting "import socket"
to "import sock2" (which would be unnecessary if it became the
standard socket module), as well as converting
s = socket.socket()
s.connect(...)
to
s = socket.TcpSocket(...)

grepping through the source can pinpoint these locations.
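
a minimal sketch of the "connect on construction" idea, written as a thin wrapper over today's socket module. this is not sock2's code: the (host, port) constructor arguments and the method set are assumptions for illustration only:

```python
import socket


class TcpSocket:
    """Hypothetical stand-in: a TCP socket that connects when constructed."""

    def __init__(self, host, port):
        # replaces the two-step socket.socket() + connect(...) dance
        self._sock = socket.create_connection((host, port))

    def send(self, data):
        return self._sock.send(data)

    def recv(self, count):
        return self._sock.recv(count)

    def fileno(self):
        # exposing fileno() is what keeps the object select()able
        return self._sock.fileno()

    def close(self):
        self._sock.close()
```

legacy code that only calls send/recv/close on the object would not notice the difference.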

> > random idea:
> > when compiled with universal line support, python unicode should
> > equate "\n" to any of the aforementioned characters.
> > i.e.
> >
> > u"\n" == u"\u2028" # True
>
> I'm glad that you later decided for yourself that such a thing would be
> utterly and completely foolish.

it's not foolish, it's bad. these are different things (foolish being "lacking
a proper rationale", and bad being "destroying the very foundations of
python"). but again, it was kept "for the record".

> > f.position = -10
> > raises a ValueError, which is logical
>
> Raising a ValueError on an unseekable stream would be confusing.

true, but so are TypeErrors for ArgumentErrors, or TypeErrors for HashErrors,
etc. besides, why shouldn't attributes raise IOError? after all you are working
with *IO*, so "s.position = -10" raising an IOError isn't all too strange.
anyway, that's a technicality, and the rest of the framework can survive
deferring that decision.
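
one possible way to split the two cases, as a sketch only. the PositionedStream wrapper and the NoSeek class are hypothetical names, and the policy (IOError for an unseekable stream, ValueError for a bad value) is just one of the options being argued about here:

```python
import io


class NoSeek:
    """Hypothetical unseekable stream, e.g. a pipe or a socket."""

    def seekable(self):
        return False


class PositionedStream:
    """Hypothetical wrapper exposing seek()/tell() as a 'position' attribute."""

    def __init__(self, raw):
        self._raw = raw

    @property
    def position(self):
        return self._raw.tell()

    @position.setter
    def position(self, n):
        if not self._raw.seekable():
            # the stream cannot seek at all: an IO problem, not a bad value
            raise IOError("stream is not seekable")
        if n < 0:
            # the stream can seek, but the requested value makes no sense
            raise ValueError("invalid position value", n)
        self._raw.seek(n)


f = PositionedStream(io.BytesIO(b"0123456789"))
f.position = 4
f.position += 3   # reads the property, then assigns the sum back
```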

> > class NetworkStream(InputStream, OutputStream):
> >    ...
> >
> > which version of close() gets called?
>
> Both, you use super().

if an InputStream and OutputStream are just interfaces, that's fine,
but still, i don't find it acceptable for one method to be defined by
two interfaces, and then have it intersected in a deriving class.

perhaps the hierarchy should be

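class Stream:
    def close(self): ...
    @property
    def closed(self): ...
    def seek(self, n): ...
    def tell(self): ...

class InputStream(Stream):
    def read(self, count): ...
    def readexact(self, count): ...
    def readall(self): ...

class OutputStream(Stream):
    def write(self, data): ...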

but then, most of the streams, like files, pipes and sockets,
would need to derive from both InputStream and OutputStream.
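
for reference, this is roughly how the "use super()" answer plays out with a common base class. the log list exists only to show the call order, and the comments about what each close() would do are assumptions:

```python
class Stream:
    def __init__(self):
        self.closed = False
        self.log = []   # records which close() implementations ran

    def close(self):
        self.log.append("Stream")
        self.closed = True


class InputStream(Stream):
    def close(self):
        self.log.append("InputStream")    # e.g. drop the read buffer
        super().close()


class OutputStream(Stream):
    def close(self):
        self.log.append("OutputStream")   # e.g. flush pending writes
        super().close()


class NetworkStream(InputStream, OutputStream):
    def close(self):
        self.log.append("NetworkStream")
        super().close()   # follows the MRO, so each close() runs once


s = NetworkStream()
s.close()
```

each close() runs exactly once, in method resolution order, but only as long as every class in the chain remembers to call super().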

another issue:

class InputFile(InputStream):
    ...
class OutputFile(OutputStream):
    ...
class File(InputStream, OutputStream):
    ...

i think there's gonna be much duplication of code, because File can't
inherit from InputFile and OutputFile, as they are each a separate stream,
while File is a single InOutStream.

and a huge class hierarchy makes attribute lookups slower.


-tomer

On 6/4/06, Josiah Carlson <jcarlson at uci.edu> wrote:
>
> "tomer filiba" <tomerfiliba at gmail.com> wrote:
> [snip]
> > > - interaction with (replacement of?) the select module
> >
> > well, it's too hard to design for a nonexisting module. select is all there
> > is that's platform independent.
>
> It is /relatively/ platform independent.
>
> > random idea:
> > * select is virtually platform independent
> > * improved polling is inconsistent
> >     * kqueue is BSD-only
> >     * epoll is linux-only
> >     * windows has none of those
>
> Windows doesn't currently have a module designed to do this kind of
> thing, but it is possible to have a higher-performance method for
> Windows using various bits from the win32file module from pywin32 (I
> have been contemplating writing one, but I haven't had the time).
>
> [snip]
>
> > - - - - -
> >
> > > e.g an alternative approach would be to
> > > define InputStream and OutputStream, and then have an IOStream that inherited
> > > from both of them).
> >
> > hrrm... i need to think about this more. one problem i already see:
> >
> > class InputStream:
> >    def close(self): ...
> >    def read(self, count): ...
> >
> > class OutputStream:
> >    def close(self): ...
> >    def write(self, data): ...
> >
> > class NetworkStream(InputStream, OutputStream):
> >    ...
> >
> > which version of close() gets called?
>
> Both, you use super().
>
> > - - - - -
> >
> > > e.g. the 'position' property is
> > > probably a bad idea, because x.position may then raise an IOError
> >
> > i guess it's a reasonable approach, but i'm a "usability beats purity" guy.
> > f.position = 0
> > or
> > f.position += 10
> >
> > is so much more convenient than seek()ing and tell()ing. we can also
> > optimize += by defining a Position type where __iadd__(n) uses
> > seek(n, "curr") instead of seek(n + tell(), "start")
> >
> > btw, you can first test the "seekable" attribute, to see if positioning
> > would work.
> >
> > and in the worst case, i'd vote for converting IOErrors to ValueErrors...
> >
> > def _set_pos(self, n):
> >     try:
> >         self.seek(n)
> >     except IOError:
> >         raise ValueError("invalid position value", n)
> >
> > so that
> > f.position = -10
> > raises a ValueError, which is logical
>
> Raising a ValueError on an unseekable stream would be confusing.
>
> [snip]
> > - - - - -
> >
> > random idea:
> > when compiled with universal line support, python unicode should
> > equate "\n" to any of the aforementioned characters.
> > i.e.
> >
> > u"\n" == u"\u2028" # True
>
> I'm glad that you later decided for yourself that such a thing would be
> utterly and completely foolish.
>
> > - - - - -
> >
> > > I can see that behaviour being seriously annoying when you get to the end of
> > > the stream. I'd far prefer for the stream to just give me the last bit when I
> > > ask for it and then tell me *next* time that there isn't anything left.
> >
> > well, today it's done like so:
> >
> > while True:
> >    x = f.read(100)
> >    if not x:
> >       break
> >
> > in iostack, that would be done like so:
> >
> > try:
> >     while True:
> >         x = f.read(100)
> > except EOFError:
> >     last_x = f.readall() # read all the leftovers (0 <= leftovers < 100)
> >
> > a little longer, but not illogical
> >
> > > If you want a method with the other behaviour, add a "readexact" API, rather
> > > than changing the semantics of "read" (although I'd be really curious to hear
> > > the use case for the other behaviour).
> >
> > well, when i work with files/sockets, i tend to send data structures over them,
> > like records, frames, protocols, etc. if a record is said to be x bytes long,
> > and read(x) returns less than x bytes, my code has to loop until it gets
> > enough bytes.
>
> Rather than changing what people expect with the current .read() method,
> why not offer a different method called .readexact(n), which will read
> exactly n bytes, performing buffering as necessary.  You can then
> optimize by using cStringIOs, lists of strings, resizable bytes, or
> whatever other method you want (but be careful never to .read(bignum)
> unless you change the underlying .read() implementation; right now it
> allocates a buffer of size bignum, which can cause huge amounts of
> malloc/realloc thrashing, and generally causes MemoryErrors).
>
> [snip]
>
>  - Josiah
>
>

