[Python-Dev] file objects guarantees

Guido van Rossum guido at python.org
Mon Apr 28 20:21:03 CEST 2014


File objects historically have a pretty vague spec. E.g. it's not defined
which methods are supported beyond read() and readline() (for readers) or
write() (for writers).
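
A minimal sketch of how small that de facto interface is, assuming only what
the pickle documentation requires: write() for dumping, and read() plus
readline() for loading. The ListWriter and BytesReader classes are
hypothetical and exist only for this illustration.

```python
import pickle


class ListWriter:
    """Hypothetical minimal writer: exposes write() and nothing else."""

    def __init__(self):
        self.chunks = []

    def write(self, data):
        # Always accept the whole buffer, so there are no short writes.
        self.chunks.append(bytes(data))


class BytesReader:
    """Hypothetical minimal reader: exposes only read() and readline()."""

    def __init__(self, data):
        self._data = data
        self._pos = 0

    def _take(self, end):
        chunk = self._data[self._pos:end]
        self._pos = end
        return chunk

    def read(self, n=-1):
        if n is None or n < 0:
            return self._take(len(self._data))
        return self._take(min(self._pos + n, len(self._data)))

    def readline(self):
        idx = self._data.find(b"\n", self._pos)
        return self._take(len(self._data) if idx < 0 else idx + 1)


writer = ListWriter()
pickle.dump({"answer": 42}, writer)

reader = BytesReader(b"".join(writer.chunks))
restored = pickle.load(reader)
```

Neither class inherits from anything in the io module; pickle only looks up
the methods it needs.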

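Short writes shouldn't be allowed (a regular file object's write() doesn't
even return a value for this reason). This means that a raw (unbuffered)
file object, whose write() may accept only part of the data, is not safe to
use where a plain "file object" is expected.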

Short reads *should* be expected, since behavior around these varies
widely. If you see code that doesn't expect them please file bugs.
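
Code that copes with short reads typically loops until it has the requested
number of bytes. The following is a minimal sketch, assuming a blocking
stream whose read() returns an empty result only at end of file. The
read_exactly helper and the TrickleReader class are hypothetical names used
only for this illustration.

```python
import io


def read_exactly(fileobj, n):
    """Read exactly n bytes from fileobj, looping over short reads."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = fileobj.read(remaining)
        if not chunk:
            # Assumes a blocking stream: an empty result means end of file.
            raise EOFError(f"expected {n} bytes, got {n - remaining}")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


class TrickleReader:
    """Hypothetical reader that returns at most max_chunk bytes per call."""

    def __init__(self, data, max_chunk=3):
        self._buf = io.BytesIO(data)
        self._max = max_chunk

    def read(self, n=-1):
        if n is None or n < 0:
            n = self._max
        return self._buf.read(min(n, self._max))


source = TrickleReader(b"hello world")
first = source.read(8)                                    # short read
whole = read_exactly(TrickleReader(b"hello world"), 8)    # loops until done
```

A single read(8) on the trickling reader yields only three bytes here, while
the loop keeps asking until all eight have arrived.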

The glossary item doesn't provide much guidance for would-be implementers
of compliant file objects, and the interface defined in the io module
is too large (e.g. nobody cares to implement readinto()).

I think we should clarify that raw (unbuffered) file objects are not safe
to pass to code that expects a "file object", because such code won't
handle short reads or writes. I don't care about preventing this explicitly
though -- when you see "file object" you should think "duck typing" and
program accordingly. (Reading and writing are already distinct interfaces;
ditto for text vs. bytes.)
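
One way to make a raw object safe for such callers is to wrap it in a
buffered object, which retries short writes internally. This is a minimal
sketch, assuming a blocking raw stream; ShortRawWriter is a hypothetical
class that accepts at most a few bytes per write() call.

```python
import io
import pickle


class ShortRawWriter(io.RawIOBase):
    """Hypothetical raw stream that accepts at most max_chunk bytes per call."""

    def __init__(self, max_chunk=4):
        super().__init__()
        self.received = bytearray()
        self._max = max_chunk

    def writable(self):
        return True

    def write(self, b):
        chunk = bytes(b[:self._max])
        self.received += chunk
        return len(chunk)  # may be less than len(b): a short write


payload = {"key": list(range(10))}
expected = pickle.dumps(payload)

# Passing the raw object directly: pickle ignores write()'s return value,
# so whatever the raw stream did not accept is silently lost.
unsafe = ShortRawWriter()
pickle.dump(payload, unsafe)

# Wrapping it: io.BufferedWriter keeps calling raw.write() until its
# buffer has been fully written out.
safe = ShortRawWriter()
buffered = io.BufferedWriter(safe)
pickle.dump(payload, buffered)
buffered.flush()
```

After this runs, the wrapped stream holds a complete pickle and the
unwrapped one holds a truncated one.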


On Mon, Apr 28, 2014 at 10:54 AM, Charles-François Natali <
cf.natali at gmail.com> wrote:

> Hi,
>
> What's meant exactly by a "file object"?
>
> Let me be more specific: for example, pickle.dump() accepts a "file
> object".
>
> Looking at the code, it doesn't check the return value of its write()
> method.
>
> So it assumes that write() should always write the whole data (not
> partial write).
>
> Same thing for read, it assumes there won't be short reads.
>
> A sample use case would be passing a socket.makefile() to pickle: it
> works, because makefile() returns a BufferedReader/Writer which takes
> care of short read/write.
>
> But the documentation just says "file object". And if you have a look
> at the file object definition in the glossary:
> https://docs.python.org/3.5/glossary.html#term-file-object
>
> """
> There are actually three categories of file objects: raw binary files,
> buffered binary files and text files. Their interfaces are defined in
> the io module. The canonical way to create a file object is by using
> the open() function.
> """
>
> So someone passing e.g. a raw binary file - which doesn't handle short
> reads/writes - would run into trouble.
>
> It's the same thing for e.g. GzipFile, and probably many others.
>
> Would it make sense to add a note somewhere?



-- 
--Guido van Rossum (python.org/~guido)