Basic Python Query

Steven D'Aprano steve+comp.lang.python at pearwood.info
Thu Aug 22 23:28:12 EDT 2013


On Thu, 22 Aug 2013 13:54:14 +0200, Ulrich Eckhardt wrote:

> Firstly, there is one observation: The Python object of type Thread is
> one thing, the actual thread is another thing. This is similar to the
> File instance and the actual file. The Python object represents the
> other thing (thread or file) but it "is not" this thing. It is rather a
> handle to the file or thread. This is different for e.g. a string, where
> the Python object is the string.

Well, not quite. To users coming from other languages, "string" has a 
clear and common meaning: it's an array of characters, possibly fixed-
width in older languages, but these days usually variable-width, either 
prefixed with the length (as in Pascal) or suffixed with a delimiter 
(usually \0, as in C). Or occasionally both.

So as far as those people are concerned, Python strings aren't just a 
string. They are rich objects, with an object header. For example, we can 
see that there is a whole bunch of extra "stuff" required of a Python 
string before you even get to the array-of-characters:

py> import sys
py> sys.getsizeof('')
25

25 bytes to store an empty string. Even if it had a four-byte length, and 
a four-byte NUL character at the end, that still leaves 17 bytes 
unaccounted for. So obviously Python strings contain a whole lot more 
than just low-level Pascal/C strings.

So while I agree that it is sometimes useful to distinguish between a 
Python Thread object and the underlying low-level thread data structure 
it wraps, we can do the same with strings (and floats, and lists, and 
everything really). In any case, it's rare to need to do so.
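
If you want to see that overhead for yourself, here is a rough sketch. 
It assumes CPython, and the numbers depend on the Python version, build 
and platform, so I give no expected output. The names empty_size and 
sizes are just for illustration.

```python
import sys

# Size in bytes of the empty string object: all of this is overhead,
# since there is no character data to store.
empty_size = sys.getsizeof('')

# Sizes of a few ASCII strings of increasing length.
sizes = {n: sys.getsizeof('a' * n) for n in (0, 1, 10, 100)}

for n, size in sorted(sizes.items()):
    # Growth relative to the empty string is (roughly) the character data.
    print(n, size, size - empty_size)
```

The fixed part, which stays the same as the string grows, is the object 
header and bookkeeping that a bare Pascal or C string does not have.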

 
> Due to this pairing between the actual thing and the handle, there is
> also some arity involved. For a single thread or file, there could be
> multiple Python objects for handling it, or maybe even none.

I don't think this is correct for threads. I don't believe there is any 
way to handle a low-level thread in Python except via an object of some 
sort. (With files, you can use the os module to work with low-level OS 
file descriptors, which are just integers.)
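
A minimal sketch of the file descriptor approach, with no file object 
involved. The scratch path is hypothetical and made up for the example:

```python
import os
import tempfile

# Hypothetical scratch file in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'junk')

# os.open returns a plain integer, not a file object.
fd = os.open(path, os.O_WRONLY | os.O_CREAT)
os.write(fd, b'hello')
os.close(fd)

# Read the data back, again using nothing but an integer descriptor.
fd = os.open(path, os.O_RDONLY)
data = os.read(fd, 100)
os.close(fd)

print(type(fd), data)
```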


> When the
> Python object goes away, it doesn't necessarily affect the thread or
> file it represents. 

That's certainly not true with file objects. When the file object is 
garbage collected (in CPython, normally as soon as the last reference to 
it goes away), the underlying low-level file is closed.
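
A small sketch of this, assuming CPython, where a file object is 
finalized as soon as the last reference to it is dropped (other 
implementations may delay that). The scratch path and the name closed 
are made up for the example:

```python
import gc
import os
import tempfile
import warnings

# Hypothetical scratch file in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'junk')

f = open(path, 'w')
fd = f.fileno()      # the low-level descriptor behind the file object
os.fstat(fd)         # succeeds: the descriptor is currently open

with warnings.catch_warnings():
    # Python may warn about the unclosed file; silence that here.
    warnings.simplefilter('ignore', ResourceWarning)
    del f            # drop the only reference to the file object
    gc.collect()     # not needed on CPython, but harmless

try:
    os.fstat(fd)
    closed = False
except OSError:
    # The descriptor is no longer valid: the low-level file was closed.
    closed = True

print(closed)
```

In real code you would use a with statement or call close() explicitly 
rather than rely on this.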


> This already casts a doubt on the habit of deriving
> from the Thread type, just like deriving from the File type is highly
> unusual, as you are just deriving from a handle class.

In Python 3, there is no "File" type. There are *multiple* file types, 
depending on whether you open a file for reading or writing in binary or 
text mode:

py> open('/tmp/junk', 'wb')
<_io.BufferedWriter name='/tmp/junk'>
py> open('/tmp/junk', 'rb')
<_io.BufferedReader name='/tmp/junk'>
py> open('/tmp/junk', 'w')
<_io.TextIOWrapper name='/tmp/junk' mode='w' encoding='UTF-8'>


But even if we limit the discussion to Python 2, it is unusual to inherit 
from the file type because it already does everything we normally want 
from a file. There's no need to override methods, so why make your own 
subclass? 
On the other hand, threads by their very nature have to be customized. 
The documentation is clear that there are two acceptable ways to do this:

    This class represents an activity that is run in a separate 
    thread of control. There are two ways to specify the activity: 
    by passing a callable object to the constructor, or by 
    overriding the run() method in a subclass.

http://docs.python.org/2/library/threading.html#thread-objects


So to some degree, it is just a matter of taste which you use.
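
For comparison, here are the two styles side by side. This is only a 
sketch: work(), Worker and results are hypothetical names invented for 
the example, not anything from the threading module.

```python
import threading

results = []

def work(label):
    # Hypothetical activity; any callable will do.
    results.append(label)

# 1. "Uses a thread": pass a callable to the Thread constructor.
t1 = threading.Thread(target=work, args=('callable',))

# 2. "Is a thread": subclass Thread and override run().
class Worker(threading.Thread):
    def run(self):
        work('subclass')

t2 = Worker()

for t in (t1, t2):
    t.start()
for t in (t1, t2):
    t.join()

print(sorted(results))
```

Either way the activity ends up running in its own thread; the 
difference is only in where the code for it lives.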

 
[...]
> In summary, I find that modelling something to "use a thread" is much
> clearer than modelling it as "is a thread".

The rest of your arguments seem good to me, but not compelling. I think 
they effectively boil down to personal taste. I write lots of non-OOP 
code, but when it comes to threads, I prefer to subclass Thread.


-- 
Steven


