Basic Python Query

Ulrich Eckhardt ulrich.eckhardt at dominolaser.com
Thu Aug 22 07:54:14 EDT 2013


On 21.08.2013 20:58, Johannes Bauer wrote:
> On 21.08.2013 11:11, Ulrich Eckhardt wrote:
>
>> That said, there is never a need for deriving
>> from the Thread class, you can also use it to run a function without
>> that. That way is IMHO clearer because the threading.Thread instance is
>> not the thread, just like a File instance is not a file. Both just
>> represent handles for manipulating the actual thing.
>
> Huh? That I find most curious.
>
> I *always* derive from threading.Thread and really like the way that
> thread setup works (instantiate Thread handle, call start). Very
> intuitive, never had the problems with clarity that you mentioned. Could
> you elaborate on your suggestion? I don't seem to quite get it I'm afraid.

What is clear, convenient or not is largely a matter of taste. I'll try 
to explain my motivations though, maybe it helps...


Firstly, there is one observation: The Python object of type Thread is 
one thing, the actual thread is another thing. This is similar to the 
File instance and the actual file. The Python object represents the 
other thing (thread or file) but it "is not" this thing. It is rather a 
handle to the file or thread. This is different for e.g. a string, where 
the Python object is the string.

Because the actual thing and its handle are separate, there is also
some cardinality involved. For a single thread or file, there could be
multiple Python objects for handling it, or maybe even none. When the
Python object goes away, it doesn't necessarily affect the thread or
file it represents. This already casts doubt on the habit of deriving
from the Thread type, just like deriving from the File type is highly
unusual, as you are just deriving from a handle class.
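
A minimal sketch of that point, assuming a hypothetical worker()
function and an Event used only to observe the result. Our code keeps
no reference to the Thread object, yet the thread runs to completion
(the threading module still tracks it internally while it runs):

```python
import threading
import time

finished = threading.Event()

def worker():
    # Hypothetical workload: pause briefly, then signal completion.
    time.sleep(0.1)
    finished.set()

# Create and start the thread without storing the handle anywhere.
threading.Thread(target=worker).start()

# We have no handle to join(), so we wait on the event instead.
finished.wait(timeout=5)
print("worker finished:", finished.is_set())
```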


Secondly, a thread is even less a "thing" than a file but rather a 
process or an ongoing operation. As such, it runs code and uses data but 
it is neither code nor data. Also, it doesn't care which code or data it 
currently uses. Similarly, the code and data don't care which thread 
uses them (synchronization problems in multithreaded apps aside). You 
will find that most of the code called in a thread doesn't use the 
thread handle, which is another sign that it doesn't care. For that 
reason, it is unnecessary that "self" references a Thread object. This 
reduces coupling, as the same code could be called synchronously and 
asynchronously. The code shouldn't know or care from which thread it is 
called.
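
As an illustration, here is a small sketch in which the same plain
function is called once directly and once via a thread. The function
count_words() and the results list are made up for this example; the
function itself never touches a thread handle:

```python
import threading

def count_words(text, results):
    # Plain function: it neither knows nor cares which thread runs it.
    results.append(len(text.split()))

results = []

# Synchronous call from the main thread.
count_words("called directly from the main thread", results)

# The same function, called asynchronously from a second thread.
t = threading.Thread(target=count_words,
                     args=("called from a second thread", results))
t.start()
t.join()

print(results)
```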

In some cases, I even find it unnecessary to have a "self" at all: a
thread can just as well run a non-member function. Also, even if it
starts out in a member function, it doesn't have to stay in one. I find
that forcing an OOP approach on things is flawed (OOP is a tool and not
a goal) and prefer to make this a conscious decision, but that is a
different (although slightly related) issue.


Thirdly, when you derive a class from Thread, you are exposing this
base class's interface to the public, too, even if you don't intend to.
This has two unwanted effects: you expose all public functions of the
base class, and even if you mean "is a thread", it actually means "is a
handle to a thread", which is even less expressive. Curiously, you do
all that in order to override a single function that is only invoked
once. I prefer passing "instance.function" as the callable argument to
a plain Thread instance for running this, which keeps the two nicely
separate.
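
A rough sketch of passing a bound method to a plain Thread could look
like this. The Job class and its method names are hypothetical; the
point is only that Job holds a Thread privately and therefore exposes
none of Thread's public interface:

```python
import threading

class Job:
    """Hypothetical class that does its work in a background thread."""

    def __init__(self, items):
        self._items = list(items)
        self.processed = []
        # The bound method self._process is the thread's callable.
        self._thread = threading.Thread(target=self._process)

    def start(self):
        self._thread.start()

    def wait(self):
        self._thread.join()

    def _process(self):
        # Ordinary method; it never refers to the Thread handle.
        for item in self._items:
            self.processed.append(item.upper())

job = Job(["a", "b"])
job.start()
job.wait()
print(job.processed)

# Job "uses a thread" but is not one, so Thread's methods don't leak out.
print(isinstance(job, threading.Thread), hasattr(job, "is_alive"))
```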

For example, I have a TCP client class here that uses a service thread 
to handle the data transfer. The fact that there is a background thread 
should not be of concern to the user of my TCP client class. If I 
extended this to use two threads, it would even be impossible to derive 
from Thread for both of them.
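
To make the two-thread case concrete, here is a simplified sketch. It
is not the actual TCP client mentioned above: the Client class, its
methods and the "wire" queue that stands in for the network connection
are all assumptions made for illustration. The user only sees send()
and close(); both threads are private details:

```python
import queue
import threading

class Client:
    """Hypothetical stand-in for a TCP client; no real socket is used."""

    _STOP = object()  # marker telling both loops to shut down

    def __init__(self):
        self._outgoing = queue.Queue()
        self._wire = queue.Queue()  # stands in for the connection
        self.received = []
        # Two service threads; deriving from Thread could not model both.
        self._sender = threading.Thread(target=self._send_loop)
        self._receiver = threading.Thread(target=self._receive_loop)
        self._sender.start()
        self._receiver.start()

    def send(self, data):
        self._outgoing.put(data)

    def close(self):
        self._outgoing.put(self._STOP)
        self._sender.join()
        self._receiver.join()

    def _send_loop(self):
        while True:
            data = self._outgoing.get()
            self._wire.put(data)  # forward everything, marker included
            if data is self._STOP:
                break

    def _receive_loop(self):
        while True:
            data = self._wire.get()
            if data is self._STOP:
                break
            self.received.append(data)

client = Client()
client.send(b"hello")
client.send(b"world")
client.close()
print(client.received)
```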


In summary, I find that modelling something to "use a thread" is much 
clearer than modelling it as "is a thread".

Greetings from Hamburg!

Uli



