Dataclasses, immutability(?), and ChatGPT

dn PythonList at DancesWithMice.info
Wed Apr 12 00:58:56 EDT 2023


Are dataclasses (or instances thereof) mutable or immutable?
- and in what sense?


Have been experimenting with ChatGPT. In particular: its possibilities 
as a trainer, good ideas for methods of introducing new topics, its 
capability for drawing-up demonstrations or examples, its interpretation 
of assessment questions (ie evaluating my own work, eg do they test 
desired specifics, is it possible/likely to misinterpret the question, ...)


Conversed with ChatGPT about the differences between (standard) 
custom-classes and dataclasses, and when one might prefer one over the 
other.

We(?) started discussing how dataclasses might be introduced to trainees 
currently only familiar with Python's built-ins: "...how they are 
superior to using built-in data-constructs..."

Amongst the four benefits ChatGPT listed was:
«
Immutable instances: By default, instances of dataclasses are immutable, 
which means that once created, their attributes cannot be modified. This 
can help to prevent unintended modifications to the data.
»

Huh? If we'd been discussing namedtuples over (say) dictionaries, I'd 
perhaps have accepted the reply.
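
To make that comparison concrete, here is a minimal sketch (the names Point, as_tuple and as_dict are invented for illustration and are not part of the ChatGPT conversation). It shows a namedtuple refusing an attribute assignment that a dictionary accepts:

```python
from collections import namedtuple

# Hypothetical names, used only for this illustration.
Point = namedtuple("Point", ["x", "y"])

as_tuple = Point(x=1, y=2)
as_dict = {"x": 1, "y": 2}

as_dict["x"] = 3            # dictionaries accept in-place changes

try:
    as_tuple.x = 3          # namedtuple fields are read-only
    tuple_changed = True
except AttributeError:
    tuple_changed = False

print(as_dict, as_tuple, tuple_changed)
```

In that setting, "immutable instances" would have been a fair description of the namedtuple.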


The next step was to write some code to double-check my understanding 
(reproduced below). No change there.
(given that I'm perfect...cough, cough, splutter)


Thus, back to ChatGPT to seek clarification:
«
You said "Immutable instances: By default, instances of dataclasses are 
immutable, which means that once created, their attributes cannot be 
modified. This can help to prevent unintended modifications to the 
data." The attribute-values within a data-class can be changed. In what 
way are Python dataclasses immutable?
»

ChatGPT did indeed clarify that:
«
You're correct that the attribute values within a dataclass can be 
changed, even though the instance itself is considered immutable.
»

That's better. The machine is as smart as I am - or I, as smart as it - 
or something(?)


However, the supporting-explanation then appears to argue against its 
own clarification (and the assertion in the question):
«
So to clarify, when we say that instances of dataclasses are immutable, 
we mean that the attributes of an instance cannot be changed after it is 
created.

In other words, if you create an instance of a dataclass, its attributes 
cannot be modified by any external code. If you want to change the value 
of an attribute, you need to create a new instance of the dataclass with 
the updated attribute value.

... [code excised - see below]

This is because the instance is considered immutable, and its attributes 
cannot be changed directly. Instead, we need to create a new instance of 
the dataclass with the updated attribute value, like this...

Now, new_person is a new instance of the Person dataclass with the 
updated attribute values, while the original person instance remains 
unchanged.

So to summarize, while the attribute values within a dataclass can be 
changed, instances of dataclasses are considered immutable because their 
attributes cannot be modified directly once they are created. Any 
updates to the attributes require the creation of a new instance of the 
dataclass.
»

[Note]
In and amongst the above explanatory-paragraphs it gives example 
code-snippets. Those are condensed into:

>>> from dataclasses import dataclass
>>>
>>> @dataclass
... class Person:
...     name: str
...     age: int
...
>>> person = Person(name='Alice', age=20)
>>>
>>> person.name = 'Bob'  # raises AttributeError: can't set attribute
>>> person.age = 21      # raises AttributeError: can't set attribute

- note the erroneous comments about exceptions being raised
(they weren't/aren't)!
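
The excised "create a new instance" code is not reproduced here, so the following is only a guess at what that route looks like in practice, using dataclasses.replace() from the standard library. It reuses the Person class from the snippet above, and the final assignment shows that copying is a choice, not something a default dataclass enforces:

```python
from dataclasses import dataclass, replace


@dataclass
class Person:
    name: str
    age: int


person = Person(name="Alice", age=20)

# replace() builds a new instance, copying every field except those overridden.
new_person = replace(person, age=21)

print(person)                   # the original is untouched by replace()
print(new_person)
print(new_person is person)     # two distinct objects

# Nothing stops a direct assignment on a default dataclass.
person.name = "Bob"
print(person)
```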


Here's a home-grown version illustrating all the points made during the 
investigation, by way of proof/disproof:

""" PythonTraining:dataclass.py
     Prove/disprove claim made by Chat-GPT.
"""

__author__ = "dn, IT&T Consultant"
__python__ = "3.11"
__created__ = "PyCharm, 12 Apr 2023"
__copyright__ = "Copyright © 2023~"
__license__ = "MIT"

# PSL
from dataclasses import dataclass


@dataclass
class Coordinates():
    """Sample dataclass. """
    x:int
    y:int


if __name__ == "__main__":
    print( "\nCommencing execution\n" )

    coordinates = Coordinates( 1, 2, )
    print( coordinates, id( coordinates ), )
    coordinates.x = 3    # change an existing field: no exception raised
    print( coordinates, id( coordinates ), )
    coordinates.z = 4    # add an attribute: not a declared field, so absent from the repr
    print( coordinates, id( coordinates ), )
    print( coordinates.x, coordinates.y, coordinates.z, )

    print( "\nTerminating" )


### output:
Commencing execution

Coordinates(x=1, y=2) 140436963150928
Coordinates(x=3, y=2) 140436963150928
Coordinates(x=3, y=2) 140436963150928
3 2 4

Terminating
###

Not only are a dataclass instance's attribute-values mutable, but 
further attributes can be dynamically-added to the object-instance!

Yes, if the code included:

coordinates = Coordinates( 5, 6, )

the "coordinates" identifier would be re-bound to a different id() 
'address', ie a fresh instance rather than a modified one.


The 'book of words' (https://docs.python.org/3/library/dataclasses.html) 
does mention immutability (wrt dataclasses) in that it is possible to 
add a __hash__() method (any object defined with one is (technically) 
immutable). However, apart from the default_factory argument, there 
doesn't appear to be other discussion of [im]mutability.
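
For comparison, here is a minimal sketch of the opt-in behaviour. The dataclass() decorator accepts frozen=True, and assigning to a field of such an instance raises dataclasses.FrozenInstanceError. The class names FrozenCoordinates and PlainCoordinates are invented for this illustration:

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class FrozenCoordinates:        # hypothetical name
    x: int
    y: int


@dataclass
class PlainCoordinates:         # hypothetical name; default settings
    x: int
    y: int


frozen = FrozenCoordinates(1, 2)

try:
    frozen.x = 3                # rejected: the instance is frozen
    frozen_changed = True
except FrozenInstanceError:
    frozen_changed = False

# With eq=True (the default) and frozen=True a __hash__() is generated,
# so equal frozen instances hash alike.
same_hash = hash(frozen) == hash(FrozenCoordinates(1, 2))

# A default dataclass (eq=True, frozen=False) has __hash__ set to None.
try:
    hash(PlainCoordinates(1, 2))
    plain_hashable = True
except TypeError:
    plain_hashable = False

print(frozen, frozen_changed, same_hash, plain_hashable)
```

So immutability is available, but only on request. It is not the default that the ChatGPT reply claimed.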


Anything I've 'missed'?
- or a salutary tale of not depending upon ChatGPT etc?

-- 
Regards,
=dn

