[Python-ideas] `to_file()` method for strings

Wed Mar 23 01:43:09 EDT 2016

On 23 March 2016 at 13:33, Andrew Barnert via Python-ideas
<python-ideas at python.org> wrote:
> On Mar 22, 2016, at 20:06, Nick Eubank <nickeubank at gmail.com> wrote:
>>
>> As a social scientist trying to help other social scientists move from languages like R, Stata, and Matlab into Python, one of the behaviors I've found unnecessarily difficult to explain is the "file.open()/file.close()" idiom (or, alternatively, context managers). In normal operating systems, and many high-level languages, saving is a one-step operation.
>>
>> I understand there are situations where an open file handle is useful, but it seems a simple `to_file` method on strings (essentially wrapping a context-manager) would be really nice, as it would save users from learning this idiom.
>
> Funny, I've seen people looking for a one-step file-load more often than file-save. But really, the two go together; any environment that provided one but not the other would seem strange to me.
>
> The question is, if you stick the save as a string method s.to_file(path), where do you stick the load? Is it a string class method str.from_file(path)? That means you have to teach novices about class methods very early on... Alternatively, they could both be builtins, but adding two new builtins is a pretty heavy-duty change. Anything else seems like it would be too non-parallel to be discoverable by novices or remembered by occasional Python users.
>
> I'd assume you'd also want this on bytes (and probably bytearray) for dealing with binary files.

A key part of the problem here is that from a structured design
perspective, in-memory representation, serialisation and persistence
are all separate concerns. If you're creating a domain specific
language (and yes, I count R, Stata and MATLAB as domain specific),
then you can make some reasonable assumptions for all of those, and
hide more of the complexities from your users.
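
As a rough illustration of that separation, here is a sketch using only
the standard library. The sample text, the file name and the choice of
UTF-8 are assumptions made up for the example:

```python
import os
import tempfile

# In-memory representation: a Python str (a sequence of Unicode code points).
text = "caf\u00e9 au lait\n"

# Serialisation: pick an encoding to turn the text into bytes.
# UTF-8 is an assumption here; nothing about the str itself dictates it.
data = text.encode("utf-8")

# Persistence: pick where the bytes live and how they get written.
# A throwaway temporary directory stands in for a real destination.
path = os.path.join(tempfile.mkdtemp(), "note.txt")
with open(path, "wb") as f:
    f.write(data)

# Loading reverses each step, and needs the same choices made again.
with open(path, "rb") as f:
    loaded = f.read().decode("utf-8")
```

A one-step "save" has to make the second and third decisions on the
user's behalf.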

In Python, for example, the Pandas IO methods are closer to what other
data analysis and modelling environments are able to offer:
http://pandas.pydata.org/pandas-docs/stable/io.html
NumPy similarly has some dedicated IO routines:
http://docs.scipy.org/doc/numpy/reference/routines.io.html

However, for a general purpose language, providing convenient "I don't
care about the details, just do something sensible" defaults gets
trickier, as not only are suitable defaults often domain specific, you
have a greater responsibility to help folks figure out when they have
ventured into territory where those defaults are no longer
appropriate. If you can't figure out a reasonable set of default
behaviours, or can't figure out how to nudge people towards
alternatives when they start hitting the limits of the default
behaviour, then you're often better off ducking the question entirely
and getting people to figure it out for themselves.
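
To make the "defaults" question a little more concrete, here is a
minimal sketch of what one-step helpers might look like. The names
`to_file` and `from_file` are taken from the proposal above and are
hypothetical, not existing APIs. The UTF-8 default and the silent
overwrite are assumptions chosen for the example, and they are exactly
the kind of decision a general purpose default would have to get right.
The sketch builds on `pathlib.Path.write_text()` and `read_text()`,
which were added in Python 3.5:

```python
import os
import tempfile
from pathlib import Path


def to_file(text, path, encoding="utf-8"):
    """Hypothetical one-step save: write text to path.

    Hidden decisions (all assumptions of this sketch):
    - the encoding defaults to UTF-8
    - an existing file at path is silently overwritten
    - newlines are translated as in normal text mode
    """
    Path(path).write_text(text, encoding=encoding)


def from_file(path, encoding="utf-8"):
    """Hypothetical one-step load: read the whole file into a str."""
    return Path(path).read_text(encoding=encoding)


# Usage: a round trip through a temporary file.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.txt")
to_file("first line\nsecond line\n", demo_path)
round_tripped = from_file(demo_path)
```

Whether those defaults are appropriate depends on the domain: a file
written by another tool may use a different encoding, and silently
replacing an existing file is not always what the user wants.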

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia