Generating generations of files

Chris Angelico rosuav at gmail.com
Mon Apr 29 20:10:38 EDT 2019


On Tue, Apr 30, 2019 at 9:54 AM DL Neil <PythonList at danceswithmice.info> wrote:
>
> On 30/04/19 8:04 AM, Chris Angelico wrote:
> > On Tue, Apr 30, 2019 at 6:00 AM DL Neil <PythonList at danceswithmice.info> wrote:
> >>
> >> Are you aware of a library/utility which will generate and maintain the
> >> file names of multiple generations of a file?
> >>
> >
> > Commit it to a git repository. All the generations have the same name,
> > but you can compare them, explore past versions, etc, etc, etc, with a
> > rich set of tools. And it's easy to invoke git from a program (if you
> > need information from stdout, several git commands, such as "git
> > status", accept a "--porcelain" parameter), so you can fully automate.
>
>
> Great idea!
> (and all 'the work' is already done)
>
> I've seen various Py + GIT entries in the library, so can I assume the
> ability to output a file directly into a GIT repo/branch, instead of the
> file-system?
>
> Is this a common use-case?

Common? Not sure. Possible? Absolutely! This is a script that creates
a commit on a branch other than the currently checked-out one:

https://github.com/Rosuav/shed/blob/master/git-deploy#L48

Most of the time, you'll just edit the file and then commit, since
you'll generally want to have the most current version visible in the
file system. But you can create the new version first, and then
separately "deploy" that version by moving the branch pointer.
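
For illustration, here is a minimal sketch of writing a file straight
into a branch without touching the working tree, using git's low-level
plumbing commands from Python. It is not taken from the linked script;
the names git() and commit_to_branch() are made up for this example,
and it assumes git is on PATH with a user name and email configured.

```python
import os
import subprocess
import tempfile


def git(repo, *args, stdin=None, env=None, check=True):
    """Run a git command inside `repo` and return its stripped stdout."""
    result = subprocess.run(
        ["git", "-C", repo, *args],
        input=stdin, env=env, check=check, stdout=subprocess.PIPE,
    )
    return result.stdout.decode().strip()


def commit_to_branch(repo, branch, path, data, message):
    """Record `data` (bytes) as `path` in a new commit on `branch`.

    Hypothetical helper: neither the working tree nor the normal index
    is modified.
    """
    ref = "refs/heads/" + branch
    # Store the content as a blob object in the repository.
    blob = git(repo, "hash-object", "-w", "--stdin", stdin=data)
    # Empty string if the branch does not exist yet.
    parent = git(repo, "rev-parse", "--verify", "--quiet", ref, check=False)
    # Build the new tree in a throwaway index file.
    with tempfile.TemporaryDirectory() as tmp:
        env = dict(os.environ, GIT_INDEX_FILE=os.path.join(tmp, "index"))
        if parent:
            git(repo, "read-tree", parent, env=env)
        git(repo, "update-index", "--add", "--cacheinfo",
            "100644,{},{}".format(blob, path), env=env)
        tree = git(repo, "write-tree", env=env)
    parent_args = ["-p", parent] if parent else []
    commit = git(repo, "commit-tree", tree, *parent_args, "-m", message)
    # Move the branch pointer to the new commit.
    git(repo, "update-ref", ref, commit)
    return commit
```

Something like commit_to_branch(".", "results", "out.csv", data, "run
42") would then add one generation to a "results" branch. Pointing it
at the branch that is currently checked out would leave the working
tree looking out of date, so this sketch is only meant for other
branches.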

> Three immediate thoughts:
>
> 1 I'd have to put GIT onto one of the users' storage servers (see
> comment elsewhere about re-thinking location of file-storage). OK, so
> I'm lazy - but I work hard at it!

Fair point. Installing git on a server isn't too hard, though.

> 2 We'd need a user-friendly GIT-client for the users. The words "Linus"
> and "user friendly" don't often sit together happily in the same
> sentence (and if he ever reads this...)

There are plenty of user-friendly git clients that do "normal"
operations (for various definitions of "normal", but generally that
includes committing changes on the file system to the current branch).
If your use-case differs, you can easily create your own UI that calls
on lower-level git tools (such as the script linked above - you can
run that as if "git deploy" were a core git subcommand).
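
The subcommand trick works for any executable named git-<something>
on your PATH. As a rough sketch (the name "git-generations" and the
function list_generations() are invented for this example), a tiny
Python script could list every committed generation of one file:

```python
#!/usr/bin/env python3
"""Hypothetical "git-generations" script: list past versions of a file."""
import subprocess
import sys


def list_generations(path, repo="."):
    """Return one "<short hash> <subject>" line per commit touching `path`."""
    result = subprocess.run(
        ["git", "-C", repo, "log", "--format=%h %s", "--", path],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    if len(sys.argv) == 2:
        # Newest generation first, as "git log" orders them.
        print("\n".join(list_generations(sys.argv[1])))
    else:
        print("usage: git generations <path>", file=sys.stderr)
```

Saved as an executable file called git-generations somewhere on PATH,
it could be run as "git generations out.csv".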

> 3 The users' case for retaining older-versions is so that they can
> compare between runs. (I blame myself, showing them filecmp, noting that
> they could quickly employ Calc/Excel...)
> Easiest explained example = some slight change in a variable results in
> a 'surprise' result. (which scenario is almost identical to the
> 'sensation' when a user throws some data at *our code* and finds some
> 'error' - schadenfreude means there's always a silver lining!)

Depending on exactly how this works, it might be easiest to commit the
"current state" (for some definition of "current"), and then edit the
file directly - "git diff" will tell you what's changed since that
commit. That's how I track changes to other people's files; for
instance, whenever Counter-Strike: Global Offensive has an update, I
check its changes and then create a commit to accept those changes.
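
If you wanted to drive that check-then-accept cycle from Python, a
minimal sketch could look like this. The function names are invented
for the example, and it assumes the repository already has at least
one commit and that git is on PATH.

```python
import subprocess


def changed_files(repo):
    """Return tracked files whose content differs from the last commit."""
    result = subprocess.run(
        # -z separates names with NUL bytes, so odd file names survive.
        ["git", "-C", repo, "diff", "--name-only", "-z", "HEAD"],
        check=True, capture_output=True, text=True,
    )
    return [name for name in result.stdout.split("\0") if name]


def accept_changes(repo, message):
    """Commit the current state of all tracked files as the new baseline."""
    subprocess.run(
        ["git", "-C", repo, "commit", "-q", "-a", "-m", message],
        check=True,
    )
```

After a run, changed_files(".") says which outputs moved, and
accept_changes(".", "accept run 42") makes them the new reference
point once you are happy with them.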

> Earlier, I had suggested dropping the o/p of each 'run' into its own
> sub-dir of the client-/experiment-directory (perhaps using date-time)
> and thereby completely avoiding any fileNM 'collisions'. Lead balloon!
> Sigh...

Okay, so this needs to be as smooth and invisible as possible. If you
don't need any concurrency, the easiest way might be for each run to
automatically do a "git commit -a" at the end (which picks up changes
to files git already tracks; brand-new files need a "git add" first),
and then you can compare the most recent run to the run prior to that
with "git show".
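
As a sketch of what that end-of-run hook might look like in Python
(record_run() and last_run_diff() are made-up names, and I've swapped
"git commit -a" for "git add -A" plus "git commit" so that newly
created output files get recorded too):

```python
import subprocess


def git(repo, *args):
    """Run a git command inside `repo` and return its stdout as text."""
    result = subprocess.run(
        ["git", "-C", repo, *args],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


def record_run(repo, message):
    """Commit everything in `repo`; return the new commit hash, or None
    if nothing changed since the previous run."""
    # "git status --porcelain" prints nothing when the tree is clean.
    if not git(repo, "status", "--porcelain").strip():
        return None
    git(repo, "add", "-A")
    git(repo, "commit", "-q", "-m", message)
    return git(repo, "rev-parse", "HEAD").strip()


def last_run_diff(repo):
    """Return what "git show" reports for the most recent run."""
    return git(repo, "show", "HEAD")
```

Calling record_run() as the last step of each run keeps the history
invisible to the users, and last_run_diff() gives them the comparison
against the run before.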

I would recommend planning out a workflow on the assumption that git
is magically able to do whatever you want it to, and then trying to
figure out how to make git do it. Chances are you can do most of it
with simple high-level commands (perhaps automatically executed), and
then if there's something you can't do that way, it might be time to
delve into the lower-level tools. It's almost certainly possible.

ChrisA


