Generating generations of files

Avi Gross avigross at verizon.net
Tue Apr 30 23:42:06 EDT 2019


TOPIC: how to save, retain and retrieve files as they change.

There seem to be several approaches and one proposal here involved using
altered file names with a numeric suffix.

Can we compare two different concepts? A typographical approach with little
or no built-in support like searching for a number at the end of a file name
and incrementing it, is very different than older approaches. VMS had file
names with three fixed-length parts in some kind of struct as stored in the
file system. The main file name had a maximum length. The file extension,
such as EXE, was limited to three characters and the period between them was
probably not stored but artificially added so that "filename" and "exe"
became filename.exe. The third component was a tad unusual as it consisted
of a sequence number and may even have been stored in a few bits as a small
integer, perhaps a byte and would be optionally shown with a semi-colon as a
separator:

filename.file-type;version

DOS had an 8by3 version with no support for keeping track of sequence
numbers. UNIX flowed the fields together with a limit of 14 that included an
actual optional period as a counted character. Since then, many systems have
allowed huge file names including with multiple periods as in
open_me.tar.gz.

So looking at the VMS model may not be as helpful for trying to come up with
a way for a Python program to maintain a sequence of file names. VMS support
for version numbers was coded widely within the OS and applications. I
regularly had to purge and keep only recent versions because I kept running
out of disk space. Many current OS do not have any such support and saving a
file simply overwrites the earlier version by default.

Some questions:  What would a proposed scheme do with a filename that
already ended legitimately with a number such as the address "broad23"?
Would it save another version as broad24 or as broad231 or append more
characters as in broad23-1? What do you do with file.txt as "file.txt1"
would not work well or be recognized as a registered type?

There are many more modern ways to manage versions of files and many of them
involve placing full copies of the file somewhere else where they can be
retrieved or making some kind of diff on the previous version that can allow
reconstitution with some calculation using an original file and doing what
modifications (additions, replacements, deletions) it takes to transition an
original file to version1, then version 2 and so on. Other methods may
involve squirrelling away complete copies along with metadata that lets them
be searched and retrieved. Using just a filename change is probably easier
for some things like a circular group of log files, though.

FWIW, I see lots of computer languages (including Python) implementing what
I consider typographical techniques where you implement a feature by taking
a variable name and adding changes to get a result, such as is done with the
name mangling feature where __NAME (double underscore) actually gets mapped
to _someclass__NAME in the actual code. I see similar gimmicks in many
languages like SCALA and RUST. This is sometimes a reasonable way to take a
single string and add functionality with some manipulation -- but with some
care as a user choosing odd names may cause unusual problems in some cases.







More information about the Python-list mailing list