python implementation of a new integer encoding algorithm.

Laura Creighton lac at openend.se
Wed Feb 18 05:32:42 EST 2015


Hi Jan.

I'm an old fart.  In the late 1970s, when I started programming these
things, and memory was non-existent, we came up with all sorts of data
compression algorithms which were absolutely necessary to get any work
done whatsoever.  Should you ever need an assembler programmer for
quick and dirty hacks for the PDP-11 line (11/20 and 11/05 preferred
as it is harder) I am still the woman for the job.  Indeed, I spent
most of my 20s finding better and better ways to fit programs into
smaller and smaller memory footprints.

I perfectly understand the intellectual thrill of doing such things.
As puzzles go, it is about as cool a one as exists, and it's all for
things that matter -- for real.

However, in the matter of financial compensation and world recognition,
you have just laid a very large goose-egg.   The VAX-11/780 was introduced
on October 25, 1977, according to Wikipedia.  But in my world, it was 1982
before I got to see the first one.  And it was ungodly more expensive than
a PDP-11, but the writing was on the wall.  The thing could page, and
so all the techniques we learned for making our code concise -- let alone
the dirty tricks I specialised in -- were no longer relevant.

From the mid 1980s onward I have been telling people 'your code is
ugly, please tighten it up by refactoring it <here> and <here>' and
when I am their instructor they grumble and do it, and otherwise they
flip me the bird.  In their eyes, it doesn't matter how the code
_looks_ as long as it does the job.  And I deeply sympathise.  But
what I am going for is not a 'death - by looking unfashionable' but
rather a demand that good code be clear to understand.  Because what
I have learned is what Brian Kernighan expressed a long time ago:

	Debugging is twice as hard as writing the code in the
	first place. Therefore, if you write the code as cleverly
	as possible, you are, by definition, not smart enough to debug it.

Your proposed encoding scheme (if it does as you say, I have not
analysed it) scores very, very high in the _cleverness_ department.
Enough that a lot of people, who aren't as clever as you are, have
no hope in hell of ever being clever enough to debug something that
uses it.  Therefore, you will never see widespread adoption of your
scheme, no matter how brilliantly it does as you say, because we all
need things that are easier to debug more than we need better compression.

So now you are sad.  I was sad, too, but the sooner I learned this the
sooner I could stop wasting my time creating algorithms that provided
cool functionality that people hated for the same reasons I found them
cool.

You need to find a different sort of algorithm that people like to use
if you want to get widespread success in the world of widely used
algorithms.  If you have found a way to improve on Lempel-Ziv, then this
will count.

But it may be that your next step is 'how to encode things that
are not phonetic language'.  Go look -- for the next few months --
at how MIDI stores sounds.  You will find plenty of places for
improvement, but the idea is not to improve the standard but to learn
it well enough that you can see things in the non-alphabetic world.
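
As one concrete place to start: Standard MIDI Files store delta-times
as variable-length quantities, which is itself an integer encoding.
Each byte carries seven bits of the value, most significant group
first, and the high bit is set on every byte except the last.  A
minimal sketch in Python follows; the function names vlq_encode and
vlq_decode are made up for illustration and come from no library.

```python
def vlq_encode(n):
    """Encode a non-negative integer as a MIDI-style variable-length quantity."""
    if n < 0:
        raise ValueError("VLQ encodes non-negative integers only")
    out = [n & 0x7F]                   # final byte: high bit clear
    n >>= 7
    while n:
        out.append((n & 0x7F) | 0x80)  # earlier bytes: high bit set
        n >>= 7
    return bytes(reversed(out))        # most significant group first


def vlq_decode(data):
    """Decode one VLQ from the start of data; return (value, bytes_consumed)."""
    value = 0
    for i, b in enumerate(data):
        value = (value << 7) | (b & 0x7F)
        if not b & 0x80:               # high bit clear marks the last byte
            return value, i + 1
    raise ValueError("truncated VLQ")


if __name__ == "__main__":
    for n in (0, 127, 128, 0x3FFF, 0x0FFFFFFF):
        encoded = vlq_encode(n)
        print(n, encoded.hex(), vlq_decode(encoded))
```

Small values cost one byte and larger ones grow a byte at a time, and
the whole scheme is simple enough to debug by reading a hex dump.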

So then, now what?

If you are still fired up with the desire to compress things, then
there is a huge, _very well paying_ market I want to introduce you
to.  And this is _tech support for porn sites_.  Porn sites make a
ton of money, indeed the numbers are scary.  And here the idea of
'I saved 2% of time/bandwidth/disk space' really matters.  You
can really save money for them, and it really matters to them.
Since I have never found sex 'dirty' and indeed consider it one
of the great joys in life, I have never found anything wrong with
working for porn sites.

And, hell, out of male porn Jimmy Wales made Wikipedia.  Haven't
we all benefited?

But, right now, you are, alas for you, about 45 years too late for the
ideas you are spouting.  I had similar ones about 30 years too late
and, well, they only worked for me for about 3-5 years.  Sucks to be
you, friend -- you needed to be your grandfather, I fear.


Laura Creighton


