[Tutor] Re:Base 207 compression algorithm
Blake Winton
bwinton@latte.ca
Thu Jun 26 23:10:04 2003
* Jeff Shannon <jeff@ccvcorp.com> [030626 20:39]:
> >>However, you're not going to have any luck in actually doing any math
> >>with either of these strings.
> >Sure you can. You convert back to decimal.
> No, because your computer can't do math on a string of decimal digits.
> It needs to convert that into a binary number somehow before it can do
> math. And it can *store* it as a binary number a lot more efficiently
> than it can store it as a string of digit characters, no matter what
> encoding scheme you use for those characters.
Well, that's not entirely true...
_I_ know how to do arithmatic in decimal. It's kind of painful,
but it's possible.
i.e.:
def stringPlus( x, y ):
if len(x) < len(y):
x = ("0" * (len(y)-len(x))) + x
elif len(y) < len(x):
y = ("0" * (len(x)-len(y))) + y
carry = "0"
answer = ""
digitCount = len(x)
for i in range(digitCount):
xDigit = x[len(x)-1-i]
yDigit = y[len(x)-1-i]
(newDigit, carry) = add( xDigit, yDigit, carry )
answer = newDigit + answer
if carry != "0":
answer = carry + answer
return answer
def add( x, y, carry ):
(newDigit,carry) = addTwo( x, y )
(newDigit,carry) = addTwo( newDigit, carry )
return (newDigit,carry)
def addTwo( x, y ):
if x == "0":
return y,"0"
elif y == "0":
return x,"0"
elif x == "1":
if y == "1":
return "2","0"
if y == "2":
return "3","0"
# The rest of this elif left as an exercise for the reader.
elif x == "2":
if y == "1":
return "3","0"
if y == "2":
return "4","0"
elif x == "3":
if y == "1":
return "4","0"
if y == "2":
return "5","0"
# Aw, heck, you can fill in the rest of this method too.
if __name__ == "__main__":
print stringPlus( "123", "11" )
Notice that I didn't even cheat and use "ord" and "chr". :)
> Like I said, learn how your computer works at the level of registers,
> and how the floating-point unit operates, and you'll understand this a
> bit better.
This, of course, is always good advice.
> >So you admit I have reduced the file size of 5000 digits of pi to 2100
Let me just theorize for a second here on how I might compress
5000 digits of pi. First, I'm going to assume that those 5000
digits are in base 10. So, I can get an integer by multiplying
that number by 10^5000. Follow me?
Here's some code:
>>> # Okay, so I'll approximate pi, so sue me.
>>> x = "3141592654" + "0"*4990
>>> len(x)
5000
>>> y = long(x)
>>> # So now we have pi as a number. Let's shift it right by a bit.
>>> y >> 16607
1L
>>> # Neat, so we now know that y takes 16607 bits to store.
>>> # Each byte is 8 bits, so we can store y as 16608/8 bytes.
>>> 16608/8
2076
So, the smallest you can store 5000 random digits as is 2076 bytes.
If you try to get it any smaller, you'll need to try and store more
than 8 bits in a byte, and since there are only 8 bits in a byte,
you've got a problem. Cino, if you want to mail me privately about
this, I would be more than happy to try and explain it further. But
I think that Jeff's right, and it should really go off-list now.
Later,
Blake.
--
11:04pm up 42 days, 11:11, 4 users, load average: 0.00, 0.00, 0.00