hex dump w/ or w/out utf-8 chars

Steven D'Aprano steve+comp.lang.python at pearwood.info
Sat Jul 13 05:49:10 EDT 2013


On Sat, 13 Jul 2013 00:56:52 -0700, wxjmfauth wrote:

> You are confusing the knowledge of a coding scheme and the intrisinc
> information a "coding scheme" *may* have, in a mandatory way, to work
> properly. These are conceptualy two different things.

*May* have, in a *mandatory* way?

JMF, I know you are not a native English speaker, so you might not be 
aware just how silly your statement is. If it *may* have, it is optional, 
since it *may not* have instead. But if it is optional, it is not 
mandatory.

You are making so much fuss over such a simple, obvious implementation 
for strings. The language Pike has done the same thing for probably a 
decade or so.
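
To show how simple the idea is, here is a rough sketch in Python 3 (3.4 or 
later). The function name bytes_per_char is made up for illustration, and 
this is only a model of the rule, not how CPython implements it in C: each 
string is stored at the narrowest fixed width that can hold its highest 
code point.

```python
# Illustrative model only: the real work happens in C inside the
# interpreter. bytes_per_char is a hypothetical helper, not a Python API.
def bytes_per_char(s):
    """Return the narrowest fixed width (1, 2 or 4 bytes) that can
    hold every code point in s."""
    highest = max(map(ord, s), default=0)  # empty string counts as width 1
    if highest < 0x100:      # fits in one byte (Latin-1 range)
        return 1
    if highest < 0x10000:    # fits in two bytes (Basic Multilingual Plane)
        return 2
    return 4                 # anything beyond the BMP needs four bytes


for s in ("abc", "caf\u00e9", "\u20ac", "\U0001F600"):
    print(ascii(s), bytes_per_char(s))
```

The four sample strings come out as 1, 1, 2 and 4 bytes per character. The 
string itself never has to "know" anything beyond its own widest character.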

Ironically, Python has done the same thing for integers for many versions 
too. They just didn't call it "Flexible Integer Representation", but 
that's what it is. On a 32-bit build, integers smaller than 2**31 are 
stored as C longs (plus object overhead). Integers of 2**31 or larger are 
promoted to a BigNum implementation that can handle arbitrarily many 
digits, limited only by memory.

Using Python 2.7, where it is more obvious because the BigNum has an L 
appended to the display, and a different type:

py> import sys
py> for n in (1, 2**20, 2**30, 2**31, 2**65):
...     print repr(n), type(n), sys.getsizeof(n)
...
1 <type 'int'> 12
1048576 <type 'int'> 12
1073741824 <type 'int'> 12
2147483648L <type 'long'> 18
36893488147419103232L <type 'long'> 22


You have been using Flexible Integer Representation for *years*, and it 
works great, and you've never noticed any problems.
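
For comparison, here is a sketch of the same experiment in Python 3, 
assuming CPython. There is only one int type there, so the type and repr 
no longer change and only the size gives the game away. The exact byte 
counts depend on the build, so I haven't shown any.

```python
import sys

# Python 3 merged int and long into a single type, so the promotion to a
# bigger representation is visible only through the object's size.
values = (1, 2**20, 2**30, 2**31, 2**65, 2**200)
sizes = [sys.getsizeof(n) for n in values]

for n, size in zip(values, sizes):
    print(repr(n), type(n).__name__, size)
```

The sizes should never shrink as the values grow, and the largest value 
should take more room than the smallest.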



-- 
Steven


