Newbie question about text encoding

Dave Angel davea at davea.name
Tue Feb 24 15:41:44 EST 2015


On 02/24/2015 02:57 PM, Laura Creighton wrote:
> Dave Angel
> are you another Native English speaker living in a world where ASCII
> is enough?

I'm a native English speaker, and 7 bits is not nearly enough.  Even if 
I didn't currently care, I have some history:

No.  CDC display code is enough. Who needs lowercase?

No.  Baudot code is enough.

No.  EBCDIC is good enough.  Who cares about other companies?

No, the "golf-ball" only holds this many characters.  If we need more, 
we can just get the operator to switch balls in the middle of printing.

No.  2-digit years are enough.  This world won't last till the millennium 
anyway.

No.  2k is all the EPROM you can have.  Your code HAS to fit in it, and 
only 1.5k RAM.

No.  640k is more than anyone could need.

No, you cannot use a punch card made on a model 26 keypunch in the same 
deck as one made on a model 29.  Too bad, many of the codes are 
different.  (This one cost me travel back and forth between two 
different locations with different model keypunches.)

No. 8 bits is as much as we could ever use for characters.  Who could 
possibly need names or locations outside of this region?  Or from 
multiple places within it?

35 years ago I helped design a serial terminal that "spoke" Chinese, 
using a two-byte encoding.  But a single worldwide standard didn't come 
until much later, and I cheered Unicode when it was finally unveiled.

I've worked with many printers that could only print 70 or 80 unique 
characters.  The laser printer, and even the dot-matrix printer, are 
relatively recent inventions.

Getting back on topic:

According to:
    http://support.esri.com/cn/knowledgebase/techarticles/detail/27345

"""ArcGIS Desktop applications, such as ArcMap, are Unicode based, so 
they support Unicode to a certain level. The level of Unicode support 
depends on the data format."""

That page was written around 2004, so this was a concern even then.

And according to another page, """In the header of each shapefile (.DBF), a 
reference to a code page is included."""
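
If you want to see what that header reference says for a given file, 
something like the following rough sketch might help.  It assumes Python 3 
and the commonly documented dBASE layout, where the byte at offset 29 of 
the 32-byte header is the "language driver ID".  The function names and the 
small ID-to-codec table are mine, made up for illustration; the table is 
deliberately partial, so check it against the real format documentation 
before trusting it.

```python
# Partial, illustrative mapping from dBASE language driver IDs to Python
# codec names.  This is an assumption-laden subset, not a complete table.
LDID_TO_CODEC = {
    0x01: "cp437",   # US MS-DOS
    0x02: "cp850",   # International MS-DOS
    0x03: "cp1252",  # Windows ANSI
    0x57: "cp1252",  # ANSI
    0xC8: "cp1250",  # Windows Eastern European
    0xC9: "cp1251",  # Windows Cyrillic
}


def dbf_language_driver_id(header):
    """Return the language driver ID byte from a raw DBF header (bytes)."""
    if len(header) < 32:
        raise ValueError("a DBF header is at least 32 bytes long")
    # Indexing a bytes object yields an int in Python 3.
    return header[29]


def guess_codec(header, default=None):
    """Map the header's language driver ID to a codec name, if known."""
    return LDID_TO_CODEC.get(dbf_language_driver_id(header), default)


# Synthetic 32-byte header, built here only so the sketch runs on its own.
fake_header = bytearray(32)
fake_header[0] = 0x03    # dBASE III version byte
fake_header[29] = 0x57   # language driver ID: ANSI
print(guess_codec(bytes(fake_header)))
```

For a real file you would read the first 32 bytes in binary mode, e.g. 
open(path, "rb").read(32), and pass them to guess_codec().  A result of 
None only means the ID is not in this small table, not that the file has 
no encoding.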

-- 
DaveA


