Pound sign problem

Steve D'Aprano steve+python at pearwood.info
Tue Apr 11 14:06:02 EDT 2017


On Wed, 12 Apr 2017 02:23 am, Lew Pitcher wrote:

> I recommend whatever encoding is appropriate for the output. 

There are multiple encodings that are appropriate for ASCII + pound sign.
How should the OP choose between them without guidance? If he understood
the issue well enough to make an informed decision, he wouldn't have needed
to ask for help.
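
To make the ambiguity concrete, here is a minimal sketch, assuming Python 3.
The list of encodings is chosen purely for illustration and does not come
from the thread, and `encode_all` is a hypothetical helper name. It shows
the same pound sign turning into different bytes depending on the encoding
picked:

```python
# One character, several plausible byte encodings (Python 3 assumed).
POUND = "\u00a3"  # the pound sign, U+00A3

def encode_all(text, encodings=("latin-1", "cp1252", "cp437", "utf-8")):
    """Return a dict mapping each encoding name to the encoded bytes."""
    return {name: text.encode(name) for name in encodings}

encoded = encode_all(POUND)
for name, raw in encoded.items():
    print(name, raw)
```

Latin-1 and Windows-1252 both store the pound sign as the single byte 0xA3,
the old DOS code page 437 uses 0x9C, and UTF-8 uses the two bytes 0xC2 0xA3.
All of them are "appropriate" in some sense, which is why the OP needs a
recommendation rather than a shrug.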


> That's not up 
> to you or me to decide; that's a question that only the OP can answer.

Nobody is asking you to *decide*. But you can make a recommendation. Do you
really think that the OP is capable of making an informed decision about
this issue on his own? If he were, he wouldn't have needed to ask for help
solving this problem in the first place.

If you're going to help, actually *help*, and don't just pretend to help:

"Hi, I'm a stranger in town and I'm trying to get to the post office. What's
the best way for me to get there please?"

"Well, that depends on whether you're flying the Space Shuttle, travelling
by sailing ship, dog sled, or advanced alien hyperdrive. You should take
whatever route is most appropriate for your transportation. You're
welcome."

I'm sorry to be so negative when you're only trying to be helpful, but I too
have been on the receiving end of poor-quality "advice" that leaves me just
as much in the dark as before I asked the question, so I'm quite sensitive
to it.

"What should I do here?"

"Do whatever you see fit."

(I'm not specifically referring to this community, just making a general
observation.)



> (Imagine, python on an IBM Zseries running ZOS; 

I can imagine many unlikely things that have come to pass, but that's not
one of them.

The OP is using Pandas, which requires Python 2.7 or better.

https://pypi.python.org/pypi/pandas

There is an unofficial, unmaintained(?), third-party port of Python 2.4 to
z/OS, which appears to have had no attention for more than a decade.

http://www.teaser.fr/~jymengant/mvspython/mvsPythonPort.html


I suppose it is just barely within the realm of possibility that the OP has
hacked together his own port of Python 2.7 and Pandas to z/OS. If so, he'd
have already had to deal with some much bigger problems relating to ASCII
versus EBCDIC, and if he managed to solve those, it's unlikely that he'd be
puzzled by a pound sign in his data.

But... even if I grant you your scenario that he's running on Big Iron, that
is irrelevant! Using Unicode for his data files is still the better idea.
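
As a rough sketch of what "Unicode for his data files" can look like in
practice (Python 3 assumed; the `roundtrip` helper and the sample text are
hypothetical, not the OP's code), the encoding is named explicitly on both
the write and the read, so the platform's default never enters into it:

```python
import os
import tempfile

def roundtrip(text, encoding="utf-8"):
    """Write text to a temporary file and read it back with the same encoding."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        # Name the encoding explicitly instead of relying on the OS default.
        with open(path, "w", encoding=encoding) as f:
            f.write(text)
        with open(path, "r", encoding=encoding) as f:
            return f.read()
    finally:
        os.remove(path)

price = "Price: \u00a39.99"  # contains a pound sign
print(roundtrip(price) == price)
```

The same file should then read back identically on any machine, whatever
its native character set happens to be.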


> the "native" characterset 
> is one of the EBCDIC variants. Would UTF-8 be a better choice there? )

Yes it would.

The OP is using Unicode strings, so regardless of the OS's native character
set, it is better to use Unicode rather than some 8-bit encoding. Today the
OP needs a pound sign. Tomorrow he may need a Greek Σ, yen sign, CJK
ideograph, or Arabic character. Possibly all in the same document. Using
legacy encodings, whether based on EBCDIC or ASCII, should be avoided.
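
A small sketch of that point, assuming Python 3 (the sample string and the
`can_encode` helper are made up for illustration): UTF-8 can encode a
document mixing all of those characters, while 8-bit code pages, whether
ASCII-based or EBCDIC-based, cannot.

```python
# Pound sign, Greek capital sigma, yen sign, a CJK ideograph, an Arabic letter.
text = "\u00a3 \u03a3 \u00a5 \u4e2d \u0639"

def can_encode(text, encoding):
    """Return True if every character in text is representable in encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True

# latin-1 and cp1252 are ASCII-based; cp037 is one of the EBCDIC code pages.
for name in ("utf-8", "latin-1", "cp1252", "cp037"):
    print(name, can_encode(text, name))
```

Only UTF-8 reports True here. Each of the three 8-bit code pages handles
the pound sign on its own, but fails as soon as the Greek letter turns up.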



-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.



