Py3.3 unicode literal and input()

Dave Angel d at davea.name
Mon Jun 18 10:42:31 EDT 2012


On 06/18/2012 10:00 AM, jmfauth wrote:
> <SNIP>

> A string is a string, a "piece of text", period. I do not see why a
> unicode literal and an (well, I do not know how the call it) a "normal
> class <str>" should behave differently in code source or as an answer
> to an input(). 

Wrong.  The rules for parsing source code are NOT applied in general to
Python 3's input data, nor to file I/O done with methods like
myfile.readline().  We do not expect the runtime code to look for def
statements, nor for class statements, nor for literals.  A literal
is a portion of source code to which specific rules are applied,
starting with the presence of some quote characters.

This is true of nearly all languages, and in most languages, the
difference is so obvious that the question seldom gets raised.  For
example, in C code a literal is evaluated at compile time, and by the
time an end user sees an input prompt, he probably doesn't even have a
compiler on the same machine.

When an end user types in his data (typically at an input() prompt),
he does NOT use quoted literals, he does not use hex escape codes, and
he does not escape things with backslashes.  If he wants an o with an
umlaut on it, he'd better have such a character available on his
keyboard.

I'd suggest playing around a little with literal assignments, input()
calls and print() calls.  In those literals, try entering escape
sequences (e.g. "ab\x41cd").  Run such programs from the command line,
and observe the output from the prints.  Do this without using the
interactive interpreter, as by default it "helpfully" displays
expressions with the repr() function, which confuses the issue.
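
As a minimal sketch of that experiment (the variable names are just
illustrative, and the raw string is only a stand-in for what input()
would return if the user typed those eight characters), something like
this could be saved to a file and run from the command line:

```python
# A literal in source code: the compiler turns \x41 into the single
# character 'A' before the program ever runs.
from_source = "ab\x41cd"

# Stand-in for text typed at an input() prompt: the keystrokes arrive
# unprocessed, so the backslash, 'x', '4' and '1' stay four separate
# characters.  (A raw string is used so no keyboard is needed.)
from_user = r"ab\x41cd"

print(from_source)        # abAcd
print(len(from_source))   # 5
print(from_user)          # ab\x41cd
print(len(from_user))     # 8
```

Replacing the raw string with an actual input() call and typing
ab\x41cd at the prompt should give the same second pair of lines.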


> Should a user write two derived functions? input_for_entering_text()
> and input_if_you_are_entering_a_text_as_litteral() --- Side effect
> from the unicode litteral reintroduction. I do not mind about this,
> but I expect it does work logically and correctly. And it does not. PS
> English is not my native language. I never know to reply to an
> (interro)-negative sentence. jmf 

The user doesn't write functions, the programmer does.  Until you learn
to distinguish between those two phases, you'll continue having this
confusion.

If you (the programmer) want a function that asks the user to enter a
literal at the input prompt, you'll have to write a post-processing
step for it, which looks for prefixes, for quotes, for backslashes,
etc., and decodes the result.  There very well may be such a decoder in
the Python library, but input() does nothing of the kind.
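
One candidate for such a decoder is ast.literal_eval() from the standard
library, which evaluates a string containing a Python literal.  The
following is only a sketch under that assumption; parse_string_literal
is a hypothetical helper name, not something input() or the library
provides:

```python
import ast

def parse_string_literal(text):
    """Interpret text as a Python string literal (hypothetical helper)."""
    # literal_eval applies the source-code rules: quotes, prefixes, escapes.
    value = ast.literal_eval(text.strip())
    if not isinstance(value, str):
        raise ValueError("expected a string literal, got %s"
                         % type(value).__name__)
    return value

# What the user would have to type, quotes included.  In a real program
# this would come from input().
typed = r'"ab\x41cd"'
print(parse_string_literal(typed))   # abAcd
```

Note that the user then has to type the quotes as well, and malformed
text raises ValueError or SyntaxError, which the caller would need to
handle.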


The literal modifiers (u""  or r"") are irrelevant here.  The "problem"
you're having is universal, and not new.  The characters in source code
have different semantic meanings than those entered in input, or read
from file I/O.
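
A small sketch of why the prefixes do not matter for input data,
assuming Python 3.3 or later (where the u prefix is accepted again):
they only change how the compiler reads the literal, and the resulting
objects are ordinary str values either way.

```python
plain = "caf\xe9"
prefixed = u"caf\xe9"   # u prefix: same str object type, same contents
raw = r"caf\xe9"        # r prefix: the backslash is kept as a character

print(plain == prefixed)               # True
print(type(plain) is type(prefixed))   # True
print(len(plain), len(raw))            # 4 7
```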


-- 

DaveA



