[Tutor] Coma separated instead TAB separated

Steven D'Aprano steve at pearwood.info
Mon Jun 9 03:05:21 CEST 2014


On Sun, Jun 08, 2014 at 12:56:40PM -0600, Mario Py wrote:
> Hi everyone, this is very basic/beginner question.
> 
> I'm reading TXT file, two words per line that are separated by TAB:
> 
> question, rightAnswer = line.strip().split('\t')

For a simple format like this, that's perfectly acceptable, but for more 
advanced data, you should investigate the csv module.

https://docs.python.org/3/library/csv.html
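As a rough sketch of what the csv module buys you (the sample data here is invented for illustration, and io.StringIO stands in for a real file opened with open()), note that it copes with a comma inside a quoted field, which a plain split would cut in two:

```python
import csv
import io

# Made-up sample standing in for the quiz file. The second row has a
# comma inside a quoted field, which a plain split(",") would break.
sample = io.StringIO('capital of France,Paris\n"one, two, ?",three\n')

pairs = []
for row in csv.reader(sample):   # the default delimiter is a comma
    question, rightAnswer = row
    pairs.append((question, rightAnswer))

print(pairs)
```

For the original TAB-separated file, csv.reader(f, delimiter='\t') should do the same job.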

> I would like to use TXT file that it would be separated by coma.
> How do I change that line of code?
> 
> I tried these two versions but it is not working:
> 
> question, rightAnswer = line.strip().split('\c')    # c for coma?
> question, rightAnswer = line.strip().split('\,')    # , for coma?

How about "," for comma?
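Something like this, assuming each line holds exactly one comma (the sample line is made up):

```python
line = "capital of France,Paris\n"   # made-up sample line

# A comma is an ordinary character, so no backslash is needed.
question, rightAnswer = line.strip().split(",")

print(question)
print(rightAnswer)
```

If a line has no comma, or more than one, the two-name unpacking will raise ValueError, so this only suits data as simple as yours.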

You only need a backslash-escape for special characters that are 
impossible to type or otherwise tricky or inconvenient to include in 
strings. Tabs are tricky, because they're invisible and look like 
spaces.

Here is a list of the escape sequences allowed:

\a	BEL (bell)
\b	BS (backspace)
\f	FF (formfeed)
\n	LF (linefeed or newline)
\r	CR (carriage return)
\t	HT (horizontal tab)
\v	VT (vertical tab)
\0	NUL (that's a zero, not the letter Oh)
\\	Backslash
\'	Single quote
\"	Double quote


Of these, the most common one by far is \n.
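A quick way to convince yourself that each of these escapes is a single character, even though you type two, is to ask for its length and ordinal. The names in this little table are just labels chosen for the example:

```python
# Each escape is written with two keystrokes but is one character.
escapes = [("tab", "\t"), ("newline", "\n"), ("backslash", "\\"), ("NUL", "\0")]

for name, s in escapes:
    # repr() shows the escaped spelling rather than the raw character
    print(name, len(s), ord(s), repr(s))
```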

There are also escape sequences for arbitrary characters:

\ooo	Character ooo (one to three digits) in octal (base eight)
\xdd	Character dd (exactly two digits) in hexadecimal (base sixteen)

In both the \ooo and \xdd cases, the value should stay in the range 0 
through 255. In octal, that's 0 through 377, or in hex it is 0 to FF.
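For instance, the letter A is character 65, which is 101 in octal and 41 in hex, so these two spellings should give the same one-character string (the variable names are only for the example):

```python
hex_A = "\x41"    # two hex digits: 0x41 == 65
oct_A = "\101"    # octal 101 == 65
print(hex_A, oct_A, hex_A == oct_A)

top_hex = "\xff"  # largest two-digit hex escape
top_oct = "\377"  # the same value written in octal
print(ord(top_hex), ord(top_oct))
```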

A backslash followed by a newline (end of line) is a line continuation, 
that is, the newline is ignored:

s = "this is a really, really, really \
long string."

In Unicode strings, you can also use:

\udddd		Unicode code point U+dddd (four digits) in hexadecimal
\Udddddddd	Same, but eight digits
\N{name}	Unicode character called "name"

(They must be exactly 4 digits or 8 digits, nothing in between).
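A small sketch in Python 3, where every string is a Unicode string. The three spellings below should all name the same character, and the last line shows a code point that only fits the eight-digit form:

```python
a = "\u00e9"        # exactly four hex digits
b = "\U000000e9"    # exactly eight hex digits
c = "\N{LATIN SMALL LETTER E WITH ACUTE}"
print(a == b == c)

# Code points above U+FFFF need the eight-digit form.
snake = "\U0001F40D"
print(len(snake), hex(ord(snake)))
```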

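Last but not least, any other backslash escape, such as \c or \, gets left 
alone: the backslash stays in the string, so '\,' is two characters, a 
backslash followed by a comma. That is why neither of your two attempts 
found anything to split on.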
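To see what that means for the two attempts in the question, here is a sketch with a made-up sample line. The separator is spelled "\\," rather than '\,' because newer Python versions warn about unrecognised escapes, but both spellings produce the same two-character string:

```python
line = "capital of France,Paris"   # made-up sample line

# Backslash followed by comma: two characters, not one.
sep = "\\,"
print(len(sep))

# The line contains no backslash, so nothing is split.
no_split = line.split(sep)
print(no_split)

# A plain comma is all that is needed.
good_split = line.split(",")
print(good_split)
```

Since the first split returns a one-item list, unpacking it into question and rightAnswer is what fails.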



-- 
Steven

