[Tutor] Splitting strings and undefined variables

Andreas Kostyrka andreas at kostyrka.org
Tue Feb 10 19:45:47 CET 2009


On Mon, 9 Feb 2009 12:05:33 -0800,
Moos Heintzen <iwasroot at gmail.com> wrote:

> Hello all,
> 
> I was looking at this:
> http://www.debian.org/doc/manuals/reference/ch-program.en.html#s-python
> 
> I have a question about the line of code that uses split()
> 
> With the python version, the line below only works if there are three
> fields in line.
> 
> (first, last, passwd) = line.split()
> 
> Also, since the variables are used like this:
> 
> lineout = "%s:%s:%d:%d:%s %s,,/home/%s:/bin/bash\n" %  \
>                  (user, passwd, uid, gid, first, last, user)
> 
> I can't use ":".join(line.split())
> But maybe a dictionary could be used for string substitution.
> 
> In the perl version (above the python version in the link), the script
> works with the input line having one to three fields. Like "fname
> lname pw" or "fname lname"
> 
> ($n1, $n2, $n3) = split / /;
> 
> Is there a better way to extract the fields from line in a more
> flexible way, so that the number of fields could vary?
> I guess we could use conditionals to check each field, but is there a
> more elegant (or pythonic!) way to do it?

Well, the problem you are getting is probably a ValueError, meaning that
the number of items does not match the expected number:

>>> a, b = 1, 2, 3
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: too many values to unpack
>>> a, b, c, d = 1, 2, 3
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: need more than 3 values to unpack

Assuming that you want empty strings as the default, you can use:

a, b, c = (line.split() + ["", ""])[:3]

Ok, step by step:

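line.split() produces a list of strings, assuming that line is a
string. The list has at least one element as long as line contains
something other than whitespace; a blank line gives an empty list, in
which case you would need to pad with three empty strings instead of
two.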

line.split() + ["", ""] creates a new list with two empty strings added
to the end.

[:3] gives you the first three strings of that list.
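
As a rough sketch, the same idea can be wrapped in a small helper. The
name split_padded and its parameters are made up for illustration and
are not part of the original script. Padding with as many defaults as
fields requested means a blank line is covered too:

```python
def split_padded(line, count=3, default=""):
    """Split line on whitespace and return exactly `count` fields.

    Missing fields are filled with `default`; extra fields are dropped.
    (Hypothetical helper, for illustration only.)
    """
    fields = line.split()
    # Pad with `count` defaults so that even an empty list is long enough,
    # then cut the result back down to `count` items.
    return (fields + [default] * count)[:count]


first, last, passwd = split_padded("fname lname")
```

Here passwd ends up as the empty string, and a fourth field on the
line would be dropped silently, which is exactly the behaviour you
have to be sure you want.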

Generally speaking, the above should only be used when you really,
really know that you want to treat your data in such a way (ignore all
later fields, add empty strings). It has real potential for later
debugging pains when the data turns out to be different from what you
expected.

Another way would be:

a, b, c = "defaultA", "defaultB", "defaultC"
try:
	flds = line.split()
	a = flds[0]
	b = flds[1]
	c = flds[2]
except IndexError:
	pass

That still ignores any errors coming your way. Usually it's better to
check len(flds) explicitly.
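
A minimal sketch of such a length check follows. It assumes that only
two or three fields are acceptable (as in "fname lname" and
"fname lname pw" from the question) and that a missing password
defaults to an empty string. The function name parse_fields is
hypothetical:

```python
def parse_fields(line):
    """Return (first, last, passwd); passwd defaults to an empty string.

    Hypothetical helper: which field counts are valid is an assumption.
    """
    flds = line.split()
    if len(flds) == 3:
        first, last, passwd = flds
    elif len(flds) == 2:
        first, last = flds
        passwd = ""
    else:
        # Anything else is unexpected input: fail loudly instead of guessing.
        raise ValueError("expected 2 or 3 fields, got %d: %r"
                         % (len(flds), line))
    return first, last, passwd
```

Unlike the try/except version, a line with too few or too many fields
stops the program with a message that names the offending line.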

Generally, defensive programming (as in processing ANY data given,
e.g. HTML parsing in browsers) is sometimes necessary, but often not
such a good idea: one usually prefers an error message to faulty
output data. Nothing is more embarrassing than contacting your
customers to tell them that the billing program was faulty and you
billed them too much for the last two years *g*.

Andreas



> 
> Moos
> 
> P.S. I'm not a Perl user, I was just reading the examples. I've been
> using C and awk for few years, and Python for few months. Also, I know
> blank passwords aren't very practical, but I'm just asking this to
> explore possibilities :)

