Weird behaviour from shlex - line no
Dave Angel
davea at davea.name
Sat Sep 28 09:29:19 EDT 2013
On 28/9/2013 02:26, Daniel Stojanov wrote:
> Can somebody explain this. The line number reported by shlex depends
> on the previous token. I want to be able to tell if I have just popped
> the last token on a line.
>
I agree that it seems weird. However, I don't think you have made
clear why it's not what you (and I) expect.
import shlex

def parseit(string):
    print
    print "Parsing -", string
    first = shlex.shlex(string)
    token = "dummy"
    while token:
        token = first.get_token()
        print token, " -- line", first.lineno

parseit("word1 word2\nword3")          #first
parseit("word1 word2,\nword3")         #second
parseit("word1 word2,word3\nword4")
parseit("word1 word2+,?\nword3")
This will display the lineno attribute for every token.
shlex is documented at:
http://docs.python.org/2/library/shlex.html
And lineno is documented on that page as:
"""shlex.lineno
Source line number (count of newlines seen so far plus one).
"""
It's not at all clear what "seen so far" is intended to mean, but in
practice, lineno has already been incremented by the time the last
token on a line is returned. Thus your first example:
Parsing - word1 word2
word3
word1 -- line 1
word2 -- line 2
word3 -- line 2
-- line 2
word2 has the incremented line number.
But when the last token on the line is not made of word characters (a
comma, for instance), lineno is not incremented for it. Thus your
second example:
Parsing - word1 word2,
word3
word1 -- line 1
word2 -- line 1
, -- line 1 #we would expect this to be "line 2"
word3 -- line 2
-- line 2
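One possible explanation, offered as a guess from the observed behaviour
rather than something the documentation states: shlex seems to find the
end of a word by reading one character past it, so the newline after
word2 has already been counted when word2 is returned, while a comma
ends word2 by itself and leaves the newline unread. The sketch below
records how many characters have been consumed after each token. The
trace() helper is made up for this illustration, and it relies on shlex
wrapping a string argument in a file-like object exposed as the
instream attribute.

```python
import shlex

def trace(text):
    """Return (token, lineno, chars_consumed) for every token in text."""
    lexer = shlex.shlex(text)
    rows = []
    while True:
        token = lexer.get_token()
        if not token:  # non-POSIX shlex returns '' at end of input
            break
        # tell() reports how far shlex has read in the underlying stream
        rows.append((token, lexer.lineno, lexer.instream.tell()))
    return rows

for sample in ("word1 word2\nword3", "word1 word2,\nword3"):
    for row in trace(sample):
        print(row)
```

In the first sample the character that ends word2 is the newline itself;
in the second it is the comma, and the newline is only read when shlex
goes looking for word3.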
Does anybody else have an explanation or advice for Daniel, other than
preprocessing the string by stripping any non-letters off the end of
each line?
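One alternative to stripping characters, sketched under the assumption
that no quoted string spans a line break: hand shlex one line at a time,
so that the last token of each line is known without consulting lineno
at all. The helper name tokens_with_eol is invented for this example.

```python
import shlex

def tokens_with_eol(text):
    """Yield (token, lineno, is_last_on_line) for each token in text."""
    for lineno, line in enumerate(text.splitlines(), 1):
        # Tokenize each physical line separately; shlex objects are iterable.
        tokens = list(shlex.shlex(line))
        for index, token in enumerate(tokens):
            yield token, lineno, index == len(tokens) - 1

for item in tokens_with_eol("word1 word2,\nword3"):
    print(item)
```

Here the line number comes from enumerate() rather than from shlex, so
it does not depend on what kind of token ends the line. A quote left
open at the end of a line would make shlex raise ValueError, so this
only suits input where every line tokenizes on its own.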
--
DaveA