Tokenizing a string
Fredrik Lundh
effbot at telia.com
Sat Mar 18 10:41:42 EST 2000
Michael Dartt <mad96 at hampshire.edu> wrote:
> I've got a string I'd like to tokenize, but it's not in a file, and it'd
> be rather inefficient to write it to a file just to tokenize it. Is
> there any function I can use to pass this string to
> tokenize.tokenize()?
the "tokenize" function takes any callable which returns
a new line of code for each call, and an empty string when
it runs out of data.
the easiest way to use this on a string is to wrap the
string in a StringIO object, and pass the readline method
to the tokenizer:
import tokenize
import StringIO
prog = "print 'hello'\n"
tokenize.tokenize(StringIO.StringIO(prog).readline)
## this prints:
##
## 1,0-1,5: NAME 'print'
## 1,6-1,13: STRING "'hello'"
## 1,13-1,14: NEWLINE '\012'
## 2,0-2,0: ENDMARKER ''
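for readers on Python 3, where StringIO lives in the io module and the tokenizer yields tokens rather than printing them, the same idea can be sketched roughly as follows. this assumes io.StringIO and tokenize.generate_tokens from the Python 3 standard library; the helper name tokenize_string is made up for illustration and is not part of the original post:

```python
import io
import tokenize

def tokenize_string(source):
    """Return (token type name, token text) pairs for a source string."""
    # Hypothetical helper: wrap the string so that each readline() call
    # hands the tokenizer one line, and "" once the data runs out.
    readline = io.StringIO(source).readline
    return [
        (tokenize.tok_name[tok.type], tok.string)
        for tok in tokenize.generate_tokens(readline)
    ]

prog = "print('hello')\n"
tokens = tokenize_string(prog)
for name, text in tokens:
    print(name, repr(text))
```

each yielded token also carries start and end positions (tok.start, tok.end), which correspond to the row,column ranges in the printed output above.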
alternatively, you can use your own wrapper, such as:
import string
class Wrapper:
    def __init__(self, program):
        self.prog = string.split(program, "\n")
        if program[-1:] == "\n":
            del self.prog[-1] # trim tail
    def __call__(self):
        try:
            return self.prog.pop(0) + "\n"
        except IndexError:
            return "" # end of list
tokenize.tokenize(Wrapper(prog))
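a hand-written wrapper of this kind can be sketched for Python 3 as well. this is only an illustration, assuming str.splitlines(keepends=True) and tokenize.generate_tokens from the Python 3 standard library; line_reader is a hypothetical name, not something from the post above:

```python
import tokenize

def line_reader(program):
    """Return a callable that yields one source line per call, then ""."""
    # keepends=True preserves each line's trailing newline.
    lines = iter(program.splitlines(keepends=True))
    def readline():
        # next() with a default returns "" once the lines are exhausted,
        # which is the end-of-data signal the tokenizer expects.
        return next(lines, "")
    return readline

source = "x = 1\ny = x + 2\n"
names = [
    tok.string
    for tok in tokenize.generate_tokens(line_reader(source))
    if tok.type == tokenize.NAME
]
print(names)
```

the closure plays the same role as the __call__ method: the tokenizer only needs something it can call repeatedly until it gets an empty string back.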
hope this helps!
</F>
<!-- (the eff-bot guide to) the standard python library:
http://www.pythonware.com/people/fredrik/librarybook.htm
-->