ka-ping yee tokenizer.py
Karl Kobata
karl.kobata at syncira.com
Wed Sep 17 14:02:06 EDT 2008
Aaran,
Thanks for your input. Your examples gave me other alternatives for what I
wanted to do, and they seem to work.
Thanks all for your help.
On Sep 16, 2:48 pm, "Karl Kobata" <karl.kob... at syncira.com> wrote:
> Hi Fredrik,
>
> This is exactly what I need. Thank you.
> I would like to do one additional function. I am not using the tokenizer to
> parse python code. It happens to work very well for my application.
> However, I would like either or both of the following variants:
> 1) I would like to add 2 other characters as comment designation
> 2) write a module that can readline, modify the line as required, and
> finally, this module can be used as the argument for the tokenizer.
>
> def modifyLine( fileHandle ):
>     # readline and modify this string if required
>     ...
>
> for token in tokenize.generate_tokens( modifyLine( myFileHandle ) ):
>     print token
>
> Anxiously looking forward to your thoughts.
> karl
>
> -----Original Message-----
> From: python-list-bounces+kkobata=syncira.... at python.org
>
> [mailto:python-list-bounces+kkobata=syncira.... at python.org] On Behalf Of
> Fredrik Lundh
> Sent: Monday, September 15, 2008 2:04 PM
> To: python-l... at python.org
> Subject: Re: ka-ping yee tokenizer.py
>
> Karl Kobata wrote:
>
> > I have enjoyed using ka-ping yee's tokenizer.py. I would like to
> > replace the readline parameter input with my own and pass a list of
> > strings to the tokenizer. I understand it must be a callable, iterable
> > object, but it is obvious from the errors I am getting that this is not
> > the only requirement.
>
> not sure I can decipher your detailed requirements, but to use Python's
> standard "tokenize" module (written by ping) on a list, you can simply
> do as follows:
>
> import tokenize
>
> program = [ ... program given as list ... ]
>
> for token in tokenize.generate_tokens(iter(program).next):
>     print token
>
> another approach is to turn the list back into a string, and wrap that
> in a StringIO object:
>
> import tokenize
> import StringIO
>
> program = [ ... program given as list ... ]
>
> program_buffer = StringIO.StringIO("".join(program))
>
> for token in tokenize.generate_tokens(program_buffer.readline):
>     print token
>
> </F>
>
> -- http://mail.python.org/mailman/listinfo/python-list
>
>
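On Python 3, where `iter(program).next` is spelled `iter(program).__next__` and `StringIO` lives in the `io` module, a rough sketch of both of Fredrik's approaches might look like this (the sample program lines are illustrative):

```python
import io
import tokenize

program = ["x = 1\n", "y = x + 2\n"]

# Approach 1: the bound __next__ of a list iterator serves as the
# readline callable; tokenize treats StopIteration as end-of-input.
tokens_from_iter = list(tokenize.generate_tokens(iter(program).__next__))

# Approach 2: join the lines back into one string and wrap it in
# io.StringIO, whose readline method has the expected signature.
buf = io.StringIO("".join(program))
tokens_from_buf = list(tokenize.generate_tokens(buf.readline))

# Both routes should yield the same token stream.
print([t.string for t in tokens_from_iter])
```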
This is an interesting construction:
>>> a= [ 'a', 'b', 'c' ]
>>> def moditer( mod, nextfun ):
...     while 1:
...         yield mod( nextfun( ) )
...
>>> list( moditer( ord, iter( a ).next ) )
[97, 98, 99]
Here's my point:
>>> a= [ 'print a', 'print b', 'print c' ]
>>> tokenize.generate_tokens( iter( a ).next )
<generator object at 0x009FF440>
>>> tokenize.generate_tokens( moditer( lambda s: s + '#', iter( a ).next ).next )
It adds a '#' to the end of every line, then tokenizes.
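A Python 3 sketch of the same line-modifying technique, with the generator's own `__next__` as the readline callable. The `;` comment marker below is only a hypothetical stand-in for the extra comment characters requested above; rewriting it to `#` before the tokenizer sees each line is one way to get custom comment designators without touching the tokenizer itself:

```python
import io
import tokenize

def modified_lines(readline, mod):
    """Yield each line from readline, transformed by mod, until EOF."""
    while True:
        line = readline()
        if not line:               # io.StringIO.readline returns '' at EOF
            return
        yield mod(line)

# Hypothetical custom comment character ';', translated to '#' so the
# stock tokenizer reports it as an ordinary COMMENT token.
source = io.StringIO("x = 1  ; set x\ny = 2\n")
lines = modified_lines(source.readline, lambda s: s.replace(";", "#"))

tokens = list(tokenize.generate_tokens(lines.__next__))
comments = [t.string for t in tokens if t.type == tokenize.COMMENT]
names = [t.string for t in tokens if t.type == tokenize.NAME]
print(comments, names)
```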