Parsing Indented Text (like parsing Python)

Mike Schinkel mikeschinkel at gmail.com
Sun Mar 11 10:01:13 EDT 2007


Gabriel Genellina wrote:
> > The problem is, how do I figure out how 
> > many spaces represent a tab? 
> 
> You can't, unless you have more context.

How does Python do it?

> > one case, someone could have their editor configured to
> > allow tabs to use 3 spaces and the user could
> > intermingle tabs and spaces. In other cases, a user
> > might have their editor configured to have a tab equal
> > 8 spaces yet also intermingle tabs and spaces. When a
> > human looks at the document it is obvious the setting
> > but how can I make it obvious to my program?
> > 
> "it is obvious the setting?" How do you infer that? From
> other properties of the document, semantics? Just from
> the content, and the number of tabs and spaces, you can't
> get anything.

From looking at it. It might require changing tab widths in the editor, but
one setting will make it clear.

> > I could force the user to specify tabwidth at the top
> > of the file, but I'd rather not. And since Python
> > doesn't either, I know it is possible to write a parser
> > to do this. I just don't know how.
> > 
> Python simply assumes 8 spaces per tab. If your Python
> source ONLY uses tabs, or ONLY spaces, it doesn't matter.
> If you mix tabs+spaces, Python simply replaces each tab
> by 8 spaces. If you edited that using 4 spaces, Python
> will get it wrong. That's why all people always say
> "never mix tabs and spaces"

Ah.  I didn't know that.  Like I said at first, I am new to Python (but I've
been reading a lot about it!  3 books thus far: Nutshell, Cookbook,
wxPython).
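
To make the rule Gabriel describes concrete, here is a minimal sketch of
measuring a line's indentation with tab stops every 8 columns. As far as I
understand the Python language reference, a tab advances to the next multiple
of eight rather than always adding exactly eight spaces, so that is what this
models. The name indent_width and its tabsize parameter are made up for
illustration; this is not Python's actual tokenizer code.

```python
def indent_width(line, tabsize=8):
    """Return the column of the first non-blank character in line.

    Assumption: a tab moves to the next multiple of tabsize (a tab stop),
    which is how I read the language reference's description.
    """
    col = 0
    for ch in line:
        if ch == " ":
            col += 1
        elif ch == "\t":
            # Jump forward to the next tab stop.
            col = (col // tabsize + 1) * tabsize
        else:
            break
    return col
```

Under this rule, four spaces followed by a tab land at column 8, the same as a
single tab. That is why a file edited with a 4-column tab width can parse
differently from how it looked in the editor.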

Okay, I'll just use a directive at the top of the file and let the user
specify.  Not perfect, but such is life.
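
A rough sketch of what that could look like, assuming a made-up directive of
the form "# tabwidth: N" on the first line of the file. The directive syntax
and the names read_tabwidth and indent_columns are hypothetical; only
str.expandtabs and the re module are real library pieces here.

```python
import re

# Hypothetical directive syntax: "# tabwidth: 4" on the first line of the file.
DIRECTIVE = re.compile(r"^#\s*tabwidth\s*[:=]\s*(\d+)\s*$")

def read_tabwidth(text, default=8):
    """Return (tabwidth, remaining_lines); use default if no directive."""
    lines = text.splitlines()
    if lines:
        match = DIRECTIVE.match(lines[0])
        if match:
            return int(match.group(1)), lines[1:]
    return default, lines

def indent_columns(text):
    """Return (indent_column, content) for each non-blank line."""
    tabwidth, lines = read_tabwidth(text)
    result = []
    for line in lines:
        if not line.strip():
            continue  # blank lines carry no indentation information
        # expandtabs pads each tab out to the next multiple of tabwidth.
        expanded = line.expandtabs(tabwidth)
        indent = len(expanded) - len(expanded.lstrip(" "))
        result.append((indent, expanded.strip()))
    return result
```

With the directive set to 3, a line starting with one tab and a line starting
with three spaces come out at the same column, so mixed tabs and spaces are
handled as long as the user states the width they edited with.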

Thanks again.

-- 
-Mike Schinkel
http://www.mikeschinkel.com/blogs/
http://www.welldesignedurls.org
http://atlanta-web.org - http://t.oolicio.us

 



