stumped by tricky logic

Mon Jan 30 00:03:16 EST 2006

"Dave" <davidworley at gmail.com> wrote in message
news:1138553712.522747.250050 at z14g2000cwz.googlegroups.com...
> So I'm trying to write a CSS preprocessor.
>
> I want to add the ability to append a selector onto other selectors.
> So, given the following code:
> =========================================
> #selector {
>
>                           { property: value; property: value; }
>         .other_selector   { property: value; property: value; }
>
>         #selector_2 {
>
>                  .more_selector { property: value; }
>
>         }
>
> }
> =========================================
>
> I want to return the following:
> =========================================
> #selector { property: value; property: value; }
> #selector .other_selector { property: value; property: value; }
> #selector #selector_2 .more_selector { property: value; }
> =========================================

Dave -

Since other posters have suggested parsing, here is a pyparsing stab at your
problem.  Pyparsing allows you to construct your grammar using readable
construct names, and can generate structured parse results.   Pyparsing also
has built-in support for skipping over comments.

This paper describes a prior use of pyparsing to parse CSS style sheets:
http://dyomedea.com/papers/2004-extreme/paper.pdf.  Google for "pyparsing
CSS" for some other possible references.

This was really more complex than I expected.  The grammar was not
difficult, but the recursive routine was trickier than I thought it would
be.  Hope this helps.

Download pyparsing at http://pyparsing.sourceforge.net.
-- Paul

=========================
data = """
#selector {

                          { property: value; /* a nasty comment */
                          property: value; }
        .other_selector   { property: value; property: value; }

        #selector_2 {
                 /* another nasty comment */
                 .more_selector { property: value; /* still another nasty comment */ }

        }

}
"""

from pyparsing import Literal,Word,Combine,Group,alphas,nums,alphanums,\
                       Forward,ZeroOrMore,cStyleComment,ParseResults

# define some basic symbols - suppress grouping and delimiting punctuation
# and let grouping do the rest
lbrace = Literal("{").suppress()
rbrace = Literal("}").suppress()
colon  = Literal(":").suppress()
semi   = Literal(";").suppress()
pound  = Literal("#")
dot    = Literal(".")

# define identifiers, property pattern, valid property values, and
# property list
ident        = Word(alphas,alphanums+"_")
pound_ident  = Combine(pound + ident)
dot_ident    = Combine(dot + ident)
prop_value   = Word(nums) | Word(alphanums)  # expand this as needed
property_def = Group( ident + colon + prop_value + semi )
prop_list    = Group( lbrace + ZeroOrMore( property_def ) +
                                   rbrace ).setResultsName("propList")

# define selector - must use Forward since selector is recursive
selector = Forward()
selector_contents = ( prop_list |
                      Group( dot_ident.setResultsName("name") + prop_list ) |
                      selector )
selector << Group( pound_ident.setResultsName("name") +
                   lbrace +
                   Group(ZeroOrMore( selector_contents )).setResultsName("contents") +
                   rbrace )

# C-style comments should be ignored
selector.ignore(cStyleComment)

# parse the data - this only works if data *only* contains a single selector
results = selector.parseString(data)

# use pprint to display list - you can navigate the results to construct
# the various selectors
import pprint
pprint.pprint( results[0].asList() )
print("")  # blank line between the two outputs

# if scanning through text containing other text than just selectors,
# use scanString, which returns a generator, yielding a tuple
# for each occurrence found
#
# for results,start,end in selector.scanString(cssSourceText):
#    pprint.pprint(results.asList())

# a recursive function to print out the names and property lists
def printSelector(res, namePath=()):
    # build one "path { name : value; ... }" line from a list of properties
    def fmt(path, props):
        return "%s { %s }" % (" ".join(path),
                              " ".join([ "%s : %s;" % tuple(p) for p in props ]))

    if res.name != "":
        # named selector - extend the path, then recurse or print
        subpath = list(namePath) + [res.name]
        if res.contents != "":
            for c in res.contents:
                printSelector(c, subpath)
        elif res.propList != "":
            print(fmt(subpath, res.propList))
        else:
            print(fmt(subpath, res))
    else:
        # unnamed property list - belongs to the enclosing selector
        print(fmt(namePath, res))

printSelector( results[0] )

=========================
This prints:
['#selector',
 [[['property', 'value'], ['property', 'value']],
  ['.other_selector', [['property', 'value'], ['property', 'value']]],
  ['#selector_2', [['.more_selector', [['property', 'value']]]]]]]

#selector { property : value; property : value; }
#selector .other_selector { property : value; property : value; }
#selector #selector_2 .more_selector { property : value; }
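
If all you have is the plain nested list (the asList() output shown above), the recursive step can also be sketched without pyparsing. This is only an illustration of the flattening idea, not part of the parser above: the helper names is_pair, format_rule and flatten are made up here, and the sketch assumes every node has the shape [name, body] seen in the printed list. It writes "property: value;" as in Dave's desired output.

```python
# Nested list copied from the pprint output above.
nested = ['#selector',
          [[['property', 'value'], ['property', 'value']],
           ['.other_selector', [['property', 'value'], ['property', 'value']]],
           ['#selector_2', [['.more_selector', [['property', 'value']]]]]]]

def is_pair(item):
    # A property is a two-element list of strings, e.g. ['property', 'value'].
    return (len(item) == 2
            and isinstance(item[0], str) and isinstance(item[1], str))

def format_rule(path, props):
    # Join the selector path and the properties into one CSS rule.
    body = " ".join("%s: %s;" % (k, v) for k, v in props)
    return "%s { %s }" % (" ".join(path), body)

def flatten(node, path=()):
    # node is [name, body]; body is either a list of properties or a
    # list of children (bare property lists or nested [name, body] nodes).
    name, body = node
    path = path + (name,)
    if all(is_pair(item) for item in body):
        # Assumption: a body made only of pairs (or empty) is a property list.
        yield format_rule(path, body)
        return
    for item in body:
        if item and isinstance(item[0], str):
            # Named child: recurse with the extended path.
            for line in flatten(item, path):
                yield line
        else:
            # Bare property list: it belongs to the current selector.
            yield format_rule(path, item)

for line in flatten(nested):
    print(line)
```

Passing the path down as a tuple means each recursive call gets its own copy, so sibling selectors never see each other's names.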