[Tutor] regex grouping/capturing

Andreas Perstinger andipersti at gmail.com
Fri Jun 14 14:23:46 CEST 2013


On 14.06.2013 10:48, Albert-Jan Roskam wrote:
> I am trying to create a pygments regex lexer.

Well, writing a lexer is a little bit more complex than your original 
example suggested.

> Here's a simplified example of the 'set' command that I would like to
> parse.
> >>> s = 'set workspace = 6148 header on.'

As I understand it, the order of the parts following "set" is arbitrary,
i.e.
set workspace = 6148 header on.
is equivalent to
set header on workspace = 6148.
Is that correct?

I'm not sure whether a single regex can capture this.
But looking at the pygments docs, I think you need something along the
lines of the following (adapt the token names to your needs):

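from pygments.lexer import RegexLexer
from pygments.token import Keyword, Name, Text

class ExampleLexer(RegexLexer):
    tokens = {
        'root': [
            # Rules are tried in order; the first one that matches wins.
            (r'\s+', Text),
            (r'set\b', Keyword),
            (r'(?:workspace|header)\b', Name),
            # Anything else ('=', '6148', 'on.') stays plain text.
            (r'\S+', Text),
        ]
    }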
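
If you want to see the idea without pygments, here is a rough sketch
using only the standard "re" module. It mimics the way the rules in
'root' are tried in order at each position, so the order of the
subcommands doesn't matter. The group names and the tokenize() helper
are my own invention for illustration, not pygments token types or API.

```python
import re

# One alternation; alternatives are tried left to right at each position,
# much like the ordered rule list in the 'root' state.
# The group names are illustrative only, not pygments token types.
TOKEN_RE = re.compile(r"""
    (?P<whitespace>\s+)
  | (?P<keyword>set\b)
  | (?P<name>(?:workspace|header)\b)
  | (?P<text>\S+)
""", re.VERBOSE)

def tokenize(source):
    """Yield (token_name, value) pairs, skipping whitespace."""
    for match in TOKEN_RE.finditer(source):
        if match.lastgroup != "whitespace":
            yield match.lastgroup, match.group()

tokens = list(tokenize("set workspace = 6148 header on."))
```

Running tokenize() on the reordered command
'set header on workspace = 6148.' finds the same two names, just in a
different order.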

Does this help?

Bye, Andreas

