historic grail python browser "semi-recovered"

MRAB python at mrabarnett.plus.com
Thu Jun 10 14:17:24 EDT 2010


lkcl wrote:
> On Jun 9, 11:03 pm, rantingrick <rantingr... at gmail.com> wrote:
>> On Jun 9, 4:29 pm, lkcl <luke.leigh... at gmail.com> wrote:
>>
>>> um, please don't ask me why but i found grail, the python-based web
>>> browser, and have managed to hack it into submission sufficiently to
>>> view e.g. http://www.google.co.uk.  out of sheer apathy i happened to
>>> have python2.4 still installed which was the only way i could get it
>>> to run without having to rewrite regex expressions (which i don't
>>> understand).
>>> if anyone else would be interested in resurrecting this historic web
>>> browser, just for fits and giggles, please let me know.
>> Hi lkcl,
>>
>> My current conquest to bring a new (or fix the current GUI) in
>> Python's stdlib is receiving much resistance. I may need a project to
>> convince my opponents of my worth. Tell you what i'll do: send me a text
>> file with a pathname and all the line numbers that have broken regexs
>> using a common sep --space is fine for me-- and i'll fix them for you.
>> Here is a sample...
> 
>  ok i've committed a file REGEX.CONVERSIONS.REQUIRED into the git
> repository,
> http://github.com/lkcl/grailbrowser
> git://github.com/lkcl/grailbrowser.git
> 
>  i used "grep -n" so it's filename:lineno:  {ignore the actual stuff}
> 
>  unfortunately, SGMLLexer.py contains some _vast_ regexs spanning 5-6
> lines, which means that a simple grep ain't gonna cut it.  there's a
> batch of regex's spanning from line 650 to line 699 and a few more
> besides.
> 
>  of course, it has to be borne in mind that this code was written for
> python 1.5 initially, at a time when python xml/sax/dom/sgml code
> probably didn't exist.
> 
>  but leaving aside the fact that it all needs to be ripped up and
> modernised i'm more concerned about getting these 35,000 lines of code
> operational, doing as small transitions as possible.
> 
The old regex module was called 'regex'. I see that 're' is already used
as a name in the code.

As for the regexes themselves, the equivalents for the current 're'
module are:

     regex                    re
     \(                       (
     \)                       )
     \|                       |
     (                        \(
     )                        \)
     |                        \|
     casefold                 IGNORECASE
     regex.match(...) >= 0    re.match(...)
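
Something along these lines might do the mechanical part of the table above.
It is only a rough sketch: convert_pattern and SWAPPED are names made up here,
not anything in Grail or the stdlib. It swaps the escaped and unescaped forms
of the three characters in the table and copies everything else through
unchanged. It makes no attempt to treat character classes ([...]) specially,
so the long patterns in SGMLLexer.py would still need checking by hand.

```python
import re

# Characters whose escaped/unescaped meanings are swapped between the
# old 'regex' syntax and the current 're' syntax (per the table above).
SWAPPED = "()|"


def convert_pattern(old):
    """Translate an old 'regex'-style pattern to 're' syntax (hypothetical helper).

    Assumption: character classes are not handled specially.
    """
    out = []
    i = 0
    while i < len(old):
        ch = old[i]
        if ch == "\\" and i + 1 < len(old):
            nxt = old[i + 1]
            if nxt in SWAPPED:
                out.append(nxt)        # \( \) \|  ->  ( ) |
            else:
                out.append(ch + nxt)   # any other escape is copied as-is
            i += 2
        elif ch in SWAPPED:
            out.append("\\" + ch)      # ( ) |  ->  \( \) \|
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# The old "regex.match(...) >= 0" test becomes a test on the match object,
# and casefold becomes the re.IGNORECASE flag.
new_pattern = convert_pattern(r"\(foo\|bar\)(x)")
matched = re.match(new_pattern, "BAR(x)", re.IGNORECASE) is not None
```

Running each line listed in REGEX.CONVERSIONS.REQUIRED through a helper like
this would only give a first pass; the call sites that compare the result of
match() with 0 still have to be changed separately.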


