Ideas for parsing this text?
Paul McGuire
ptmcg at austin.rr.com
Wed Apr 23 22:05:35 EDT 2008
On Apr 23, 8:00 pm, "Eric Wertman" <ewert... at gmail.com> wrote:
> I have a set of files with this kind of content (it's dumped from WebSphere):
>
> [propertySet "[[resourceProperties "[[[description "This is a required
> property. This is an actual database name, and its not the locally
> catalogued database name. The Universal JDBC Driver does not rely on
> ...
A couple of comments first:
- What is the significance of '"[' vs. '['?  I stripped them all out
  using

      text = text.replace('"[','[')

- Your input text was missing 5 trailing ]'s.
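
A quick way to spot missing closers like that is to compare the counts of the two bracket characters. This is only a rough sketch: the helper name missing_closers and the short sample string are made up for illustration, and brackets that happen to sit inside quoted descriptions are counted too, so treat the number as a hint.

```python
def missing_closers(text, open_ch="[", close_ch="]"):
    """Return how many closing brackets are needed to balance text.

    A negative result means there are surplus closing brackets.
    Brackets inside quoted strings are counted as well, so this is
    a sanity check, not a real parse.
    """
    return text.count(open_ch) - text.count(close_ch)


# Hypothetical, heavily shortened stand-in for the WebSphere dump.
sample = "[propertySet [[resourceProperties [[[name databaseName] [value DB2Foo]"

needed = missing_closers(sample)
# Append the missing closers so a nested-bracket parser can consume it.
repaired = sample + "]" * max(needed, 0)
print(needed)
```

If the count comes out positive, padding the text with that many ]'s before parsing is usually enough to get a balanced structure.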
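Here's the parser I used, using pyparsing:

    from pyparsing import nestedExpr,Word,alphanums,QuotedString
    from pprint import pprint

    # text is the dumped input after the replace() above
    # a token is either a bare word or a (possibly multi-line) quoted string
    content = Word(alphanums+"_.") | QuotedString('"',multiline=True)
    # parse arbitrarily nested [ ... ] groups of those tokens
    structure = nestedExpr("[", "]", content).parseString(text)
    pprint(structure.asList())
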
Prints (I've truncated the long lines, but the long quoted strings do
parse intact):
[['propertySet',
  [['resourceProperties',
    [[['description',
       'This is a required \nproperty. This is an actual data...
      ['name', 'databaseName'],
      ['required', 'true'],
      ['type', 'java.lang.String'],
      ['value', 'DB2Foo']],
     [['description',
       'The JDBC connectivity-type of a data \nsource. If you...
      ['name', 'driverType'],
      ['required', 'true'],
      ['type', 'java.lang.Integer'],
      ['value', '4']],
     [['description',
       '"The TCP/IP address or host name for the DRDA server."'],
      ['name', 'serverName'],
      ['required', 'false'],
      ['type', 'java.lang.String'],
      ['value', 'ServerFoo']],
     [['description',
       'The TCP/IP port number where the \nDRDA server resides.'],
      ['name', 'portNumber'],
      ['required', 'false'],
      ['type', 'java.lang.Integer'],
      ['value', '007']],
     [['description', '"The description of this datasource."'],
      ['name', 'description'],
      ['required', 'false'],
      ['type', 'java.lang.String'],
      ['value', []]],
     [['description',
       'The DB2 trace level for logging to the \nlogWriter ...
      ['name', 'traceLevel'],
      ['required', 'false'],
      ['type', 'java.lang.Integer'],
      ['value', []]],
     [['description',
       'The trace file to store the trace output. \nIf you ...
]]]]]]]
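
Once you have the nested list, each property is just a list of [key, value] pairs, so it converts easily to a dict. The following is a sketch, not part of the parser above: pairs_to_dict and by_name are names I made up, and the properties list is a hand-written stand-in for two of the entries shown in the output. Judging from the printed nesting, the real list of properties would probably sit at structure.asList()[0][1][0][1], but check that against your own data.

```python
def pairs_to_dict(pairs):
    """Turn [['name', 'x'], ['value', 'y'], ...] into a dict.

    Assumes every entry is a two-item [key, value] list, as in the
    printed output. An empty-list value (['value', []]) becomes None.
    """
    result = {}
    for key, value in pairs:
        result[key] = None if value == [] else value
    return result


# Hand-written stand-in for part of the parsed structure, not real output.
properties = [
    [['name', 'databaseName'], ['required', 'true'],
     ['type', 'java.lang.String'], ['value', 'DB2Foo']],
    [['name', 'traceLevel'], ['required', 'false'],
     ['type', 'java.lang.Integer'], ['value', []]],
]

# Index the property dicts by their 'name' entry for easy lookup.
by_name = {d['name']: d for d in map(pairs_to_dict, properties)}
print(by_name['databaseName']['value'])
```

Mapping the empty list to None is just one choice; keeping it as [] or '' would work equally well if that suits the rest of your code better.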
-- Paul
The pyparsing wiki is at http://pyparsing.wikispaces.com.