Feature Structures in Python (Part I)
James Tauber
jtauber at jtauber.com
Thu Jun 19 01:31:41 EDT 2003
This is the first part of a tutorial/cookbook I'm writing up on
implementing feature structures in Python.
I'd love any feedback (particularly on the Pythonicity of my approach).
James
###
### FEATURE STRUCTURES IN PYTHON (Part I)
###
### James Tauber - http://jtauber.com/
### Updated: 2003-06-19
#
# This is the first part of a guide to implementing feature structures
# in Python.
#
#
## BASICS
#
# Python dictionaries are a natural way to represent feature
# structures.
#
# For example, the feature structure
#
# [ NUMBER sg ]
# [ PERSON 3 ]
#
# can be represented:
fs1 = {
"NUMBER": "sg",
"PERSON": 3
}
#
# Similarly, nested feature structures can be represented using nested
# dictionaries. The feature structure
#
# [ CAT np ]
# [ AGR [ NUMBER sg ] ]
# [ [ PERSON 3 ] ]
#
# can be represented:
fs2 = {
"CAT": "np",
"AGR": {
"NUMBER": "sg",
"PERSON": 3
}
}
#
# Tests of feature structure equality work as expected
assert fs1 != fs2
assert fs1 == fs2["AGR"]
#
# It may be useful to later be able to add metadata to a feature, for
# example descriptions or constraints on the type of values
# allowed. To facilitate this, let's subclass the built-in string type.
class feature(str):
pass
NUMBER = feature("NUMBER")
PERSON = feature("PERSON")
NUMBER.some_field = "some value"
fs3 = {
NUMBER: "sg",
PERSON: 3
}
#
# Note that, by subclassing string we not only get the constructor for
# free but equality testing and compatibility with the plain string
# approach:
assert feature("NUMBER") == feature("NUMBER")
assert NUMBER == feature("NUMBER")
assert NUMBER == "NUMBER"
assert fs1 == fs3
##
## REENTRANCY
#
# Reentrancy can be achieved by using the same dictionary (i.e. the
# same id) in two places
CAT = feature("CAT")
HEAD = feature("HEAD")
AGR = feature("AGR")
SUBJ = feature("SUBJ")
fs4 = {
NUMBER: "sg",
PERSON: 3
}
fs5 = {
CAT: "s",
HEAD: {
AGR: fs4,
SUBJ: {
AGR: fs4
}
}
}
assert fs5[HEAD][AGR] == fs5[HEAD][SUBJ][AGR]
assert id(fs5[HEAD][AGR]) == id(fs5[HEAD][SUBJ][AGR])
assert fs5[HEAD][AGR] == fs4
assert id(fs5[HEAD][AGR]) == id(fs4)
assert fs5[HEAD][AGR] == fs1
assert id(fs5[HEAD][AGR]) != id(fs1)
#
# Special care must be taken though. If the feature structure bound to
# fs4 is to be modified, the following:
fs4 = {
NUMBER: "pl",
PERSON: 1
}
# won't work as all this does is bind fs4 to a new feature structure.
assert fs5[HEAD][AGR] != fs4
assert id(fs5[HEAD][AGR]) != id(fs4)
#
# Instead, the object must be modified in-place. This can be achieved
# with update():
fs5[HEAD][AGR].update(fs4)
assert fs5[HEAD][AGR] == fs4
#
# Note that the following:
fs5[HEAD][AGR] = fs4
# destroys the reentrancy:
assert id(fs5[HEAD][AGR]) != id(fs5[HEAD][SUBJ][AGR])
# and so update() should be used instead (unless destroying the
# reentrancy is desired).
#
# The del operator can be used to remove features. For example
del fs5[HEAD][AGR][NUMBER]
assert fs5[HEAD][AGR] == {PERSON: 1}
#
# If you wish to delete all features in a particular structure or
# sub-structure, you can iterate over the keys of the dictionary and
# delete each one.
for f in fs5[HEAD][SUBJ][AGR].keys():
del fs5[HEAD][SUBJ][AGR][f]
assert fs5[HEAD][SUBJ][AGR] == {}
#
# Note that, even though you can (in recent Python) iterate directly
# over a dictionary, this will cause a RuntimeError due to the
# dictionary changing size during iteration. Therefore, the use of
# keys() is required.
#
# This is likely to be common enough to warrant a function:
def del_features(feature_structure):
for f in feature_structure.keys():
del feature_structure[f]
del_features(fs5[HEAD])
assert fs5[HEAD] == {}
### CONTINUED IN Part II (subsumption, unification)
--
James Tauber
http://jtauber.com/
More information about the Python-list
mailing list