Feature Structures in Python (Part I)

James Tauber jtauber at jtauber.com
Thu Jun 19 01:31:41 EDT 2003


This is the first part of a tutorial/cookbook I'm writing up on
implementing feature structures in Python.

I'd love any feedback (particularly on the Pythonicity of my approach).

James

###
### FEATURE STRUCTURES IN PYTHON (Part I)
###
### James Tauber - http://jtauber.com/
### Updated: 2003-06-19
#
# This is the first part of a guide to implementing feature structures
# in Python.
#
#
## BASICS
#
# Python dictionaries are a natural way to represent feature
# structures.
#
# For example, the feature structure
#
#       [  NUMBER   sg  ]
#       [  PERSON   3   ]
#
# can be represented:

fs1 = {
    "NUMBER": "sg",
    "PERSON": 3
}

#
# Similarly, nested feature structures can be represented using nested
# dictionaries. The feature structure
#
#        [ CAT     np              ]
#        [ AGR     [  NUMBER sg ]  ]
#        [         [  PERSON 3  ]  ]
#
# can be represented:

fs2 = {
    "CAT": "np",
    "AGR": {
        "NUMBER": "sg",
        "PERSON": 3
    }
}

#
# Tests of feature structure equality work as expected

assert fs1 != fs2
assert fs1 == fs2["AGR"]

#
# It may be useful to later be able to add metadata to a feature, for
# example descriptions or constraints on the type of values
# allowed. To facilitate this, let's subclass the built-in string type.

class feature(str):
    pass

NUMBER = feature("NUMBER")
PERSON = feature("PERSON")

NUMBER.some_field = "some value"

fs3 = {
    NUMBER: "sg",
    PERSON: 3
}

#
# Note that, by subclassing string we not only get the constructor for
# free but equality testing and compatibility with the plain string
# approach: 

assert feature("NUMBER") == feature("NUMBER")
assert NUMBER == feature("NUMBER")
assert NUMBER == "NUMBER"
assert fs1 == fs3

##
## REENTRANCY
#
# Reentrancy can be achieved by using the same dictionary (i.e. the
# same id) in two places

CAT = feature("CAT")
HEAD = feature("HEAD")
AGR = feature("AGR")
SUBJ = feature("SUBJ")

fs4 = {
    NUMBER: "sg",
    PERSON: 3
}

fs5 = {
    CAT: "s",
    HEAD: {
        AGR: fs4,
        SUBJ: {
            AGR: fs4
        }
    }
}

assert fs5[HEAD][AGR] == fs5[HEAD][SUBJ][AGR]
assert id(fs5[HEAD][AGR]) == id(fs5[HEAD][SUBJ][AGR])

assert fs5[HEAD][AGR] == fs4
assert id(fs5[HEAD][AGR]) == id(fs4)

assert fs5[HEAD][AGR] == fs1
assert id(fs5[HEAD][AGR]) != id(fs1)

#
# Special care must be taken though. If the feature structure bound to
# fs4 is to be modified, the following:

fs4 = {
    NUMBER: "pl",
    PERSON: 1
}

# won't work as all this does is bind fs4 to a new feature structure.

assert fs5[HEAD][AGR] != fs4
assert id(fs5[HEAD][AGR]) != id(fs4)

#
# Instead, the object must be modified in-place. This can be achieved
# with update():

fs5[HEAD][AGR].update(fs4)

assert fs5[HEAD][AGR] == fs4

#
# Note that the following:

fs5[HEAD][AGR] = fs4

# destroys the reentrancy:

assert id(fs5[HEAD][AGR]) != id(fs5[HEAD][SUBJ][AGR])

# and so update() should be used instead (unless destroying the
# reentrancy is desired).
#
# The del operator can be used to remove features. For example

del fs5[HEAD][AGR][NUMBER]

assert fs5[HEAD][AGR] == {PERSON: 1}

#
# If you wish to delete all features in a particular structure or
# sub-structure, you can iterate over the keys of the dictionary and
# delete each one.

for f in fs5[HEAD][SUBJ][AGR].keys():
    del fs5[HEAD][SUBJ][AGR][f]

assert fs5[HEAD][SUBJ][AGR] == {}

#
# Note that, even though you can (in recent Python) iterate directly
# over a dictionary, this will cause a RuntimeError due to the
# dictionary changing size during iteration. Therefore, the use of
# keys() is required.
#
# This is likely to be common enough to warrant a function:

def del_features(feature_structure):
    for f in feature_structure.keys():
        del feature_structure[f]

del_features(fs5[HEAD])

assert fs5[HEAD] == {}

### CONTINUED IN Part II (subsumption, unification)
-- 
  James Tauber
  http://jtauber.com/





More information about the Python-list mailing list