Programming games in historical linguistics with Python

Tue Nov 30 19:45:48 EST 2010

2010/11/30 Dax Bloom <bloom.dax at gmail.com>:
> Hello,
>
> Following a discussion that began 3 weeks ago I would like to ask a
> question regarding substitution of letters according to grammatical
> rules in historical linguistics. I would like to automate the
> transformation of words according to complex rules of phonology and
> integrate that script in a visual environment.
> Here follows the previous thread:
> http://groups.google.com/group/comp.lang.python/browse_thread/thread/3c55f9f044c3252f/fe7c2c82ecf0dbf5?lnk=gst&q=evolutionary+linguistics#fe7c2c82ecf0dbf5
>
> Is there a way to refer to vowels and consonants as a subcategory of
> text? Is there a function to remove all vowels? How should one create
> and order the dictionary file for the rules? How to chain several
> transformations automatically from multiple rules? Finally can anyone
> show me what existing python program or phonological software can do
> this?
>
> What function could tag syllables, the word nucleus and the codas? How
> easy is it to bridge this with a more visual environment where
> interlinear, aligned text can be displayed with Greek notations and
> braces as usual in the phonology textbooks?
>
> Best regards,
>
> Dax Bloom

Hi,
as far as I know, there is no predefined function or library for
distinguishing vowels from consonants, but this is easy to implement
yourself according to your exact needs.

For example, regular expressions can be used here. To remove vowels, the
code could be (example from the interactive interpreter):

>>> import re
>>> re.sub(r"(?i)[aeiouy]", "", "This is a SAMPLE TEXT")
'Ths s  SMPL TXT'
>>>

See http://docs.python.org/library/re.html
or
http://www.regular-expressions.info/
for the regexp features.

You might also try the new regex module, currently in development, which
adds many interesting new features and removes some limitations:
http://bugs.python.org/issue2636

In some cases regular expressions aren't really appropriate or may
become too complicated.
Sometimes a parsing library like pyparsing may be a more suitable tool:
http://pyparsing.wikispaces.com/

If the rules are simple enough to be formulated for single characters
or character clusters with a regular expression, you can model the
phonological changes as a series of replacements, each pairing a
matching pattern with its replacement pattern.
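
A minimal sketch of such a rule chain follows. The three rules are
invented toy examples (not real sound laws), and the names RULES,
apply_rules and derive are hypothetical. The point is only that the rules
live in an ordered list and each one operates on the output of the
previous one:

```python
import re

# Toy rules for illustration only; order matters, because every rule
# is applied to the result of the rule before it.
RULES = [
    (r"p", "f"),                              # p > f everywhere
    (r"([aeiou])t([aeiou])", r"\1d\2"),       # t > d between vowels
    (r"e$", ""),                              # word-final e is lost
]


def apply_rules(word, rules):
    """Apply every (pattern, replacement) pair in order; return the result."""
    for pattern, replacement in rules:
        word = re.sub(pattern, replacement, word)
    return word


def derive(word, rules):
    """Return all intermediate forms, starting with the input word."""
    stages = [word]
    for pattern, replacement in rules:
        stages.append(re.sub(pattern, replacement, stages[-1]))
    return stages


print(apply_rules("patere", RULES))   # fader
print(derive("patere", RULES))        # ['patere', 'fatere', 'fadere', 'fader']
```

Keeping the rules in a plain list (or loading them from a text file in
the same order) makes it easy to reorder them or to inspect a derivation
step by step.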

For character-wise matching and replacing, regular expressions are
very effective; using lookarounds
http://www.regular-expressions.info/lookaround.html
even some combinations of conditions for context-dependent changes can
be expressed; however, I would find more complex conditions,
suprasegmentals, morpheme boundaries etc. rather difficult to formalise
this way...
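
As a rough illustration of the lookaround idea, here is a sketch with two
made-up conditional changes; the vowel inventory and the function names
are assumptions for the example only. Because lookarounds do not consume
the neighbouring characters, overlapping contexts are handled correctly,
which is not the case when the context is matched with ordinary groups:

```python
import re

VOWELS = "aeiou"  # assumed toy vowel inventory


def voice_s_between_vowels(word):
    """s > z only when a vowel stands on both sides (lookbehind + lookahead)."""
    return re.sub(rf"(?<=[{VOWELS}])s(?=[{VOWELS}])", "z", word)


def palatalize_k(word):
    """k > ch only before e or i (lookahead)."""
    return re.sub(r"k(?=[ei])", "ch", word)


def voice_s_with_groups(word):
    """Same rule written with capturing groups, for comparison.

    The vowels are consumed by the match, so the vowel shared by two
    neighbouring contexts cannot be reused for the second s.
    """
    return re.sub(rf"([{VOWELS}])s([{VOWELS}])", r"\1z\2", word)


print(voice_s_between_vowels("asasa"))   # azaza
print(voice_s_with_groups("asasa"))      # azasa (second s is missed)
print(palatalize_k("kika keku"))         # chika cheku
```

Note that lookbehind patterns in the re module must have a fixed width,
which a single character class satisfies.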

hth,
  vbr