python

Rhodri James rhodri at wildebst.demon.co.uk
Fri May 15 18:39:26 EDT 2009


On Fri, 15 May 2009 14:58:16 +0100, <anica_1069 at hotmail.com> wrote:

> Hello, I'm a student of linguistics and I need to do these exercises. Can
> anybody help me, please?
> Thanks

Sorry, we're allergic to homework.  If you've got any more specific
questions about how bits of Python work, do ask, but don't ask us to
do your assignments for you.  Most tutors can get downright unreasonable
about that sort of thing.

> ◑ Read in some text from a corpus, tokenize it, and print the list of
> all wh-word types that occur. (wh-words in English are used in
> questions, relative clauses and exclamations: who, which, what, and so
> on.) Print them in order. Are any words duplicated in this list,
> because of the presence of case distinctions or punctuation?

Some questions you should already have the answers to:

* How do you identify wh-words?  I'm guessing for the purposes of
this assignment you've got a list of them.  If that's so, look up the
bit of the tutorial about finding out whether something is in a list.
If you have to parse for them, that's a rather harder problem.

* ...but beware of case distinctions -- look up the string methods to
see how to play with the case of a string.

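* Tokenizing the string is mostly a matter of splitting it up where there
are spaces.  Funnily enough, there's this string method called "split()"
that will do this.  It only splits on whitespace, though, so punctuation
stays stuck to the words; the strip() method will peel it off.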

* What does "in order" mean?  In the order in which they occur in the
corpus?  In alphabetical order?  If it's the latter, you'll need to
record the wh-words in a list as they occur and sort it at the end
before printing it out.
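
To make those string methods concrete, here is how they behave on a
made-up sentence.  The sentence and the WH_WORDS list are assumptions
for illustration only, not part of the assignment, and this is a sketch
of the building blocks rather than a finished answer for a real corpus.

```python
import string

# Hypothetical stand-ins: a toy sentence instead of a corpus, and a
# hand-written (incomplete) list of wh-words.
WH_WORDS = ["who", "which", "what", "when", "where", "why", "whom", "whose"]
text = "What did she say? Who knows what she meant, or why."

# split() with no argument breaks the string on whitespace only,
# so punctuation stays attached to tokens such as "say?" and "why."
tokens = text.split()

# Keep the tokens whose lower-cased, punctuation-stripped form is in the list.
raw_hits = [t for t in tokens
            if t.lower().strip(string.punctuation) in WH_WORDS]

# Normalising before de-duplicating removes the case/punctuation doubles.
normalised = sorted(set(t.lower().strip(string.punctuation) for t in raw_hits))

print(sorted(set(raw_hits)))   # un-normalised types, duplicates by case remain
print(normalised)              # one entry per wh-word
```

Comparing the two printed lists shows where duplicates creep in: "What"
and "what" count as different types until the case is normalised.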

> ◑ Create a file consisting of words and (made up) frequencies, where
> each line consists of a word, the space character, and a positive
> integer, e.g. fuzzy 53. Read the file into a Python list using
> open(filename).readlines(). Next, break each line into its two fields
> using split(), and convert the number into an integer using int(). The
> result should be a list of the form: [['fuzzy', 53], ...].

So you do that.  Seriously, this is a pretty complete recipe already.
If you can't see how to do it, re-read the tutorial.
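
If the shape of the result is what's puzzling, this is roughly what the
recipe does to a few lines.  The "lines" list is a made-up stand-in for
what open(filename).readlines() would return (the words and numbers are
invented), so treat it as a sketch under that assumption.

```python
# Hypothetical stand-in for open(filename).readlines(); note that each
# line keeps its trailing newline, just as readlines() would leave it.
lines = ["fuzzy 53\n", "wuzzy 7\n", "bear 12\n"]

pairs = []
for line in lines:
    # split() with no argument splits on whitespace and drops the newline
    word, count = line.split()
    # int() converts the string "53" into the integer 53
    pairs.append([word, int(count)])

print(pairs)
```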

-- 
Rhodri James *-* Wildebeeste Herder to the Masses


