convert currency to words
Terry Hancock
hancock at anansispaceworks.com
Tue Dec 31 19:29:46 EST 2002
> On Tue, 31 Dec 2002 15:04:34 +0100, Laura Creighton <lac at strakt.com> wrote:
> >
> >> <URL: http://wiki.tcl.tk/591 > solves this problem.
> >> Translation from the Tcl in which it's expressed to
> >> Python is straightforward.
Cool, cool, cool, cool! They have a *reverse* algorithm, too, which is harder
(at least I found it a pain when I tried it): http://mini.net/tcl/929
I'm going to port that to Python, because I have a "mnemonic URL generator"
in Narya that parses a new topic subject, then condenses it into a "unique,
mnemonic, but legal file-name under n characters in length". I wanted to
read numbers and re-represent them as decimal, so that:
"One thousand and one best books to read"
can become something like:
1001_best_books_read
(I also use an article/preposition reject, common word reject, and an
algorithmic stemmer if necessary). But after getting results like
1_1000_1_best_books_read
I sort of gave up on it for more important problems, as I have plenty of
bigger fish to fry, but this would be a cool improvement (that's pronounced
"kewl").
Anyway, thanks for the link. My mnemonic generator is a piece of Narya, but
you could look in my SF CVS if you want to see it. I'm pretty sure it's
separable. It does currently require PyStemmer, but you can easily remove
that requirement (and frankly I'm not sure if stemming is really that useful,
the results aren't always as mnemonic as I'd like).
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/spacelift/narya/mnemonic_id.py
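For readers who just want the gist of the condensing step described above, here is a minimal sketch. It is not the code in mnemonic_id.py: the function name, the REJECT list, and the max_len default are all assumptions made for illustration, and it leaves out stemming, number reading, and the uniqueness check.

```python
import re

# Hypothetical reject list; Narya's actual article/preposition and
# common-word lists are not shown in this post.
REJECT = {'a', 'an', 'the', 'and', 'of', 'to', 'in', 'on', 'for', 'with'}

def mnemonic_id(subject, max_len=32):
    """Condense a subject line into a short, filename-safe identifier."""
    # Keep only runs of ASCII letters and digits, lowercased.
    words = re.findall(r'[a-z0-9]+', subject.lower())
    kept = [w for w in words if w not in REJECT]
    result = ''
    for w in kept:
        candidate = w if not result else result + '_' + w
        # Stop at the last whole word that still fits under the cap.
        if len(candidate) > max_len:
            break
        result = candidate
    return result

print(mnemonic_id("The Best Books to Read"))
```

Without a number reader in front of it, a subject like "One thousand and one best books to read" keeps its number words spelled out, which is the gap the reverse algorithm is meant to fill.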
...
Actually I held this in the tank until I finished -- here's my version of the
algorithm in Python with a few minor changes:
# read_number
"""
Read a number expressed in English and return its numerical value.
"""
# Roughly translated from a TCL version by Dan Smart at
# http://mini.net/tcl/929
#
# I've added a few useful synonyms and extended the range a bit.

import re, operator

num = {
    'oh':0, 'a': 1, 'an': 1,
    'zero':0, 'one':1, 'two':2, 'three':3, 'four':4,
    'five':5, 'six':6, 'seven':7, 'eight':8, 'nine':9,
    'ten':10, 'eleven':11, 'twelve':12, 'thirteen':13,
    'fifteen':15, 'eighteen':18,
    'twenty':20, 'thirty':30, 'forty':40, 'fifty':50,
    'eighty':80, 'score':20,
    'hundred':100, 'thousand':1000, 'million':1000000,
    'millions':1000000, 'billion':1000000000,
    'billions':1000000000
    }

# Separators between number words: " and ", plain whitespace, or a hyphen.
junk_re = re.compile(r'\s+and\s+|\s+|\s*-\s*')

def english_to_number(s):
    """
    Determine the value of an English number expression.
    """
    # Each entry is (operation, operand, closes_group).
    express = []
    words = re.split(junk_re, s)
    for word in words:
        if word in num:
            if num[word] >= 1000 and len(express) > 0:
                # "thousand", "million", ...: scale the group, then close it.
                express.append( (operator.mul, num[word], 1) )
            elif num[word] >= 99 or (word == 'score' and len(express) > 0):
                # "hundred" and "score" scale the running group. Test the
                # word itself for "score", because 'twenty' is also 20 and
                # must be added, not multiplied.
                express.append( (operator.mul, num[word], 0) )
            else:
                express.append( (operator.add, num[word], 0) )
        elif word[-4:]=='teen' and word[:-4] in num:
            # "fourteen", "sixteen", ...: stem + 10
            express.append( (operator.add, num[word[:-4]] + 10, 0) )
        elif word[-2:]=='ty' and word[:-2] in num:
            # "sixty", "seventy", ...: stem * 10
            express.append( (operator.add, num[word[:-2]] * 10, 0) )
        else:
            return None # Just return None if not a valid expression

    value = 0   # total of all closed groups
    group = 0   # group currently being built
    for op in express:
        group = op[0](group, op[1])
        if op[2]:
            value += group
            group = 0
    value += group
    return value
This returns useful values most of the time:
>>> english_to_number("five hundred sixteen million three thousand and five")
516003005
>>> english_to_number("a million and six")
1000006
>>> english_to_number("eleventy one")
111
>>> english_to_number("four score thousand")
80000
although not always:
>>> english_to_number("score a ten")
31
>>> english_to_number("four oh one")
5
This is probably fine for my purposes, but any suggestions are welcome.
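One possible way to handle the "four oh one" case, offered as a sketch rather than a tested fix: treat a phrase made up entirely of single-digit words as a digit string instead of a sum. The helper name read_digit_sequence and its DIGITS table are hypothetical and not part of the code above.

```python
# Hypothetical helper: only single-digit words are recognized here.
DIGITS = {
    'oh': 0, 'zero': 0, 'one': 1, 'two': 2, 'three': 3, 'four': 4,
    'five': 5, 'six': 6, 'seven': 7, 'eight': 8, 'nine': 9,
}

def read_digit_sequence(s):
    """Read "four oh one" as 401; return None if s is not all digit words."""
    words = s.split()
    # Require at least two words so a lone "one" is left to the main parser.
    if len(words) < 2 or not all(w in DIGITS for w in words):
        return None
    value = 0
    for w in words:
        value = value * 10 + DIGITS[w]
    return value

print(read_digit_sequence("four oh one"))
```

A caller could try this first and fall back to english_to_number when it returns None. Note that leading zeros are lost ("oh seven" reads as 7), which may or may not matter for file names.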
Cheers,
Terry
--
Terry Hancock ( hancock at anansispaceworks.com )
Anansi Spaceworks http://www.anansispaceworks.com
"Some things are too important to be taken seriously"