convert currency to words

Terry Hancock hancock at anansispaceworks.com
Tue Dec 31 19:29:46 EST 2002


> On Tue, 31 Dec 2002 15:04:34 +0100, Laura Creighton <lac at strakt.com> wrote:
> >> <URL: http://wiki.tcl.tk/591 > solves this problem.
> >> Translation from the Tcl in which it's expressed to
> >> Python is straightforward.

Cool, cool, cool, cool! They have a *reverse* algorithm, too, which is harder 
(at least I found it a pain when I tried it):  http://mini.net/tcl/929

I'm going to port that to Python, because I have a "mnemonic URL generator"
in Narya that parses a new topic subject, then condenses it into a "unique,
mnemonic, but legal file-name under n characters in length".  I wanted to
read numbers and re-represent them as decimal, so that:

"One thousand and one best books to read"

can become something like:

1001_best_books_read

(I also use an article/preposition reject, common word reject, and an 
algorithmic stemmer if necessary).  But after getting results like

1_1000_1_best_books_read

I sort of gave up on it for more important problems, as I have plenty of
bigger fish to fry, but this would be a cool improvement.  (That's pronounced
"kewl".)
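
For illustration only, here is a minimal sketch of the condensation step
described above. It is not the Narya code: the reject list, the function name
mnemonic_id, and the max_len default are all assumptions made up for this
example, and stemming is left out entirely.

```python
import re

# Hypothetical reject list of articles, prepositions and common words.
REJECT = {'a', 'an', 'the', 'and', 'of', 'to', 'in', 'on', 'for', 'with'}

def mnemonic_id(subject, max_len=40):
    """Condense a subject line into a short, file-name-safe identifier."""
    # Keep only runs of ASCII letters and digits, lower-cased.
    words = re.findall(r'[a-z0-9]+', subject.lower())
    # Drop the rejected words, join the rest with underscores.
    kept = [w for w in words if w not in REJECT]
    ident = '_'.join(kept)
    # Truncate, then drop any trailing underscore left by the cut.
    return ident[:max_len].rstrip('_')

print(mnemonic_id("The 1001 best books to read"))
# 1001_best_books_read
print(mnemonic_id("One thousand and one best books to read"))
# one_thousand_one_best_books_read
```

The second call shows why reading the number matters: without it, the spelled-out
number takes up most of the identifier.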

Anyway, thanks for the link.  My mnemonic generator is a piece of Narya, but 
you could look in my SF CVS if you want to see it. I'm pretty sure it's 
separable. It does currently require PyStemmer, but you can easily remove 
that requirement (and frankly I'm not sure if stemming is really that useful, 
the results aren't always as mnemonic as I'd like).

http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/spacelift/narya/mnemonic_id.py

...

Actually I held this in the tank until I finished -- here's my version of the
algorithm in Python with a few minor changes:

# read_number
"""
Read a number expressed in English and return its numerical value.
"""
# Roughly translated from a Tcl version by Dan Smart at
# http://mini.net/tcl/929
#
# I've added a few useful synonyms and extended the range a bit.

import re,operator

num  =	{
	'oh':0, 'a': 1, 'an': 1,
	'zero':0, 'one':1, 'two':2, 'three':3, 'four':4,
	'five':5, 'six':6, 'seven':7, 'eight':8, 'nine':9,
	'ten':10, 'eleven':11, 'twelve':12, 'thirteen':13,
	'fifteen':15, 'eighteen':18, 
	'twenty':20, 'thirty':30, 'forty':40, 'fifty':50,
	'eighty':80, 'score':20,
	'hundred':100, 'thousand':1000, 'million':1000000,
	'millions':1000000, 'billion':1000000000,
	'billions':1000000000
	}

junk_re = re.compile(r'\s+and\s+|\s+|\s*-\s*')

def english_to_number(s):
    """
    Determine the value of an English number expression.
    """
    # Each entry is (operation, operand, flush); flush marks the end of a
    # thousands/millions/billions group.
    express = []
    words = junk_re.split(s)
    for word in words:
        if word in num:
            if num[word] >= 1000 and len(express) > 0:
                express.append( (operator.mul, num[word], 1) )
            elif num[word] >= 100 or (word == 'score' and len(express) > 0):
                # Test for the word 'score' rather than the value 20, so
                # that 'twenty' after another word still adds.
                express.append( (operator.mul, num[word], 0) )
            else:
                express.append( (operator.add, num[word], 0) )
        elif word[-4:] == 'teen' and word[:-4] in num:
            express.append( (operator.add, num[word[:-4]] + 10, 0) )
        elif word[-2:] == 'ty' and word[:-2] in num:
            express.append( (operator.add, num[word[:-2]] * 10, 0) )
        else:
            return None  # Just return None if not a valid expression

    value = 0
    group = 0
    for func, operand, flush in express:
        group = func(group, operand)
        if flush:
            value += group
            group = 0
    value += group

    return value
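
To make the two-level accumulation easier to follow, here is a small
standalone sketch of the evaluation loop alone. The tuple list is written out
by hand for "five hundred sixteen million three thousand and five"; the
evaluate function and its trace are additions for illustration, not part of
the code above.

```python
import operator

# (operation, operand, flush) tuples, written out by hand for
# "five hundred sixteen million three thousand and five".
express = [
    (operator.add, 5, 0),
    (operator.mul, 100, 0),
    (operator.add, 16, 0),
    (operator.mul, 1000000, 1),   # flush: end of the millions group
    (operator.add, 3, 0),
    (operator.mul, 1000, 1),      # flush: end of the thousands group
    (operator.add, 5, 0),
]

def evaluate(express):
    """Run the accumulator and record (group, value) after each step."""
    value = 0
    group = 0
    trace = []
    for func, operand, flush in express:
        group = func(group, operand)
        if flush:
            # A completed group is added to the total and reset.
            value += group
            group = 0
        trace.append((group, value))
    return value + group, trace

total, trace = evaluate(express)
for step in trace:
    print(step)
print(total)
# 516003005
```

The group builds up to 516 before "million" multiplies and flushes it into the
total; the same thing happens with 3 and "thousand", and the final 5 is added
at the end.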

This returns useful values most of the time:

>>> english_to_number("five hundred sixteen million three thousand and five")
516003005
>>> english_to_number("a million and six")
1000006
>>> english_to_number("eleventy one")
111
>>> english_to_number("four score thousand")
80000

although not always:

>>> english_to_number("score a ten")
31
>>> english_to_number("four oh one")
5

This is probably fine for my purposes, but any suggestions are welcome.

Cheers,
Terry
	
--
Terry Hancock ( hancock at anansispaceworks.com )
Anansi Spaceworks  http://www.anansispaceworks.com

"Some things are too important to be taken seriously"




