Parsing strings -> numbers

Skip Montanaro skip at pobox.com
Tue Nov 25 10:13:51 EST 2003


    tuang> Thanks for taking a shot at it, but it doesn't appear to work:

    >>> import locale
    >>> locale.atoi("-12,345")
     Traceback (most recent call last):
       File "<interactive input>", line 1, in ?
       File "C:\Python2321\lib\locale.py", line 179, in atoi
         return atof(str, int)
       File "C:\Python2321\lib\locale.py", line 175, in atof
         return func(str)
     ValueError: invalid literal for int(): -12,345
    >>> locale.getdefaultlocale()
     ('en_US', 'cp1252')
    >>> locale.atoi("-12345")
     -12345

Take a look at the output of locale.localeconv() with various locales set.
I think you'll find that locale.localeconv()['thousands_sep'] is '', not ',',
because a program's numeric locale stays "C" until it calls
locale.setlocale(), whatever locale.getdefaultlocale() reports.  Failing
that, you might want to simply replace the commas and dollar signs with
empty strings before passing the string to int() or float(), as someone
else suggested.
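
Here is a rough sketch of both ideas, written for a current Python rather
than the 2.3 in the traceback.  The helper names thousands_sep() and
strip_and_parse() are made up for illustration, and pinning LC_NUMERIC to
"C" is only there to reproduce the state a program starts in; it changes
the locale for the whole process.

```python
import locale

# Pin LC_NUMERIC to the "C" locale, the state a program is in
# before it calls setlocale() itself.
locale.setlocale(locale.LC_NUMERIC, "C")


def thousands_sep():
    """Return the grouping separator of the locale currently in effect."""
    return locale.localeconv()["thousands_sep"]


def strip_and_parse(text, convert=int, junk=",$"):
    """Drop every character listed in `junk`, then pass the rest to `convert`."""
    for ch in junk:
        text = text.replace(ch, "")
    return convert(text)


if __name__ == "__main__":
    print(repr(thousands_sep()))                # '' in the "C" locale
    print(strip_and_parse("-12,345"))           # -12345
    print(strip_and_parse("$1,234.50", float))  # 1234.5
```

With an empty thousands_sep, locale.atoi() has nothing to strip, which is
why "-12,345" reaches int() unchanged and fails.  The stripping helper
ignores the locale entirely, so it is only safe when you already know the
comma is a grouping character.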

Be careful if you're scraping web pages which might not format numbers the
way your own locale does.  You may find something like:

    $123.456,78

as a quote price on a European website.  I don't know how to tell what the
remote site used as its locale when formatting numeric data.  Perhaps
knowing the charset of the page is sufficient to make an educated guess.
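
If you do manage to work out which convention a site uses, one way to
handle it is to pass the separators in explicitly rather than relying on
the process locale.  This is just a sketch: parse_price() and its
parameters are invented for the example, and it makes no attempt to guess
the convention for you.

```python
from decimal import Decimal


def parse_price(text, thousands_sep, decimal_point, currency="$"):
    """Parse a price string whose separators the caller already knows.

    All three parameters are supplied by the caller; nothing is detected.
    """
    # Remove the currency symbol and the grouping character first ...
    cleaned = text.replace(currency, "").replace(thousands_sep, "")
    # ... then normalise the decimal mark to the "." that Decimal expects.
    cleaned = cleaned.replace(decimal_point, ".")
    return Decimal(cleaned)


if __name__ == "__main__":
    print(parse_price("$123.456,78", thousands_sep=".", decimal_point=","))
    print(parse_price("$123,456.78", thousands_sep=",", decimal_point="."))
```

Both calls produce Decimal('123456.78').  Decimal is used instead of
float so that a price is not subject to binary rounding.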

Skip




