[Tutor] Python 3: string to decimal conversion

US usaidov at gmail.com
Sun May 22 21:45:17 EDT 2016


On Sun, May 22, 2016 at 5:36 PM,  <cs at zip.com.au> wrote:
> On 22May2016 08:19, Saidov <usaidov at gmail.com> wrote:
>>
>> Thank you all for the useful feedback. I am new to programming so bear
>> with me while I learn the rules...
>>
>> I have run Cameron's code to print the values and have the traceback
>> results. please see below.
>
> [...]
>>>
>>>            for row in records:
>>
>> [...]
>>>
>>>                try:
>>>                    expenses[ts.Date(row[0]).month] +=
>>
>> decimal.Decimal(row[4])
>>>
>>>                except ValueError:
>>>                    pass
>
> [...]
>>
>> I suggest that you print out the value of row[4] before the "try"
>> statement:
>>  print("row[4] =", repr(row[4]))
>
> [...]
>>
>> + decimal <module 'decimal' from 'C:\mypath\Anaconda3\\lib\\decimal.py'>
>> module
>> + expenses {1: Decimal('0'), 2: Decimal('0'), 3: Decimal('0'), 4:
>> Decimal('0'), 5: Decimal('0'), 6: Decimal('0'), 7: Decimal('0'), 8:
>> Decimal('0'), 9: Decimal('0'), 10: Decimal('0'), 11: Decimal('0'), 12:
>> Decimal('0')} dict
>> row[0] '"1/1/2016"' str
>> row[4] '""' str
>> + ts <module 'timestring' from
>> 'C:\mypath\Anaconda3\\lib\\site-packages\\timestring\\__init__.py'>
>> module
>>
>> ipython traceback:
>>
>> row[4]= ' "" '
>>
>> Traceback (most recent call last):
>>  File "C:\mypath\visual studio
>> 2015\Projects\Budget\Budget\Budget.py", line 28, in <module>
>>    expenses[ts.Date(row[0]).month] += decimal.Decimal(row[4])
>> decimal.InvalidOperation: [<class 'decimal.ConversionSyntax'>]
>>
>> I think the problem may be caused by an empty string value that is
>> passed to decimal.Decimal function. The csv file contains some empty
>> cells and I wanted the code to ignore them. That's why I had the
>> ValueError exception.
>
>
> I have two observations here:
>
> Your strings in row[4] are not empty. If that output is repr(row[4]) then
> row[4] contains the text:
>
>  ""
>
> That is a two character string consisting of two quote characters.
>
> Normally the csv.reader module will handle that for you, but I see that you
> passed the paramater:
>
>  quoting=csv.QUOTE_NONE
>
> to it when setting it up. It looks to me like your CSV file is a
> conventional one with quotes around string values. By passing csv.QUOTE_NONE
> you prevent the csv module from handling those for you and you get the "raw"
> column values. So a column with an empty string is probably written as "" in
> the file.
>
> Could you show is the first line or so from your CSV file? That should tell
> us and you whether the file is "bare" comma separate values or the far more
> common quoted format.
>
> If it is the quoted format, just remove the "quoting=csv.QUOTE_NONE"
> parameter altogether - the default for the csv module is quoted and it is
> _usually_ what is wanted. Of course you need to know one way or the other,
> so examine the CSV file itself.
>

Thanks, the csv.QUOTE_NONE was one of the parameters i changed
earlier. It didn't make any difference, unfortunately. I kept getting
the same error.

I visually inspected the csv file and the empty cells show up empty.
negative numbers are recorded as ($0.00), non-negative numbers as:
$0.00

Here are the first four lines:

Date No. Description Type Debit Credit
1/1/2016 income ex-2387280 CREDIT $303.65
1/3/2016 income ex-4732847 CREDIT $3.00
1/4/2016 insurance DEBIT ($75.59)

> The other observation is that you're trying to catch ValueError, when
> plainly the Decimal module is raising decimal. InvalidOperation. So you
> should catch that instead.
>
> HOWEVER, just catching it and ignoring that row will _silently_ discard good
> input if you program is incorrect. You should almost always emit an error
> message or perform some other very specific action when you catch an
> exception.
>
> Alan has suggested that you test specificly for an empty string.
>
> I agree: you should act on exactly what is expected. By blindly catching
> ValueError or decimal.InvalidOperation and not reporting the string that
> caused it to happen you will silently ignore all kinds of unexpected input.
>
> So I would advocate some code like this, similar to Alan's:
>
>  if not row[4]:
>    # empty column - we expect this and ignore it
>    pass
>  else:
>    try:
>      expenses[ts.Date(row[0]).month] += decimal.Decimal(row[4])
>    except decimal.InvalidOperation as e:
>      print("unexpected expenses value: %r" % (row[4],))
>
> which will report the offending values, and presumably you expect either an
> empty column or a _valid_ expense value.
>
> Cheers,
> Cameron Simpson <cs at zip.com.au>

Thank you both for suggesting a way to handle errors. I have run the
suggested code. What I learned is that all the values (not only empty
cells) seem to be invalid for decimal.Decimal function.

I tried the float() function instead of decimal.Decimal and got an
error message: could not convert string to float: '($75.59)'.

I also checked the type of values in row[4]. All the values passed on
to decimal.Decimal () function are indeed of string type...

Is there anything I can do to the formatting of the csv file to make
it 'readable' for decimal.Decimal function?

Here is my updated code along with the output I got from running it:

----------------------------------------------------
code:
import numpy as np
import csv
import timestring as ts
import decimal

months= [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
expenses = {x: decimal.Decimal() for x in months}
income = {x: decimal.Decimal() for x in months}
test = []

exp_cat = []
income_cat = []

files =['export.csv']

with open("budgetfile.csv","wt") as fw:
    writer = csv.writer(fw)
    for file in files:
        with open(file, newline ='' ) as csvfile:
            records = csv.reader(csvfile)
            print("Processing file {}. \n" .format(file))
            header = next(records)
            for row in records:
                print("row[4] =", repr(row[4]), "value type =", type(row[4]))

                if not row[4]:
                    pass
                else:
                    try:
                        expenses[ts.Date(row[0]).month] +=
decimal.Decimal(row[4])
                    except decimal.InvalidOperation as e:
                        print("unexpected expenses value:  %r" % (row[4],))
----------------------------------------------------------------------------------------------------------------
last 4 lines of output:
[4] = '($10.00)' value type = <class 'str'>
unexpected expenses value:  '($10.00)'
row[4] = '($287.42)' value type = <class 'str'>
unexpected expenses value:  '($287.42)'
--------------------------------------------------------------------------------------------------------------


More information about the Tutor mailing list