help I'm getting delimited

J. Cliff Dyer jcd at sdf.lonestar.org
Thu Dec 18 10:03:44 EST 2008


On Wed, 2008-12-17 at 06:28 -0800, aka wrote:
> Hi John, thanks.
> You're right, I didn't paste the method header because I thought it
> didn't matter when the input filename is hardcoded.
> The try/except isn't very helpful indeed so I commented it out.
> You're right, I wrongly referred to the UnicodeReader
> class in my first post because that's ultimately where I want to go,
> so I commented it out here for you to see.
> The fact is that neither csv.reader nor the UnicodeReader will read
> the file, while writing with the UnicodeWriter
> works like a charm.
> That's why I put str() around roles to see any content.
> I simplified the csv-file by cutting off columns without result. The
> file looks now like:
> 
> id;company;department
> 12;Cadillac;Research
> 11;Ford;Accounting
> 10;Chrysler;Sales
> 
> 
> The dictionary on the return is because this code is part of my
> TurboGears application.
> The entire method is:
> 
> 
> import csv
> from utilities.urw       import UnicodeWriter, UnicodeReader
> 
> 
>     @expose(allow_json=True)
>     def import_roles(self, input=None, *args, **kwargs):
>         inp = 'C:/temp/test.csv'
>         roles = []
>         msg = ''
>         ## try:
>         fp = open(inp, 'rb')
>         reader = csv.reader(fp, dialect='excel', delimiter=';')
>         ## reader = UnicodeReader(fp, dialect='excel', delimiter=';')
>         for r in reader:
>             roles.append(r[0])
>         fp.close()
>         ## except:
>             ## msg = "Something's wrong with the csv.reader"
>         return dict(filepath=inp,
>                     roles=str(roles),
>                     msg=msg)
> 
> 
> csv.reader results in: for r in reader: Error: line contains NULL
> byte
> 
> 
> Use of UnicodeReader results in: UnicodeDecodeError: 'utf8' codec
> can't decode byte 0xff in position 0: unexpected code byte
> 

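This looks like the problem might be in your choice of codec.  A UTF-8
file will never have 0xff in it, and would be unlikely to have 0x00
either.  A 0xff in position 0 is what you would see from a UTF-16
byte order mark (0xff 0xfe, little-endian), and UTF-16 stores each
ASCII character with a 0x00 byte next to it, which would explain the
NULL byte error from csv.reader.  My guess is that you will need to
decode your input from UTF-16 (and then use the UnicodeReader).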
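
Here is a minimal sketch of what I mean, not a drop-in fix.  It
assumes Python 3, where the csv module reads text directly, so no
UnicodeReader is needed.  On the Python 2 in your traceback you would
more likely re-encode the decoded text as UTF-8 before handing it to
your UnicodeReader, and I can't see how yours is written.  The helper
name read_first_column and the in-memory sample are made up for
illustration; the sample stands in for your test.csv, assuming it was
saved as UTF-16 with a byte order mark.

```python
import codecs
import csv
import io


def read_first_column(raw_bytes, delimiter=';'):
    """Decode UTF-16 bytes and return the first field of each CSV row.

    Hypothetical helper for illustration only.
    """
    # The 'utf-16' codec reads the byte order mark (0xFF 0xFE or
    # 0xFE 0xFF), uses it to pick the byte order, and strips it.
    text = raw_bytes.decode('utf-16')
    # newline='' leaves line endings alone so the csv module handles them.
    reader = csv.reader(io.StringIO(text, newline=''),
                        dialect='excel', delimiter=delimiter)
    # Skip empty rows so row[0] cannot raise IndexError.
    return [row[0] for row in reader if row]


# Assumed stand-in for the contents of C:/temp/test.csv.
sample = ('id;company;department\r\n'
          '12;Cadillac;Research\r\n'
          '11;Ford;Accounting\r\n'
          '10;Chrysler;Sales\r\n').encode('utf-16')

# A UTF-16 file with a BOM starts with one of these two byte pairs.
has_bom = sample.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE))
# ASCII text in UTF-16 is full of NUL bytes, hence the csv.reader error.
has_nul = b'\x00' in sample

print(has_bom, has_nul)
print(read_first_column(sample))
```

To check your real file, open it with 'rb', read the first two bytes,
and compare them against codecs.BOM_UTF16_LE and codecs.BOM_UTF16_BE
before deciding which codec to use.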

> 
> Will post only complete code from now on thanks.



