[melbourne-pug] Unicode for windows dummies
Mike Dewhirst
miked at dewhirst.com.au
Mon Aug 15 21:01:30 EDT 2016
If anyone can point me to the appropriate advice for resolving the error
below I would be most appreciative. Really very appreciative.
I think I understand Unicode in theory and have reread a lot of articles
including ...
* https://docs.python.org/3/library/codecs.html#encodings-and-unicode
* https://pythonconquerstheuniverse.wordpress.com/2010/05/30/unicode-beginners-introduction-for-dummies-made-simple/
* https://pythonconquerstheuniverse.wordpress.com/2010/06/04/unicode-for-dummies-just-use-utf-8/
* https://en.wikipedia.org/wiki/UTF-8
This is the error which has stumped me ...
(xxex3) C:\Users\mike\env\xxex3\ssds>python substance/data_imports/map_csv.py
Traceback (most recent call last):
  File "substance/data_imports/map_csv.py", line 139, in <module>
    csvdata = CsvImport(csvfile, company, start, finish)
  File "substance/data_imports/map_csv.py", line 127, in __init__
    print("%s" % cells)
  File "C:\Users\mike\env\xxex3\lib\encodings\cp850.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2030' in position 7452: character maps to <undefined>
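The last frame of the traceback is in cp850.py, so the failure happens when print() encodes the text for the console rather than when the file is read. The following is a minimal sketch of that step in isolation, assuming the console code page is cp850 as the traceback suggests. The helper names can_encode and console_safe are hypothetical and only serve the illustration.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function, unicode_literals

PER_MILLE = "\u2030"  # PER MILLE SIGN, the character named in the traceback


def can_encode(text, encoding):
    """Return True if every character in text can be encoded with the codec."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def console_safe(text, encoding="cp850"):
    """Hypothetical helper: escape characters the console codec cannot encode."""
    # backslashreplace turns an unencodable character into a \uXXXX escape
    # instead of raising UnicodeEncodeError.
    return text.encode(encoding, errors="backslashreplace").decode(encoding)


print(can_encode(PER_MILLE, "cp850"))   # cp850 has no per mille sign
print(can_encode(PER_MILLE, "utf-8"))   # UTF-8 can encode any code point
print(console_safe("0.5" + PER_MILLE))  # prints an ASCII-only escaped form
```

Escaping is only one option for console output. Setting the PYTHONIOENCODING environment variable before starting Python is another, though how the Windows console then renders the bytes depends on its code page and font.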
I saved the csv file involved as utf-8 using LibreOffice 5 on
Windows 8.1, starting from the original Microsoft Excel spreadsheet.
This is in Python 3.5 on Windows but it also needs to run in Python 2.7
on Ubuntu 14.04 server (no gui).
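Since the file is saved as utf-8 and the script has to run on both Python 3.5 and Python 2.7, one approach is to state the encoding explicitly when opening the file. The sketch below assumes io.open, which accepts an encoding argument on both versions. The function name read_lines and the sample file contents are made up for the illustration, because the real csv is private.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals

import io
import os
import tempfile


def read_lines(path, encoding="utf-8"):
    """Read a text file as a list of unicode lines on Python 2.7 and 3.x."""
    # An explicit encoding means the result does not depend on the
    # platform's preferred encoding, which differs between Windows and Ubuntu.
    with io.open(path, "r", encoding=encoding) as handle:
        return [line.rstrip("\n") for line in handle]


# Demonstration with a throwaway file standing in for the private csv.
fd, sample_path = tempfile.mkstemp(suffix=".csv")
os.close(fd)
with io.open(sample_path, "w", encoding="utf-8") as handle:
    handle.write('"Limit",0.5\u2030\n')

lines = read_lines(sample_path)
os.remove(sample_path)
```

If the file turns out to start with a byte order mark, passing encoding="utf-8-sig" strips it while reading. Reading the file correctly does not by itself make print() succeed on a cp850 console; that is a separate encoding step.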
map_csv.py [1] is the beginning of a module I want to develop into a
generic data import facility. I'm starting with a specific csv file I
need to import (not mine and its contents are private) and all it does
at the moment is read in the file and print the lines to stdout.
I have tried utf-8 encoding each line and that gets past the error but
just produces a set of character codes, a snippet of which is below [2].
Decoding that as utf-8 reproduces the error, as might be expected. I have
also tried decoding as utf-16 and encoding it as utf-8 but that didn't
work either.
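The list of numbers is what iterating over encoded bytes produces: each item is one byte value, not a character. The following sketch shows the round trip, assuming the numbers in the output are utf-8 byte values. It uses bytearray so that it behaves the same on Python 2.7 and 3.5, and the sample line is invented.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function, unicode_literals

line = "Acute 0.5\u2030"          # invented sample containing U+2030
encoded = line.encode("utf-8")    # text -> bytes

# Iterating over a bytearray yields integers on both Python 2.7 and 3.x.
byte_values = list(bytearray(encoded))
print(byte_values)

# Going the other way recovers the original text.
restored = bytearray(byte_values).decode("utf-8")

# The first twelve values after "34, 65" in the posted output decode the same way.
snippet = [65, 99, 117, 116, 101, 32, 72, 97, 122, 97, 114, 100]
decoded_snippet = bytearray(snippet).decode("utf-8")
print(decoded_snippet)
```

The per mille sign occupies three bytes in utf-8, which is why the byte list is longer than the text. Printing the decoded text still goes through the console codec, so decoding back to text brings the original error back on a cp850 console.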
Thanks for reading this far
Mike
[1] ...
from __future__ import unicode_literals
import os


class CsvImport(object):
    """ Imports a csv file and converts it into a list of lists """

    def __init__(self, csvfile, company, start, finish):
        self.company = company
        self.rows = list()
        with open(csvfile, "r") as csv:
            i = 0
            self.rows = csv.readlines()
            for line in self.rows:
                i += 1
                cells = list(line)
                if i >= start:
                    print("%s" % cells)
                if i > finish:
                    break


if __name__ == "__main__":
    company = "Calia Pty Ltd"
    dirname = "{0}/csv".format(company.split()[0].lower())
    filename = "{0}1.csv".format(company.split()[0].lower())
    start = 105
    finish = 404
    currdir = os.path.realpath(os.path.dirname(__file__)).replace('\\', '/')
    csvfile = os.path.join(currdir, dirname, filename)
    csvdata = CsvImport(csvfile, company, start, finish)
[2] ... , 48, 48, 48, 48, 37, 44, 34, 34, 44, 44, 34, 34, 44, 34, 34,
44, 34, 65, 99, 117, 116, 101, 32, 72, 97, 122, 97, 114, 100, 32, 84,
111, 32, 84, 104, 101, 32, 65, 113, 117, 97, 116, 105, 99, 32, 69, 110,
118, 105, 114, 111, 110, 109, 101, 110, 116, 46, 34, 44, 44, 44, 44, 44,
44, 44, 48, 46, 48, 48, 48, 48, 48, 37, 44, 34, 34, 44, 44, 34, 34, 44,
34, 34, 44, 34, 34, 44, 34, 34, 44, 34, 67, 104, 114, 111, 110, 105, 99,
32, 72, 97, 122, 97, 114, 100, 32, 84, 111, 32, 84, 104, 101, 32, 65,
113, 117, 97, 116, 105, 99, 32, 69, 110, 118, 105, 114, 111, 110, 109,
101, 110, 116, 46, 34, 44, 50, 44, 34, 78, 47, 65, 34, 44, 34, 71, 72,
83, 48, 57, 34, 44, 34, 72, 52, 49, 49, 34, 44, 44, 44, 48, 46, 48, 48,
48, 48, 48, 37, 44, 34, 34, 44, 44, 34, 34, 44, 34, 34, 44, 34, 34, 44,
34, 34, 44, 34, 34, 44, 34, 72, 97, 122, 97, 114, 100, 111, 117, 115,
32, 84, 111, 32, 84, 104, 101, 32, 79, 122, 111, 110, 101, 32, 76, 97,
121, 101, 114, 46, 34, 44, 44, 44, 44, 44, 48, 46, 48, 48, 48, 48, 48,
37, 44, 34, 34, 44, 34, 34, 44, 34, 65, 100, 100, 105, 116, 105, 111,
110, 97, 108, 32, 78, 111, 110, 45, 71, 72, 83, 32, 72, 97, 122, 97,
114, 100, 32, 83, 116, 97, 116, 101, 109, 101, 110, 116, 34, 44, 34, 65,
85, 72, 48, 54, 54, 34, 44, 48, 46, 48, 48, 48, 48, 48, 37, 44, 34, 34, 10]