[Tutor] FW: output not in ANSI, conversing char set to locale.getpreferredencoding()

Thu Aug 16 13:33:13 CEST 2012

> To: tutor at python.org
> From: __peter__ at web.de
> Date: Tue, 14 Aug 2012 16:03:46 +0200
> Subject: Re: [Tutor] output not in ANSI, conversing char set to locale.getpreferredencoding()
> 
> leon zaat wrote:
> 
> > I get the error:
> > UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 7:
> > ordinal not in range(128)
> > for the line openbareruimtenaam=u'' + (openbareruimtenaam1.encode(chartype)).
> 
> 
> The error message means that database.select() returns a byte string.
> 
> bytestring.encode(encoding)
> 
> implicitly attempts
> 
> bytestring.decode("ascii").encode(encoding)
> 
> and will fail for non-ascii bytestrings no matter what encoding you pass to 
> the encode() method.
>  
> > I know that the default system codec is ascii and that chartype=b'cp1252'.
> > But how can I bypass the ascii encoding?
> 
> You have to find out the database encoding -- then you can change the 
> failing line to
> 
> database_encoding = ... # you need to find out yourself, but many databases
>                         # use UTF-8 -- IMO the only sensible choice these days
> file_encoding = "cp1252"
> 
> openbareruimtenaam = openbareruimtenaam1.decode(
>     database_encoding).encode(file_encoding)
> 
> As you now have a bytestring again you can forget about codecs.open() which 
> won't work anyway as the csv module doesn't support unicode properly in 
> Python 2.x (The csv documentation has the details).
> 

I tried it with:
openbareruimtenaam = openbareruimtenaam1.decode("UTF-8").encode("cp1252")
but it still complains about the ascii error.
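
A minimal sketch of the decode-then-encode step on its own, using the sample street name from the prior message. It assumes the database returns UTF-8 byte strings, which has not been confirmed in this thread; the helper name to_file_bytes and the two constants are hypothetical and only serve the illustration.

```python
# Assumption: the database hands back UTF-8 byte strings; verify this for your setup.
DATABASE_ENCODING = "utf-8"
FILE_ENCODING = "cp1252"


def to_file_bytes(raw, database_encoding=DATABASE_ENCODING,
                  file_encoding=FILE_ENCODING):
    """Decode bytes from the database, then re-encode them for the output file."""
    text = raw.decode(database_encoding)  # byte string -> unicode text
    return text.encode(file_encoding)     # unicode text -> byte string for the file


# The sample street name as UTF-8 bytes: the e with diaeresis is 0xc3 0xab.
raw_name = b"P.J. No\xc3\xabl Bakerstraat"
converted = to_file_bytes(raw_name)
# In cp1252 the same character is the single byte 0xeb.
```

If this step works in isolation but the script still reports an ascii error, the failure may come from a different line. One possibility, stated here as an assumption, is that the file is still opened with codecs.open(): in Python 2 its writer tries to decode byte strings as ASCII before encoding them.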

Prior message:
import csv
import codecs
import locale

import wx  # wxPython; needed for wx.Frame below

# Global variables
bagObjecten = []
chartype = locale.getpreferredencoding()
#------------------------------------------------------------------------------
# BAGExtractPlus shows the main window of the BAG Extract+ tool
#------------------------------------------------------------------------------
class BAGExtractPlus(wx.Frame):

    #------------------------------------------------------------------------------
    # Write the records
    #------------------------------------------------------------------------------
    def schrijfExportRecord(self, verblijfhoofd, identificatie):

        sql1 = ("Select openbareruimtenaam, woonplaatsnaam from nummeraanduiding "
                "where identificatie = '" + identificatie + "'")
        num = database.select(sql1)
        for row in num:
            openbareruimtenaam1 = row[0]
            # This is the line that raises the UnicodeDecodeError
            openbareruimtenaam = u'' + (openbareruimtenaam1.encode(chartype))
            woonplaatsnaam1 = row[1]
            woonplaatsnaam = u'' + (woonplaatsnaam1.encode(chartype))
            newrow = [openbareruimtenaam, woonplaatsnaam]
            verblijfhoofd.writerow(newrow)

    #--------------------------------------------------------------------------------------
    # Export the required data
    #--------------------------------------------------------------------------------------
    def ExportBestanden(self, event):
        ofile = codecs.open(r'D:\bestanden\BAG\adrescoordinaten.csv', 'wb', chartype)
        verblijfhoofd = csv.writer(ofile, delimiter=',',
                                   quotechar='"', quoting=csv.QUOTE_NONNUMERIC)
        counterVBO = 2
        identificatie = '0014010011066771'
        while 1 < counterVBO:
            hulpIdentificatie = identificatie
            sql = "Select identificatie, hoofdadres, verblijfsobjectgeometrie from verblijfsobject where "
            sql = sql + "identificatie > '" + hulpIdentificatie + "'"
            vbo = database.select(sql)
            if not vbo:
                break
            else:
                for row in vbo:
                    identificatie = row[0]
                    verblijfobjectgeometrie = row[2]
                    self.schrijfExportRecord(verblijfhoofd, identificatie)

I highlighted in red the lines that I think are important.
When I try to convert openbareruimtenaam from the data below:
"P.J. Noël Bakerstraat";"Groningen"

I get the error:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 7: ordinal not in range(128)
for the line openbareruimtenaam=u'' + (openbareruimtenaam1.encode(chartype)).

I know that the default system codec is ascii and that chartype=b'cp1252'.
But how can I bypass the ascii encoding?
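
The position reported in the error can be checked against the sample data. The sketch below assumes the street name arrives as UTF-8 bytes (an assumption, although it is consistent with byte 0xc3 appearing at position 7) and reproduces only the failing ASCII decode, not the database call.

```python
# The sample street name, written out as the UTF-8 bytes assumed above.
raw_name = b"P.J. No\xc3\xabl Bakerstraat"

failure = None
try:
    # The same ASCII decode that Python 2 attempts implicitly when
    # .encode() is called on a byte string.
    raw_name.decode("ascii")
except UnicodeDecodeError as exc:
    failure = exc

position = failure.start                   # offset of the first non-ASCII byte
offending = bytearray(raw_name)[position]  # integer value of that byte
as_text = raw_name.decode("utf-8")         # succeeds when the bytes really are UTF-8
```

Here position is 7 and offending is 0xc3, matching the traceback: 0xc3 is the first of the two bytes that UTF-8 uses for the e with diaeresis in "Noël".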
