How do I encode and decode this data to write to a file?

cl at isbd.net
Mon Apr 29 08:59:26 EDT 2013


Dave Angel <davea at davea.name> wrote:
> On 04/29/2013 05:47 AM, cl at isbd.net wrote:
> 
> A couple of generic comments:  your email program made a mess of the 
> traceback by appending each source line to the location information.
> 
What's my email program got to do with it?  :-)   I'm using a dedicated
newsreader (tin) as I posted via the gmane/usenet interface.  The posting
looks perfectly OK to me when I read it back from usenet.


> Please mention your Python version & OS.  Apparently you're running 2.7 
> on Linux or similar.
> 
Sorry, yes you're spot on.


> > I am debugging some code that creates a static HTML gallery from a
> > directory hierarchy full of images. It's this package:-
> >      https://pypi.python.org/pypi/Gallery2.py/2.0
> >
> >
> > It's basically working and does pretty much what I want so I'm happy to
> > put some effort into it and fix things.
> >
> > The problem I'm currently chasing is that it can't cope with directory
> > names that have accented characters in them; it fails when it tries to
> > write the HTML that creates the page with the thumbnails on.
> >
> > The code that's failing is:-
> >
> >          raw = os.path.join(directory, self.getNameNoExtension()) + ".html"
> >          file = open(raw, "w")
> >          file.write("".join(html).encode('utf-8'))
> 
> You can't encode byte data, it's already encoded. So you're forcing the 
> Python system to implicitly decode it (using ASCII codec) before letting 
> you encode it to utf-8.  If you think it's already in utf-8, then omit 
> the encode() call there.
> 
That's how the code was when I installed it from PyPI.  What you say
makes a lot of sense though, so I'll remove the encode() call.
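
For what it's worth, the implicit step described above can be sketched
roughly as follows.  This is only an illustration under assumptions: the
helper name encode_like_python2 and the sample strings are made up, and the
ASCII decode is written out explicitly so the snippet behaves the same on
Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-
# Sketch of what Python 2 effectively does for byte_string.encode('utf-8'):
# it first decodes the bytes with the default ASCII codec, then encodes.
# The helper name and the sample data are hypothetical, for illustration only.

def encode_like_python2(byte_data):
    """Mimic the implicit ASCII decode followed by a UTF-8 encode."""
    text = byte_data.decode("ascii")   # the hidden step; fails on non-ASCII bytes
    return text.encode("utf-8")

plain = b"<h1>holiday</h1>"                        # pure ASCII bytes
accented = u"<h1>caf\u00e9</h1>".encode("utf-8")   # already UTF-8 bytes

ok_result = encode_like_python2(plain)             # works, nothing to trip over

try:
    encode_like_python2(accented)
    failed = False
    error_codec = None
except UnicodeDecodeError as exc:
    failed = True
    error_codec = exc.encoding                     # codec that rejected the bytes
```

The ASCII-only data passes through unchanged, while the accented data fails
in the decode step, before the UTF-8 encode is ever reached.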


> Additionally, you can debug things with some simple print statements, at 
> least if you decompose your 3-function line so you can get at the 
> intermediate data.  Split the line into three parts:
>      temp1 = "".join(html)     #temp1 is byte data
>      temp2 = temp1.decode()    #temp2 is unicode data (default ASCII codec)
>      temp3 = temp2.encode("utf-8")  #temp3 is byte data again
>      file.write(temp3)
> 
OK, thanks for this and all the other advice on this thread.
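
A minimal sketch of one way to make both conversions explicit, rather than
relying on the default codec, might look like the following.  It assumes the
HTML fragments are UTF-8 bytes (which I haven't verified for Gallery2.py);
the function name write_html, the source_encoding parameter and the sample
data are all hypothetical.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

def write_html(path, fragments, source_encoding="utf-8"):
    """Join byte or text fragments and write them to path as UTF-8.

    source_encoding is an assumption about how the byte fragments were
    encoded; it is not something the original code states.
    """
    parts = []
    for fragment in fragments:
        if isinstance(fragment, bytes):
            # Decode explicitly instead of relying on the ASCII default.
            fragment = fragment.decode(source_encoding)
        parts.append(fragment)
    # io.open encodes the text on the way out, so no .encode() call is needed.
    with io.open(path, "w", encoding="utf-8") as handle:
        handle.write(u"".join(parts))

# Made-up sample data mixing byte and text fragments.
html = [b"<html><body>", u"caf\u00e9".encode("utf-8"), u"</body></html>"]
out_path = os.path.join(tempfile.mkdtemp(), "index.html")
write_html(out_path, html)

# Read the file back as raw bytes to see what actually landed on disk.
with io.open(out_path, "rb") as handle:
    written = handle.read()
```

Decoding at the point where the data comes in and encoding only at the point
where the file is written keeps everything in between as unicode, which
should avoid the implicit ASCII step altogether.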

-- 
Chris Green


