Question about working with html entities in python 2 to use them as filenames

Steve D'Aprano steve+python at pearwood.info
Tue Nov 22 20:15:15 EST 2016


On Wed, 23 Nov 2016 07:54 am, Steven Truppe wrote:

> I've made a pastebin with a few examples: http://pastebin.com/QQQFhkRg

Your pastebin appears to contain the same code you've already shown here.
And, again, it doesn't seem to be the code you are actually running.

The only new or helpful information is that you're trying to call

os.mkdir(title)

where title is, well, we have to guess, because you don't tell us.

My *guess* is that the failure happens when processing the title:

"Wizo - Bleib Tapfer / für'n Arsch Full Album - YouTube"

but since we really don't know the actual code you are using, we have to
guess.

My guess is that you have the byte string:

title = "Wizo - Bleib Tapfer / f\xc3\xbcr'n Arsch Full Album - YouTube"

Notice the \xc3 byte? You can check this by printing the repr() of the
title:

py> print repr(title)
"Wizo - Bleib Tapfer / f\xc3\xbcr'n Arsch Full Album - YouTube"



When you try title.decode(), it fails:

py> print title.decode()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 23:
ordinal not in range(128)



But if you tell it to use UTF-8, it succeeds:

py> print title.decode('utf-8')
Wizo - Bleib Tapfer / für'n Arsch Full Album - YouTube
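
Here is the same experiment as a small script, in case that is easier to
run than the interactive session. It is only a sketch: the title bytes are
my guess at your data, and the names ascii_ok, bad_position and text are
made up for illustration. The b"..." prefix keeps it a byte string on both
Python 2 and 3.

```python
# Guessed title bytes (an assumption, not taken from your real code).
title = b"Wizo - Bleib Tapfer / f\xc3\xbcr'n Arsch Full Album - YouTube"

try:
    # In Python 2 a bare title.decode() uses the ASCII codec like this.
    title.decode('ascii')
    ascii_ok = True
except UnicodeDecodeError as err:
    ascii_ok = False
    bad_position = err.start  # index of the first byte ASCII cannot decode

# Naming the real encoding works: \xc3\xbc is UTF-8 for u-umlaut.
text = title.decode('utf-8')
```

The two bytes \xc3\xbc decode to a single character, so text is one
character shorter than title.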



So my guess is that you're trying to create the directory like this:


os.mkdir(title.decode())


and getting the same error. You should try:

os.mkdir(title.decode('utf-8'))

which will at least give you a new error: '/' is the path separator, so
you cannot use it inside a directory name. So you can start by doing this:


os.mkdir(title.replace('/', '-').decode('utf-8'))


and see what happens.
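
If you want that wrapped up in a function, something along these lines
might do as a starting point. Again, a sketch under assumptions: I am
assuming the title arrives as UTF-8 bytes, and safe_dirname and
make_title_dir are hypothetical helper names, not anything from your code
or the standard library. It only deals with '/'; other characters your
filesystem dislikes are not handled.

```python
import os

def safe_dirname(raw_title, encoding='utf-8'):
    """Decode title bytes and replace the path separator (hypothetical helper)."""
    text = raw_title.decode(encoding)   # assumes the bytes really are UTF-8
    return text.replace(u'/', u'-')     # '/' cannot appear inside a name

def make_title_dir(raw_title, parent):
    """Create a directory named after the title inside parent; return its path."""
    path = os.path.join(parent, safe_dirname(raw_title))
    os.mkdir(path)                      # raises OSError if it already exists
    return path
```

You would call it as make_title_dir(title, some_existing_directory). If the
bytes turn out not to be UTF-8, the decode step raises UnicodeDecodeError,
which at least tells you which assumption was wrong.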


Beyond that, I cannot guess what you need to do to fix the code I haven't
seen.



-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.



