Turning greek-iso filenames => utf-8 iso

Νικόλαος Κούρας support at superhost.gr
Wed Jun 12 08:42:59 EDT 2013


On 12/6/2013 2:32 PM, Νικόλαος Κούρας wrote:
> On 12/6/2013 12:37 PM, Steven D'Aprano wrote:
>> On Wed, 12 Jun 2013 12:24:24 +0300, Νικόλαος Κούρας wrote:
>
>>
>> [1] Correction. While I was typing this, it came good, for about 20
>> seconds, and displayed a hideously ugly background pattern and a cute
>> smiling face waving, and then broke again.
>
>
> Ah sorry Steven, I made the change of:
>
> host = socket.gethostbyaddr( os.environ['REMOTE_ADDR'] )[0] or 'UnResolved'
>
> to metrites.py instead of files.py.
>
> Now I have made both changes.
>
> You can see the webpage now, yes, the one with the cute smiling face.
>
> Behind that we should be seeing all the files in a table-like format
> for users to download; instead, not a single file is being displayed.
>
> Here is the printing code; actually, here is everything I have up until
> now after the modifications, for you to take an overall look.
>
> # =====================================================================
> # Convert wrongly encoded filenames to utf-8
> # =====================================================================
>
> path = b'/home/nikos/public_html/data/apps/'
> filenames = os.listdir( path )
>
> utf8_filenames = []
>
> for filename in filenames:
>      # Compute 'path/to/filename'
>      filename_bytes = path + filename
>      encoding = guess_encoding( filename_bytes )
>
>      if encoding == 'utf-8':
>          # File name is valid UTF-8, so we can skip to the next file.
>          utf8_filenames.append( filename_bytes )
>          continue
>      elif encoding is None:
>          # No idea what the encoding is. Hit it with a hammer until it
>          # stops moving. 'xmlcharrefreplace' is only supported when
>          # encoding, so 'replace' is used for decoding here.
>          filename = filename_bytes.decode( 'utf-8', 'replace' )
>      else:
>          filename = filename_bytes.decode( encoding )
>
>      # Rename the file to something which ought to be UTF-8 clean.
>      newname_bytes = filename.encode('utf-8')
>      os.rename( filename_bytes, newname_bytes )
>      utf8_filenames.append( newname_bytes )
>
>      # Once we get here, the file ought to be UTF-8 clean and the
>      # Unicode name ought to exist:
>      assert os.path.exists( newname_bytes.decode('utf-8') )
>
>
> # Switch filenames from utf8 bytestrings => unicode strings
> filenames = []
>
> for utf8_filename in utf8_filenames:
>      filenames.append( utf8_filename.decode('utf-8') )
>
> # Check the presence of a database file against the dir files and
> # delete the record if it doesn't exist
> cur.execute('''SELECT url FROM files''')
> data = cur.fetchall()
>
> for url in data:
>      if url not in filenames:
>          # Delete spurious
>          cur.execute('''DELETE FROM files WHERE url = %s''', url )
>
>
> # =====================================================================
> # Display ALL files, each with its own download button
> # =====================================================================
>
> print('''<body background='/data/images/star.jpg'>
>           <center><img src='/data/images/download.gif'><br><br>
>           <table border=5 cellpadding=5 bgcolor=green>
> ''')
>
> try:
>      cur.execute( '''SELECT * FROM files ORDER BY lastvisit DESC''' )
>      data = cur.fetchall()
>
>      for row in data:
>          (filename, hits, host, lastvisit) = row
>          lastvisit = lastvisit.strftime('%A %e %b, %H:%M')
>
>          print('''
>          <form method="get" action="/cgi-bin/files.py">
>              <tr>
>                  <td> <center> <input type="submit" name="filename" value="%s"> </td>
>                  <td> <center> <font color=yellow size=5> %s </td>
>                  <td> <center> <font color=orange size=4> %s </td>
>                  <td> <center> <font color=silver size=4> %s </td>
>              </tr>
>          </form>
>          ''' % (filename, hits, host, lastvisit) )
>      print( '''</table><br><br>''' )
> except pymysql.ProgrammingError as e:
>      print( repr(e) )
>
> sys.exit(0)
>
> ==========
> I'm happy that at least it does not error out!
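
A side note on the host = ... line quoted above: as far as I understand
it, socket.gethostbyaddr() raises socket.herror when an address has no
reverse DNS entry instead of returning an empty name, so the
or 'UnResolved' fallback may never be reached. A minimal sketch of a
more forgiving lookup, assuming a helper called resolve_host (a
hypothetical name, it is not in either script):

```python
import os
import socket

def resolve_host(environ=os.environ):
    """Return the client's hostname, or 'UnResolved' if the lookup fails."""
    try:
        # KeyError: REMOTE_ADDR is missing (e.g. script run outside CGI).
        # OSError: covers socket.herror and socket.gaierror on Python 3.3+.
        return socket.gethostbyaddr(environ['REMOTE_ADDR'])[0] or 'UnResolved'
    except (KeyError, OSError):
        return 'UnResolved'

host = resolve_host()
```

Passing the environment in as an argument is only there so the function
can be tried without a web server.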


Can you shed some light please as to what the problem might be, and why
it's not displaying the filenames, since they are UTF-8 encoded correctly?
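
For reference, guess_encoding() is called in the listing above but its
body is not shown. A minimal sketch of what such a helper might look
like, assuming the only candidates are UTF-8 and the Greek ISO-8859-7
charset (the candidate list and the body are my assumption, not the
actual helper):

```python
def guess_encoding(name_bytes, candidates=('utf-8', 'iso-8859-7')):
    """Return the first candidate that decodes name_bytes, else None."""
    for encoding in candidates:
        try:
            name_bytes.decode(encoding)
        except UnicodeDecodeError:
            continue
        return encoding
    return None

# An invented Greek file name, encoded both ways for comparison.
greek = 'Ευχή.mp3'
print(guess_encoding(greek.encode('utf-8')))
print(guess_encoding(greek.encode('iso-8859-7')))
```

UTF-8 has to be tried first: almost any byte string decodes without
error under a single-byte charset, so the order of the candidates
decides the answer.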

Something you want me to try?
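
One thing that may be worth checking is the cleanup loop. As far as I
know, pymysql's default cursor returns each row from fetchall() as a
tuple, so url in "for url in data" would be a one-element tuple such as
('file.mp3',), which never equals a plain string in filenames. The
filenames list also seems to hold full paths (path + filename) rather
than bare names. If both are true, every record gets deleted and the
table comes out empty. A small sketch with made-up rows and no database
involved, assuming the url column stores bare file names:

```python
import os

# Stand-ins for the real data; these names are invented for illustration.
rows = [('Ευχή.mp3',), ('missing.exe',)]      # the shape of cur.fetchall()
filenames = ['/home/nikos/public_html/data/apps/Ευχή.mp3']

# As written in the script: each row is a tuple, so nothing ever matches.
deleted_as_written = [row for row in rows if row not in filenames]

# Unpack the single column and compare bare names instead of full paths.
basenames = {os.path.basename(name) for name in filenames}
deleted_fixed = [url for (url,) in rows if url not in basenames]

print(deleted_as_written)
print(deleted_fixed)
```

With the loop as written both rows are flagged for deletion; after
unpacking and comparing base names only the row without a matching file
is flagged.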


