From ml at mherrn.de Mon Nov 30 09:33:49 2015
From: ml at mherrn.de (ml at mherrn.de)
Date: Mon, 30 Nov 2015 15:33:49 +0100
Subject: [Moin-user] Problem with encoding URLS
Message-ID: <52ad129bda00f44dcd2f20c40f353a59.squirrel@mail.sout.de>

Hi,

I have just moved my wiki to another server. Unfortunately, URLs with
special characters (for example German umlauts) don't work anymore.

I have a page with the name "Töst". It contains the German umlaut "ö".
The encoded URL looks like "https://mywiki.com/T%C3%B6st"

When trying to access this page, the Apache log tells me:

-----/-----
mod_wsgi (pid=31988): Exception occurred processing WSGI script '/home/user/www/moin/moin.wsgi'.
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/werkzeug/wsgi.py", line 567, in __call__
    cleaned_path = cleaned_path.encode(sys.getfilesystemencoding())
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in position 2: ordinal not in range(128)
-----/-----

What is going on here? Do I have a misconfiguration in MoinMoin? Why is it
trying to encode it in ASCII?

Any help is appreciated.

Marco

From paul at boddie.org.uk Mon Nov 30 10:49:39 2015
From: paul at boddie.org.uk (Paul Boddie)
Date: Mon, 30 Nov 2015 16:49:39 +0100
Subject: [Moin-user] Problem with encoding URLS
In-Reply-To: <52ad129bda00f44dcd2f20c40f353a59.squirrel@mail.sout.de>
References: <52ad129bda00f44dcd2f20c40f353a59.squirrel@mail.sout.de>
Message-ID: <201511301649.39859.paul@boddie.org.uk>

On Monday 30. November 2015 15.33.49 ml at mherrn.de wrote:
> Hi,
>
> I have just moved my wiki to another server. Unfortunately, URLs with
> special characters (for example German umlauts) don't work anymore.
>
> I have a page with the name "Töst". It contains the German umlaut "ö".
> The encoded URL looks like "https://mywiki.com/T%C3%B6st"

So, that's the page name URL-encoded, with the original character values
being represented using UTF-8. In short...

Töst -> 84 (T), 195, 182, 115 (s), 116 (t)   (decimal values)
        54 (T), C3, B6, 73 (s), 74 (t)       (hex values)
     -> T%C3%B6st

Such encoding is perfectly reasonable, since the W3C never got round to
specifying non-ASCII characters in URLs, or at least not properly.

> When trying to access this page, the Apache log tells me:
>
> -----/-----
> mod_wsgi (pid=31988): Exception occurred processing WSGI script
> '/home/user/www/moin/moin.wsgi'.
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/dist-packages/werkzeug/wsgi.py", line 567, in
> __call__
>     cleaned_path = cleaned_path.encode(sys.getfilesystemencoding())
> UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in
> position 2: ordinal not in range(128)
> -----/-----
>
> What is going on here? Do I have a misconfiguration in MoinMoin? Why is it
> trying to encode it in ASCII?

Here, I imagine that your locale setting isn't helping. The traceback shows
Werkzeug encoding the path with whatever sys.getfilesystemencoding() returns,
and the 'ascii' codec in the error suggests that this is an ASCII encoding on
the new server. What do you get at the Python prompt if you call
sys.getfilesystemencoding()?

You may need to look at your system's default locale and/or the locale of the
user the wiki runs as, I guess.

I see someone else has experienced this recently, too:

https://moinmo.in/MoinMoinBugs/1.9.8NonAsciiURL-UnicodeEncodeError

Paul
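
For anyone wanting to check the byte values above and reproduce the error
outside the wiki, here is a minimal sketch. It is written for Python 3, not
the Python 2.7 seen in the traceback, so the reported filesystem encoding may
differ from what mod_wsgi sees on the affected server. The helper names
(utf8_values, percent_encode, try_encode) are made up for illustration and
are not part of MoinMoin or Werkzeug.

```python
import locale
import sys
from urllib.parse import quote, unquote


def utf8_values(name):
    """Return the UTF-8 byte values of a page name (hypothetical helper)."""
    return list(name.encode("utf-8"))


def percent_encode(name):
    """Percent-encode a page name; quote() encodes as UTF-8 by default."""
    return quote(name)


def try_encode(path, encoding):
    """Encode a path with the given codec (hypothetical helper).

    Returns the UnicodeEncodeError on failure, or None on success.
    """
    try:
        path.encode(encoding)
    except UnicodeEncodeError as exc:
        return exc
    return None


if __name__ == "__main__":
    name = "T\u00f6st"  # the page name, with o-umlaut written as an escape

    print(utf8_values(name))                      # decimal byte values
    print([format(b, "X") for b in utf8_values(name)])  # hex byte values
    print(percent_encode(name))                   # URL-encoded form
    assert unquote(percent_encode(name)) == name  # the encoding round-trips

    # Encoding the decoded path as ASCII fails in the same way as the log.
    err = try_encode("/" + name, "ascii")
    print(err.encoding, err.start)

    # What this interpreter thinks the environment's encodings are.
    print(sys.getfilesystemencoding())
    print(locale.getpreferredencoding(False))
```

Running the last two calls as the same user, and with the same environment,
that Apache uses is what matters here; an interactive shell with a UTF-8
locale may well report something different from the mod_wsgi process.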