awesome slugify and unicode
wxjmfauth at gmail.com
Thu Jan 23 05:41:47 EST 2014
On Thursday, 23 January 2014 10:14:48 UTC+1, Mark Lawrence wrote:
> On 23/01/2014 07:18, wxjmfauth at gmail.com wrote:
> > On Wednesday, 22 January 2014 20:23:55 UTC+1, Mark Lawrence wrote:
> >> I thought this blog might interest some of you
> >>
> >> http://pydanny.com/awesome-slugify-human-readable-url-slugs-from-any-string.html
> >>
> >> My fellow Pythonistas, ask not what our language can do for you, ask
> >> what you can do for our language.
> >
> > This is not "unicode", only string manipulations.
> > The same work could be done with, let say, cp1252.
> > The difference lies in the repertoires of characters
> > to be handled.
> >
> > A better way is to work with normalization() and/or
> > with methods like .translate() with dedicated
> > tables; the hard task being the creation of these tables.
> >
> > Shortly, very naive.
> >
> > jmf
>
> You'll have to excuse my ignorance of this stuff. How do I express the
> following in cp1252?
>
> def test_musical_notes():
>     txt = "Is ♬ ♫ ♪ ♩ a melody or just noise?"
>     assert slugify(txt) == "Is-a-melody-or-just-noise"
>     assert slugify_unicode(txt) == "Is-a-melody-or-just-noise"
>
> --
> My fellow Pythonistas, ask not what our language can do for you, ask
> what you can do for our language.

I wrote: "The same work could be done with, let say, cp1252."
Read that as: the same work (string manipulation) ...
Wouldn't something like this be more informative?
>>> "Is ♬ ♫ ♪ ♩ a melody or just noise?".encode('ascii', 'replace').decode('ascii')
'Is ? ? ? ? a melody or just noise?'
The cp1252 analogy, with a character ('€') that cp1252 can encode but ASCII cannot:
>>> 'abc€€€'.encode('cp1252').decode('ascii', 'replace').encode('ascii', 'replace').decode('ascii')
'abc???'
Again, this is not a "unicode" question; it is more a question of how to handle strings in a judicious way.
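A minimal sketch of the normalization and .translate() approach mentioned in the quoted message, assuming Python 3. The names naive_slug and EXTRA are hypothetical and are not part of awesome-slugify; the table holds only a few sample entries, and building a complete one is the hard part.

```python
import re
import unicodedata

# Hypothetical hand-made table for characters that NFKD cannot decompose.
# Only a few sample entries; a real table would be far larger.
EXTRA = str.maketrans({"ß": "ss", "ø": "o", "Ø": "O", "€": "EUR"})

def naive_slug(text, sep="-"):
    # Apply the dedicated table first, then split accented letters
    # into a base letter plus combining marks.
    text = unicodedata.normalize("NFKD", text.translate(EXTRA))
    # Drop whatever is still outside ASCII (combining marks, ♬ ♫ ♪ ♩, ...).
    text = text.encode("ascii", "ignore").decode("ascii")
    # Collapse each run of non-alphanumeric characters into one separator.
    return re.sub(r"[^A-Za-z0-9]+", sep, text).strip(sep)
```

With this sketch, naive_slug("Is ♬ ♫ ♪ ♩ a melody or just noise?") gives 'Is-a-melody-or-just-noise', and naive_slug("Crème brûlée") gives 'Creme-brulee'.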
jmf