str.title() fails with words containing apostrophes

Chris Angelico rosuav at gmail.com
Mon Mar 6 06:07:33 EST 2017


On Mon, Mar 6, 2017 at 9:04 PM, Peter Otten <__peter__ at web.de> wrote:
> Perhaps one could limit the conversion to go from lower to upper only, as
> names tend to be in the desired case in the original text.

No, that just tends to make things confusing to use.

> Unfortunately this won't help with
>
>>>> title("admiral von schneider")
> 'Admiral Von Schneider'  # von should be lower case

On Mon, Mar 6, 2017 at 8:52 PM, Jussi Piitulainen
<jussi.piitulainen at helsinki.fi> wrote:
> It also will capitalize all the little words in the string that are
> usually not capitalized in titles, even in the usual headlinese English
> variants. And all the acronyms and such that are usually written in all
> caps, or in even odder patterns.

Right. If you want true title casing, it has to be *extremely*
linguistically-aware. Each of these highlights the fact that "title
case" does not truly equate to "capitalize each whitespace-delimited
word", so it's going to need some sort of intelligence.
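For anyone following along, here is a quick illustration of what the
stdlib currently does. str.title() treats any non-letter (including an
apostrophe) as a word boundary, while string.capwords() splits on
whitespace only. Neither knows anything about language, so both get
"von" and "CSS" wrong. The sample strings are just the ones from this
thread.

```python
import string

samples = ["don't panic", "admiral von schneider", "how to teach CSS"]

for s in samples:
    # str.title(): every run of letters is a "word", so the apostrophe
    # in "don't" starts a new word and the "t" gets upper-cased.
    print(repr(s.title()))
    # string.capwords(): split on whitespace, str.capitalize() each
    # piece, re-join with single spaces. Fixes "Don'T" but nothing else.
    print(repr(string.capwords(s)))
```

Note that both functions lower-case everything after the first letter
of a word, which is why "CSS" comes back as "Css".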

There's probably a linguistic library out there that does all of this,
but it doesn't need to be in the stdlib. I am a little surprised by
the "Don'T" from the OP, but I'm not at all surprised by "Admiral Von
Schneider", "How To Teach Css", or other "anomalies".
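To show how quickly this turns into a word-list problem, here is a
rough sketch of a hand-rolled version. Everything in it is an
assumption made up for illustration: the name naive_title and the
SMALL_WORDS and ACRONYMS sets are hypothetical, and the lists are
nowhere near complete. It only upper-cases the first character of each
word (the "lower to upper only" idea Peter mentioned), so existing
capitals are left alone.

```python
# Hypothetical exception lists, for illustration only. Real title
# casing needs far more linguistic knowledge than two small sets.
SMALL_WORDS = {"a", "an", "and", "of", "the", "to", "von"}
ACRONYMS = {"css", "html"}


def naive_title(text):
    """Capitalize whitespace-delimited words, with a few exceptions."""
    result = []
    for i, word in enumerate(text.split()):
        lower = word.lower()
        if lower in ACRONYMS:
            # Known acronyms are written in all caps.
            result.append(lower.upper())
        elif i > 0 and lower in SMALL_WORDS:
            # Small words stay lower case unless they start the string.
            result.append(lower)
        else:
            # Upper-case the first character only; leave the rest as is.
            result.append(word[:1].upper() + word[1:])
    return " ".join(result)


print(naive_title("don't panic"))
print(naive_title("admiral von schneider"))
print(naive_title("how to teach css"))
```

This handles the three examples from the thread, but it breaks as soon
as a word carries punctuation ("css," is not in ACRONYMS), a name is
not in the list, or the text is not English. That is the
"intelligence" part.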

Still, it's fun to discuss, if only to show why that kind of
locale-aware transformation is important.

ChrisA


