str.title question after '

Leo Kislov Leo.Kislov at gmail.com
Mon Nov 13 06:23:35 EST 2006


Antoon Pardon wrote:
> I have a text in ASCII. I use the ' for an apostrophe. The problem is
> that this does not work well with the title method. I don't want letters
> after a ' to be uppercased. Here are some examples:
>
>    argument       result          expected
>
>   't smidje       'T Smidje       't Smidje
>   na'ama          Na'Ama          Na'ama
>   al pi tnu'at    Al Pi Tnu'At    Al Pi Tnu'at
>
>
> Is there an easy way to get what I want?

import re

def title_words(s):
    # The capturing group keeps the whitespace runs, so spacing survives the join.
    words = re.split(r'(\s+)', s)
    return ''.join(word[0:1].upper() + word[1:] for word in words)
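
To check this against the three examples from the question, here is a
small self-contained sketch. It repeats the function so it can be run on
its own and prints the result of str.title() next to it. Treat it as an
illustration of the idea, not a drop-in replacement for title().

```python
import re

def title_words(s):
    # Split on whitespace runs; the capturing group keeps the separators.
    words = re.split(r'(\s+)', s)
    # word[0:1] rather than word[0], so empty strings (for example from
    # leading whitespace) do not raise IndexError.
    return ''.join(word[0:1].upper() + word[1:] for word in words)

for text in ["'t smidje", "na'ama", "al pi tnu'at"]:
    # Left: built-in behaviour. Right: whitespace-only word boundaries.
    print(repr(text.title()), repr(title_words(text)))
```

Note that, unlike title(), this only touches the first character of each
whitespace-separated word and leaves the rest alone: "hELLO" stays
"HELLO" instead of becoming "Hello".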

>
> Should the current behaviour be considered a bug?

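I believe it is by design: title() treats each run of consecutive letters
as a word, so ' acts as a word boundary, much as it is not matched by \w
in the re module.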

> I would be inclined to answer yes, but that may be
> because this behaviour would be wrong in Dutch. I'm
> not so sure about English.

The problem is more complicated. First of all, why should title() be
limited to human languages? What about programming languages? Is
"bar.bar.spam" three tokens or one in some foo programming language? There
are problems with human languages too: how are you going to
process "out-of-the-box" and "italian-american"?
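
To make the ambiguity concrete, here is a rough sketch in which the
caller decides what counts as a word boundary. The name title_tokens and
its boundary parameter are made up for this illustration; nothing like
them exists in the standard library. It assumes the boundary pattern
contains no capturing groups of its own.

```python
import re

def title_tokens(s, boundary=r'\s+'):
    # Hypothetical helper: 'boundary' is a regex describing what separates
    # tokens. Wrapping it in a group makes re.split keep the separators.
    parts = re.split('(' + boundary + ')', s)
    return ''.join(p[0:1].upper() + p[1:] for p in parts)

for text in ["bar.bar.spam", "out-of-the-box", "italian-american"]:
    print(text.title())                    # every run of letters is a word
    print(title_tokens(text))              # only whitespace separates words
    print(title_tokens(text, r'[\s.-]+'))  # whitespace, dots and hyphens
```

With whitespace as the only boundary, "italian-american" becomes
"Italian-american"; with hyphens added it becomes "Italian-American".
Which of the two is right depends on the language and the text, which is
why a single built-in rule cannot please everyone.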

  -- Leo



