Re: getting rid of —

Tep petshmidt at googlemail.com
Thu Jul 2 04:31:46 EDT 2009


On 2 Jul., 10:25, Tep <petshm... at googlemail.com> wrote:
> On 2 Jul., 01:56, MRAB <pyt... at mrabarnett.plus.com> wrote:
>
> > someone wrote:
> > > Hello,
>
> > > How can I remove the '—' sign from a string? Or split at that character?
> > > I get a Unicode error if I try to do it:
>
> > > UnicodeDecodeError: 'ascii' codec can't decode byte 0x97 in position
> > > 1: ordinal not in range(128)
>
> > > Thanks, Pet
>
> > > script is # -*- coding: UTF-8 -*-
>
> > It sounds like you're mixing bytestrings with Unicode strings. I can't
> > be any more helpful because you haven't shown the code.
>
> Oh, I'm sorry. Here it is:
>
> def cleanInput(input):
>     return input.replace('—', '')

I also need:

# input is HTML source code; I only have a problem with this one character
# input = 'foo — bar'
# return value should be 'foo'
def splitInput(input):
    parts = input.split(' — ')
    return parts[0]


Thanks!
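
For reference, here is a minimal sketch of the "don't mix bytestrings with Unicode strings" advice quoted above. It rests on an assumption: byte 0x97 is the em dash in Windows-1252 (cp1252), so the HTML is guessed to be cp1252-encoded; if the page declares another charset, pass that instead. The names to_text, clean_input, split_input and EM_DASH are made up for illustration and are not from the original script. The idea is to decode once at the boundary and then work only with Unicode text, writing the dash as the escape \u2014 so the result does not depend on how the script file itself is saved.

```python
# -*- coding: utf-8 -*-
# Hypothetical helpers; the cp1252 default is an assumption about the input.

EM_DASH = u"\u2014"  # em dash written as an escape, not as a literal character


def to_text(data, encoding="cp1252"):
    """Decode a byte string to Unicode text; pass text through unchanged."""
    if isinstance(data, bytes):
        return data.decode(encoding)
    return data


def clean_input(data, encoding="cp1252"):
    """Remove every em dash from the input."""
    return to_text(data, encoding).replace(EM_DASH, u"")


def split_input(data, encoding="cp1252"):
    """Return the part of the input before the first ' <em dash> '."""
    parts = to_text(data, encoding).split(u" " + EM_DASH + u" ")
    return parts[0]


raw = b"foo \x97 bar"    # bytes as they might arrive from a cp1252 page
print(split_input(raw))  # foo
```

Because both the input and the search pattern are Unicode by the time replace() or split() runs, no implicit ASCII decoding takes place, which is what raises the UnicodeDecodeError shown above.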


