Strange thing with types

TYR a.harrowell at gmail.com
Thu May 29 10:16:47 EDT 2008


On May 29, 2:23 pm, "Diez B. Roggisch" <de... at nospam.web.de> wrote:
> TYR wrote:
> > I'm doing some data normalisation, which involves data from a Web site
> > being extracted with BeautifulSoup, cleaned up with a regex, then
> > having the current year as returned by time()'s tm_year attribute
> > inserted, before the data is concatenated with string.join() and fed
> > to time.strptime().
>
> > Here's some code:
> > timeinput = re.split('[\s:-]', rawtime)
> > print timeinput #trace statement
> > print year #trace statement
> > t = timeinput.insert(2, year)
> > print t #trace statement
> > t1 = string.join(t, '')
> > timeobject = time.strptime(t1, "%d %b %Y %H %M")
>
> > year is a Unicode string; so is the data in rawtime (BeautifulSoup
> > gives you Unicode, dammit). And here's the output:
>
> > [u'29', u'May', u'01', u'00'] (OK, so the regex is working)
> > 2008 (OK, so the year is a year)
> > None (...but what's this?)
> > Traceback (most recent call last):
> >   File "bothv2.py", line 71, in <module>
> >     t1 = string.join(t, '')
> >   File "/usr/lib/python2.5/string.py", line 316, in join
> >     return sep.join(words)
> > TypeError
>
> First - don't use module string anymore. Use e.g.
>
> ''.join(t)
>
> Second, you can only join strings, but year is an integer. So convert it to
> a string first:
>
> t = timeinput.insert(2, str(year))
>
> Diez

Yes, tm_year is converted to a unicode string elsewhere in the program.
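
For what it's worth, the None in the trace is consistent with list.insert() modifying the list in place and returning None, so t is None by the time it reaches join(). A minimal sketch of that behaviour follows; the values of rawtime and year are hypothetical, chosen only to reproduce the trace output quoted above, and joining with a space rather than '' is an assumption made so that the string matches the separators in the strptime() format.

```python
import re
import time

# Hypothetical inputs, chosen to match the trace output quoted above.
rawtime = u'29 May 01:00'
year = u'2008'

timeinput = re.split(r'[\s:-]', rawtime)

# list.insert() changes the list in place and returns None,
# so its return value should not be kept and passed to join().
result = timeinput.insert(2, year)

# Join the modified list itself. The space separator is an assumption,
# used here so the string lines up with the format string below.
t1 = u' '.join(timeinput)
timeobject = time.strptime(t1, "%d %b %Y %H %M")
```

Here result is None, while timeinput itself now holds the year at index 2. Note that %b depends on the locale, so "May" parses only where the locale's abbreviated month names are English.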


