[omaha] Web scraping and funky characters

Bob Haffner bob.haffner at gmail.com
Wed Nov 5 05:24:10 CET 2014


Hey Jeff, thanks for the input.

I'm using requests.  As for the message, it's plain text.  I'm assuming I
have to do that because I'm sending a mix of texts and emails.

When I did encode it, the crashes stopped, but the funky output persisted.
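
One possible reason the output stays funky after encoding: the bytes are
UTF-8, but nothing in the message tells the mail client so. Below is a
minimal sketch of declaring the charset on a plain-text message. It is
written for Python 3 (the script in this thread runs on 2.7, where the
details differ), and the function name and the example addresses are made
up for illustration.

```python
from email.header import Header
from email.mime.text import MIMEText


def build_plain_text_message(subject, body, sender, recipient):
    """Build a plain-text email whose headers declare UTF-8.

    Hypothetical helper, not from the original script.
    """
    # The third argument sets the charset in the Content-Type header,
    # so the receiving client knows how to decode the body.
    msg = MIMEText(body, "plain", "utf-8")
    # Non-ASCII subjects need their own encoding, separate from the body.
    msg["Subject"] = Header(subject, "utf-8")
    msg["From"] = sender
    msg["To"] = recipient
    return msg


# Example addresses are placeholders.
msg = build_plain_text_message(
    u"Daily check",
    u"Size: 4 \u00d7 6",
    "sender@example.com",
    "recipient@example.com",
)
```

The resulting object can be handed to smtplib in the usual way. Whether a
phone carrier's email-to-text gateway honors the charset is a separate
question and not something this sketch can guarantee.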

Yeah, I think that's exactly what's going on, meaning the copying and pasting
of some cool chars.
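
For the &#215 style of character mentioned in the original post, one
option is to decode HTML character references before searching for
keywords. A minimal sketch, assuming Python 3.4 or later, where
html.unescape is available (on 2.7 the closest counterpart I know of is
HTMLParser.HTMLParser().unescape). The function name and the sample string
are invented for illustration.

```python
import html


def clean_scraped_text(raw):
    """Replace character references such as &#215; with real characters.

    Hypothetical helper; `raw` would typically be text taken from
    the page, for example response.text from requests.
    """
    return html.unescape(raw)


snippet = "Lot size: 50 &#215; 120 &amp; up"
cleaned = clean_scraped_text(snippet)

# After decoding, a keyword search sees the multiplication sign itself
# (U+00D7) instead of the literal text "&#215;".
found = u"\u00d7" in cleaned
```

Decoding the references first means the keyword search and the outgoing
message both work on the same Unicode text.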





On Tue, Nov 4, 2014 at 9:25 PM, Jeff Hinrichs - DM&T <jeffh at dundeemt.com>
wrote:

> What library are you using?  Built-in, Beautiful Soup, requests?  And then
> are you sending the information in a plain-text email or an HTML email?
> Plain-text won't be able to display extended characters without some
> encoding in the message.  Others can chime in here, but it seems that you
> either have to transmit those chars in a format that can handle them or if
> you are using plain-text, remove them completely or substitute another
> character for them.   Ahh, the joys of glyphs - or forget English, everyone
> speak ASCII :)
>
> Almost as enjoyable as users who cut and paste random things from the
> internet and then save them to an accounting database via Windows, which
> speaks cp1252. :p
>
> On Tue, Nov 4, 2014 at 9:14 PM, Bob Haffner <bob.haffner at gmail.com> wrote:
>
> > Hey gang,
> >
> > I got a script running in 2.7 that checks a website daily, finds some info
> > and then sends some messages (email and text) through gmail.
> >
> > Pretty simple and runs without any problems... most of the time.
> >
> > When it does have problems it's usually because of some funky character
> > (ex: &#215) in the HTML.  These cause problems when searching for keywords
> > or with the gmail portion.
> >
> > Doing a .encode('utf-8') seemed to help with the crashing while sending,
> > but the characters still come across funny.
> >
> > Any advice?
> >
> > -bob
> > _______________________________________________
> > Omaha Python Users Group mailing list
> > Omaha at python.org
> > https://mail.python.org/mailman/listinfo/omaha
> > http://www.OmahaPython.org
> >
>
>
>
> --
> Best,
>
> Jeff Hinrichs
> 402.218.1473
>
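
Jeff's other suggestion, removing or substituting the extended characters,
might be the safer route for the text messages. A rough sketch of an ASCII
fallback follows. The substitution table is a made-up starting point rather
than a complete list, and the function name is hypothetical.

```python
import unicodedata

# Hand-picked replacements for characters that have no ASCII decomposition.
# This table is an assumption; extend it for whatever the site throws at you.
SUBSTITUTES = {
    u"\u00d7": u"x",   # multiplication sign
    u"\u2018": u"'",   # left single quote
    u"\u2019": u"'",   # right single quote
    u"\u201c": u'"',   # left double quote
    u"\u201d": u'"',   # right double quote
    u"\u2013": u"-",   # en dash
}


def to_ascii(text):
    """Return an ASCII-only approximation of `text`."""
    for char, replacement in SUBSTITUTES.items():
        text = text.replace(char, replacement)
    # NFKD splits accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Anything still outside ASCII (combining marks included) is dropped.
    return decomposed.encode("ascii", "ignore").decode("ascii")


sms_body = to_ascii(u"4 \u00d7 6 caf\u00e9 \u201cspecial\u201d")
```

Anything not covered by the table or the decomposition is silently dropped,
so this trades fidelity for predictability.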

