[omaha] parsing JavaScript problem

Eli Criffield elicriffield at gmail.com
Tue Apr 8 19:56:45 CEST 2008


This is why I love Python!

Thanks

Eli Criffield

On Tue, Apr 8, 2008 at 12:10 PM, Mike Hostetler <mike at hostetlerhome.com> wrote:
>
>
>  Ha! I was just working on the same thing, but Jeff beat me to it.
>
>  It's for sure JSON.
>
>  >>> import demjson
>  >>> mlb = demjson.decode(file('mlb.js').read())
>  >>> mlb['teams'][0]
>  {u'code': u'phi', u'runs': None, u'isHome': False}
>  >>> mlb['teams'][1]
>  {u'code': u'nym', u'runs': None, u'isHome': True}
>
>
>
>
>  Jeff Hinrichs wrote:
>  > Looks like json to me.
>  > http://deron.meranda.us/python/comparing_json_modules/
>  >
>  > #try demjson (pure python json decoder/encoder)
>  > import demjson
>  > json="""{
>  >     condensed_video: null,
>  >     gameid: '2008/04/08/phimlb-nynmlb-1',
>  >     teams: [{
>  >             isHome: false,
>  >             runs: null,
>  >             code: 'phi'
>  >         }, {
>  >             isHome: true,
>  >             runs: null,
>  >             code: 'nym'
>  >         }],
>  >     status: 'P',
>  >     event_time: ' 1:10 PM',
>  >     home_audio: {
>  >         blackout: 'local',
>  >         text: 'WFAN',
>  >         media_type: 'audio',
>  >         urls: [{
>  >                 blackout: 'local',
>  >                 speed: '12',
>  >                 url: {
>  >                     w: 'http://web.servicebureau.net/conf/meta?i=1112976083&c=1234&m=was&u=/w2.xsl',
>  >                     w_id: '620882',
>  >                     v: '2',
>  >                     login: 'Y',
>  >                     authorization: 'Y',
>  >                     mid: '200803172432841',
>  >                     pid: 'mlb_ga',
>  >                     fid: 'h12',
>  >                     url: 'http://web.servicebureau.net/conf/meta?i=1112976083&c=1234&m=was&u=/w2.xsl',
>  >                     id: '620882',
>  >                     gid: '2008/04/08/phimlb-nynmlb-1'
>  >                 },
>  >                 state: null
>  >             }],
>  >         is_free: false,
>  >         state: 'audio_pregame'
>  >     },
>  >     away_audio: {
>  >         blackout: 'local',
>  >         text: 'WPHT',
>  >         media_type: 'audio',
>  >         urls: [{
>  >                 blackout: 'local',
>  >                 speed: '12',
>  >                 url: {
>  >                     w: 'http://web.servicebureau.net/conf/meta?i=1112958329&c=1234&m=was&u=/w2.xsl',
>  >                     w_id: '620884',
>  >                     v: '2',
>  >                     login: 'Y',
>  >                     authorization: 'Y',
>  >                     mid: '200803172432842',
>  >                     pid: 'mlb_ga',
>  >                     fid: 'a12',
>  >                     url: 'http://web.servicebureau.net/conf/meta?i=1112958329&c=1234&m=was&u=/w2.xsl',
>  >                     id: '620884',
>  >                     gid: '2008/04/08/phimlb-nynmlb-1'
>  >                 },
>  >                 state: null
>  >             }],
>  >         is_free: false,
>  >         state: 'audio_pregame'
>  >     },
>  > }
>  > """
>  > pobj = demjson.decode(json)
>  >
>  > print pobj
>  >
>  > # even prettier
>  > import pprint
>  > pp = pprint.PrettyPrinter()
>  > pp.pprint(pobj)
>  >
>  >
>  >
>  > -j
>  > -----Original Message-----
>  > From: omaha-bounces at python.org [mailto:omaha-bounces at python.org] On Behalf Of Eli Criffield
>  > Sent: Tuesday, April 08, 2008 11:31 AM
>  > To: Omaha Python Users Group
>  > Subject: [omaha] parsing JavaScript problem
>  >
>  > Short background, actual problem below:
>  >
>  > MLB decided they should be the only ones allowed to stream baseball
>  > games online, and that it's worth $14.95 for the privilege of
>  > listening to the audio. They also assume that everyone is using
>  > Windows and have two different player options, Windows Media Player
>  > or Silverlight.
>  >
>  > Using mostly the same code I used for my Sirius online player (sipie),
>  > I can fake being a Windows browser, log in, and get the stream URL
>  > that can be fed to mplayer, and then you can listen on Linux. What
>  > you need is the game ID to know which stream to request.
>  >
>  > This page has a chart of current games and links to their media
>  > player to listen to them. The links have the game ID in them.
>  > http://mlb.mlb.com/mediacenter/index.jsp
>  >
>  > Although it's a static page, all the games are loaded via AJAX after
>  > the HTML is loaded. AJAX would be super easy to parse, except they
>  > forgot the XML part and have dynamically generated JavaScript that
>  > has all the info in it.
>  >
>  > That JavaScript page is here:
>  > http://mlb.mlb.com/components/game/year_2008/month_04/day_08/gamesbydate.jsp
>  >
>  > What I need to do is take data from that page and extract what I want.
>  >
>  > So here's the real problem.
>  > The JavaScript for a game is like this:
>  >
>  > {
>  >     condensed_video: null,
>  >     gameid: '2008/04/08/phimlb-nynmlb-1',
>  >     teams: [{
>  >             isHome: false,
>  >             runs: null,
>  >             code: 'phi'
>  >         }, {
>  >             isHome: true,
>  >             runs: null,
>  >             code: 'nym'
>  >         }],
>  >     status: 'P',
>  >     event_time: ' 1:10 PM',
>  >     home_audio: {
>  >         blackout: 'local',
>  >         text: 'WFAN',
>  >         media_type: 'audio',
>  >         urls: [{
>  >                 blackout: 'local',
>  >                 speed: '12',
>  >                 url: {
>  >                     w: 'http://web.servicebureau.net/conf/meta?i=1112976083&c=1234&m=was&u=/w2.xsl',
>  >                     w_id: '620882',
>  >                     v: '2',
>  >                     login: 'Y',
>  >                     authorization: 'Y',
>  >                     mid: '200803172432841',
>  >                     pid: 'mlb_ga',
>  >                     fid: 'h12',
>  >                     url: 'http://web.servicebureau.net/conf/meta?i=1112976083&c=1234&m=was&u=/w2.xsl',
>  >                     id: '620882',
>  >                     gid: '2008/04/08/phimlb-nynmlb-1'
>  >                 },
>  >                 state: null
>  >             }],
>  >         is_free: false,
>  >         state: 'audio_pregame'
>  >     },
>  >     away_audio: {
>  >         blackout: 'local',
>  >         text: 'WPHT',
>  >         media_type: 'audio',
>  >         urls: [{
>  >                 blackout: 'local',
>  >                 speed: '12',
>  >                 url: {
>  >                     w: 'http://web.servicebureau.net/conf/meta?i=1112958329&c=1234&m=was&u=/w2.xsl',
>  >                     w_id: '620884',
>  >                     v: '2',
>  >                     login: 'Y',
>  >                     authorization: 'Y',
>  >                     mid: '200803172432842',
>  >                     pid: 'mlb_ga',
>  >                     fid: 'a12',
>  >                     url: 'http://web.servicebureau.net/conf/meta?i=1112958329&c=1234&m=was&u=/w2.xsl',
>  >                     id: '620884',
>  >                     gid: '2008/04/08/phimlb-nynmlb-1'
>  >                 },
>  >                 state: null
>  >             }],
>  >         is_free: false,
>  >         state: 'audio_pregame'
>  >     },
>  > }
>  >
>  > I need to extract the team names (they come after code:), who is
>  > playing at home, and the w_id of the home_audio and away_audio.
>  >
>  > Any ideas?
>  >
>  > Eli Criffield
>  >
>  > _______________________________________________
>  > Omaha Python Users Group mailing list
>  > Omaha at python.org
>  > http://mail.python.org/mailman/listinfo/omaha
>  > http://www.OmahaPython.org
>
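
For the three things asked about in the thread (team codes, who is at
home, and the w_id for home_audio and away_audio), here is a minimal
sketch of pulling them out once the text has been decoded. It assumes
the decoder returns plain nested dicts and lists, as in the session
quoted above. The sample dict is a hand-trimmed copy of the quoted game,
and the names first_w_id and summarize_game are made up for
illustration; they are not part of demjson or any other library.

```python
# Hypothetical sample: the structure a lenient decoder is assumed to
# return for one game, trimmed to the fields asked about in the thread.
game = {
    "gameid": "2008/04/08/phimlb-nynmlb-1",
    "teams": [
        {"isHome": False, "runs": None, "code": "phi"},
        {"isHome": True, "runs": None, "code": "nym"},
    ],
    "home_audio": {"text": "WFAN", "urls": [{"url": {"w_id": "620882"}}]},
    "away_audio": {"text": "WPHT", "urls": [{"url": {"w_id": "620884"}}]},
}


def first_w_id(audio):
    """Return the w_id of the first stream in an audio entry, or None."""
    urls = (audio or {}).get("urls") or []
    if not urls:
        return None
    return (urls[0].get("url") or {}).get("w_id")


def summarize_game(game):
    """Collect the home and away team codes and both audio w_ids."""
    teams = game.get("teams", [])
    home = [t["code"] for t in teams if t.get("isHome")]
    away = [t["code"] for t in teams if not t.get("isHome")]
    return {
        "home": home[0] if home else None,
        "away": away[0] if away else None,
        "home_w_id": first_w_id(game.get("home_audio")),
        "away_w_id": first_w_id(game.get("away_audio")),
    }


summary = summarize_game(game)
print(summary)
```

The helper returns None instead of raising when an audio entry or its
urls list is missing, on the guess that not every game carries both
feeds; that is an assumption, not something the thread confirms.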

