Joining Strings

Jussi Piitulainen jussi.piitulainen at helsinki.fi
Thu Apr 7 01:01:20 EDT 2016


Emeka writes:

> Hello All,
>
> import urllib.request
> import re
>
> url = 'https://www.everyday.com/'
>
>
>
> req = urllib.request.Request(url)
> resp = urllib.request.urlopen(req)
> respData = resp.read()
>
>
> paragraphs = re.findall(r'\[(.*?)\]',str(respData))
> for eachP in paragraphs:
>     print("".join(eachP.split(',')[1:-2]))
>     print("\n")
>
>
>
> I got the below:
> "Coke -  Yala Market Branch""NO. 113 IKU BAKR WAY YALA"""
> But what I need is
>
> 'Coke -  Yala Market Branch NO. 113 IKU BAKR WAY YALA'
>
> How do I achieve the above?

A couple of things you could do to understand your problem and work
around it: Change your code to print(eachP). Change your "".join to
"!".join to see where the commas were. Experiment with data of that form
in the REPL. Sometimes it's good to print repr(datum) instead of datum,
though not in this case.
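
A minimal sketch of those experiments, runnable in the REPL. The value of
each_p is a made-up sample, chosen only so that the original "".join
reproduces the output quoted above; the real response may look different.

```python
# Hypothetical sample of one bracketed match; the real response is not shown.
each_p = '7,"Coke -  Yala Market Branch","NO. 113 IKU BAKR WAY YALA","",0,1'

# Same split and slice as in the original code.
fields = each_p.split(',')[1:-2]

original = "".join(fields)   # what the original code printed
marked = "!".join(fields)    # "!" now marks where the commas were

print(each_p)
print(original)
print(marked)
```

The "!" version makes it visible that the double quotes belong to the data
and that the last field is an empty quoted string.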

But are you trying to extract and parse paragraphs from a JSON response?
Do not use regex for that at all. Use json.load or json.loads to parse
it properly, and access the relevant data by indexing:

import json

x = json.loads('{"foo":[["Weather Forecast","It\'s Rain"],[]]}')

x ==> {'foo': [['Weather Forecast', "It's Rain"], []]}

x['foo'] ==> [['Weather Forecast', "It's Rain"], []]

x['foo'][0] ==> ['Weather Forecast', "It's Rain"]
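
Applied to the question, it might go something like the sketch below. The
payload and the "branches" key are hypothetical, since the actual response
is not shown; only the two strings and the [1:-2] slice come from the
original message.

```python
import json

# Hypothetical payload; the real structure and key names are assumptions.
text = ('{"branches": [[7, "Coke -  Yala Market Branch", '
        '"NO. 113 IKU BAKR WAY YALA", "", 0, 1]]}')

data = json.loads(text)

lines = []
for row in data["branches"]:
    # Same slice as the original code; keep only the non-empty strings.
    parts = [field for field in row[1:-2] if isinstance(field, str) and field]
    # Join with a space so the fields do not run together.
    lines.append(" ".join(parts))

for line in lines:
    print(line)
```

After parsing, the strings no longer carry their surrounding double quotes,
so joining with " " gives the wanted line directly. With a live response,
json.loads(respData.decode('utf-8')) would take the place of the literal,
assuming the body really is UTF-8 encoded JSON.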


