regexp problem in Python
Sönmez Kartal
rainwatching at gmail.com
Sat Aug 4 13:51:36 EDT 2007
On 4 August, 17:10, Ehsan <ehsan.khod... at gmail.com> wrote:
> On Aug 4, 1:22 pm, Sönmez Kartal <rainwatch... at gmail.com> wrote:
>
> > On 4 August, 00:41, Ehsan <ehsan.khod... at gmail.com> wrote:
>
> > > I want to find
> > > "http://www.2shared.com/download/1716611/e2000f22/Jadeed_Mlak14.wmv?tsid=20070803-164051-9d637d11"
> > > (or 3gp instead of wmv) in a text file like this:
> > > <html>
> > > ""some code""
> > > function reportAbuse() {
> > > var windowname="abuse";
> > > var url="/abuse.jsp?link=" + "http://www.2shared.com/file/1716611/e2000f22/Jadeed_Mlak14.html";
> > > OpenWindow =
> > > window.open(url,windowname,'toolbar=no,scrollbars=no,resizable=no,width=500,height=500,left=50,top=50');
> > > OpenWindow.focus();
> > > }
> > > function startDownload(){
> > > window.location = "http://www.2shared.com/download/1716611/e2000f22/Jadeed_Mlak14.wmv?tsid=20070803-164051-9d637d11";
> > > //document.downloadForm.submit();
> > > }
> > > </script>
> > > </head>
> > > </html>http://www.2shared.com/download/1716611/e2000f22/Jadeed_Mlak14.3gp?tsid=20070803-164051-9d637d11"sfgsfgsfgv
>
> > > I use this pattern :
> > > "http.*?\.(wmv|3gp).*""
>
> > > but it returns only 'wmv' and '3gp' instead of
> > > "http://www.2shared.com/download/1716611/e2000f22/Jadeed_Mlak14.wmv?tsid=20070803-164051-9d637d11"
>
> > > What can I do? What's wrong with this pattern? Thanks for your comments.
>
> > You could use r'window.location = "(.*?\.(?:wmv|3gp).*?)";' as your regex
> > string, I guess..
>
> I didn't get what you mean. I think I just need to change the pattern,
> but I don't know how to find the best-fit pattern.
If you put the literal text 'window.location = "' in front of your pattern
and '";' after it, the download link becomes much easier to pick out:
r'window.location = "(.*?)";'
I used this, and it gave me:
>>> import re
>>> data = """ <html>
... ""some code""
... function reportAbuse() {
... var windowname="abuse";
... var url="/abuse.jsp?link=" + "http://www.2shared.com/file/1716611/e2000f22/Jadeed_Mlak14.html";
... OpenWindow =
... window.open(url,windowname,'toolbar=no,scrollbars=no,resizable=no,width=500,height=500,left=50,top=50');
... OpenWindow.focus();
... }
... function startDownload(){
... window.location = "http://www.2shared.com/download/1716611/e2000f22/Jadeed_Mlak14.wmv?tsid=20070803-164051-9d637d11";
... //document.downloadForm.submit();
... }
... </script>
... </head>
... </html>"""
>>> re.findall(r'window.location = "(.*?)";', data)
['http://www.2shared.com/download/1716611/e2000f22/Jadeed_Mlak14.wmv?tsid=20070803-164051-9d637d11']
>>> print('It works! :-)')
It works! :-)
>>>
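As for why your original pattern returned only 'wmv' and '3gp': when a
pattern contains a capturing group, re.findall returns what the group
matched rather than the whole match. Below is a rough sketch of one way
around that, using a non-capturing group (?:...). The `text` variable is a
shortened stand-in I made up from your sample, and the [^"\s]* parts are my
own assumption about where a link should end (at a quote or whitespace), so
adjust them to your real file.

```python
import re

# Shortened stand-in for the page text quoted in the question (illustrative only).
text = (
    'window.location = "http://www.2shared.com/download/1716611/e2000f22/'
    'Jadeed_Mlak14.wmv?tsid=20070803-164051-9d637d11";\n'
    '</html>http://www.2shared.com/download/1716611/e2000f22/'
    'Jadeed_Mlak14.3gp?tsid=20070803-164051-9d637d11"sfgsfgsfgv\n'
)

# With a capturing group, findall returns only what the group matched.
capturing = re.findall(r'http.*?\.(wmv|3gp)', text)

# (?:...) groups without capturing, so findall returns the whole match.
# [^"\s]* stops at a double quote or whitespace instead of running on.
whole = re.findall(r'http[^"\s]*?\.(?:wmv|3gp)[^"\s]*', text)

print(capturing)
for url in whole:
    print(url)
```

Unlike the window.location version, this one also picks up the bare 3gp
link at the end of your sample, since it does not rely on the surrounding
JavaScript.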
Happy coding