urllib

Tom Longson joyraah at yahoo.com
Mon Aug 20 20:29:23 EDT 2001


I was using urllib to log in to a standard password-protected web
site and parse some data from it, using the logic below:

#SNIP#
#! /usr/bin/python

import urllib

aruser = '10004'
arpassword = 'password'
jobnum = "100600"
servername, filename = 'host.com', '/scripts/sweb.dll/search?detail='

remoteaddr = 'http://%s:%s@%s%s%s' % (aruser, arpassword, servername,
filename, jobnum)

remotefile = urllib.urlopen(remoteaddr)
print remotefile
remotedata = remotefile.readlines()
print remotedata
remotefile.close()

#SNIP#
Up until recently this worked, but now I don't get any data returned,
only
<addinfourl at ... whose fp = <open file '<socket>', mode 'rb' at 8124158>>
for remotefile and [] for remotedata.

Someone on #python suggested that I use urllib2, but I can't upgrade
to Python 2.1 on RedHat 6.2 because of its dependency on db1-devel,
which I have not been able to find.
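
For reference, here is a minimal sketch of what the urllib2-style
approach looks like. It assumes a current Python 3 interpreter, where
urllib2 lives on as urllib.request, so it is not something that runs
on the stock RedHat 6.2 Python. The helper name build_auth_opener is
hypothetical, and the host, user and password are the placeholder
values from the script above. No request is sent; the sketch only
wires up the credentials.

```python
import urllib.request


def build_auth_opener(base_url, user, password):
    """Return (opener, manager) with Basic auth credentials registered.

    Hypothetical helper: realm=None tells the default-realm password
    manager to offer these credentials for any realm under base_url.
    """
    manager = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    manager.add_password(None, base_url, user, password)
    handler = urllib.request.HTTPBasicAuthHandler(manager)
    return urllib.request.build_opener(handler), manager


# Placeholder values taken from the script above.
opener, manager = build_auth_opener("http://host.com/", "10004", "password")

# The actual fetch would then be (not executed here):
#   data = opener.open(
#       "http://host.com/scripts/sweb.dll/search?detail=100600").read()
```

With this arrangement the credentials stay out of the URL, and the
handler answers the server's 401 challenge instead of the caller
building the Authorization header by hand.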

Also, here is a bit that I pulled out of an strace while running the
program:

#SNIP#
connect(5, {sin_family=AF_INET, sin_port=htons(80),
sin_addr=inet_addr("38.169.22.3")}}, 16) = 0
send(5, "GET /scripts/sweb.dll/search?detail=100600 HTTP/1.0\r\n", 53,
0) = 53
send(5, "Authorization: Basic MTAwMDQ6YnVybjE3\r\n", 39, 0) = 39
send(5, "Host: usentry.metameta.com\r\n", 29, 0) = 29
send(5, "User-agent: Python-urllib/1.10\r\n", 32, 0) = 32
send(5, "\r\n", 2, 0)                   = 2
dup(5)                                  = 6
fcntl(6, F_GETFL)                       = 0x2 (flags O_RDWR)
fstat64(0x6, 0xbffff300)                = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS,
-1, 0) = 0x40157000
_llseek(6, 0, 0xbffff340, SEEK_CUR)     = -1 ESPIPE (Illegal seek)
read(6, "HTTP/1.1 401 Unauthorized\r\nServer:
Microsoft-IIS/5.0\r\nDate: Mon, 20 Aug 2001 17:04:38 GMT\r\n
Allow: GET,POST,HEAD\r\nSet-Cookie:
21618618A732FFE=2603C3A0*27094EA2*2708C938*270844BC; \r\nTitle: : 2161
8618A732FFE=2603C3A0*27094EA2*2708C938*270844BC; \r\nWWW-Authenticate:
Basic realm=Metamedia Technologies
\r\nContent-Version: text/html\r\nContent-Type:
text/html\r\nContent:\r\n\r\n", 1024) = 364
brk(0x810e000)                          = 0x810e000
close(5)                                = 0
write(1, "<addinfourl at 135319872 whose fp = <open file \'<socket>\',
mode \'rb\' at 810c918>>\n", 82) =
82
read(6, "", 8192)                       = 0
write(1, "[]\n", 3)                     = 3
close(6)                                = 0
munmap(0x40157000, 4096)                = 0
close(4)                                = 0
munmap(0x40015000, 4096)                = 0
rt_sigaction(SIGINT, {SIG_DFL}, {0x8061504, [], SA_RESTART|0x4000000},
8) = 0
munmap(0x40016000, 4096)                = 0
_exit(0)                                = ?
#SNIP#
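
The trace can be checked mechanically for two things: which
credentials the Authorization header actually carried, and which
status the server sent back. The following is a minimal sketch,
assuming a current Python 3 interpreter; decode_basic and status_code
are hypothetical helper names, and the input strings are copied from
the trace above (the response is shortened to its first two lines).

```python
import base64


def decode_basic(header_line):
    """Return (user, password) from an 'Authorization: Basic ...' line."""
    name, _, value = header_line.partition(":")
    scheme, _, token = value.strip().partition(" ")
    if name.strip().lower() != "authorization" or scheme.lower() != "basic":
        raise ValueError("not a Basic Authorization header")
    decoded = base64.b64decode(token).decode("latin-1")
    user, _, password = decoded.partition(":")
    return user, password


def status_code(raw_response):
    """Return the numeric status from the first line of a raw response."""
    status_line = raw_response.split("\r\n", 1)[0]
    return int(status_line.split(" ", 2)[1])


# Header and status line as they appear in the strace output.
user, password = decode_basic("Authorization: Basic MTAwMDQ6YnVybjE3")
code = status_code("HTTP/1.1 401 Unauthorized\r\nServer: Microsoft-IIS/5.0\r\n\r\n")
print(user, code)
```

If the decoded user and password are the ones the script was meant to
send, the 401 would suggest that the server is rejecting them rather
than urllib dropping them, and the empty list is simply the empty body
of that error response.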

Any help would be extremely useful, as I'm out on a limb right now
with this, and I don't want to have to use wget instead of urllib!

Thanks!
-Tom Longson


