Intermittent bug with asyncio and MS Edge

Chris Angelico rosuav at gmail.com
Sun Mar 22 06:11:27 EDT 2020


On Sun, Mar 22, 2020 at 8:30 PM Frank Millman <frank at chagford.com> wrote:
>
> On 2020-03-22 10:45 AM, Chris Angelico wrote:
> > On Sun, Mar 22, 2020 at 6:58 PM Frank Millman <frank at chagford.com> wrote:
> >>> I'd look at the network traffic with wireshark to see if there is anything different between edge and the other browsers.
> >>>
> >>
> >> You are leading me into deep waters here :-)  I have never used
> >> Wireshark before. I have now downloaded it and am running it - it
> >> generates a *lot* of data, most of which I do not understand yet!
> >>
> >> One thing immediately stands out. When I run it with MS Edge and
> >> Python3.8, it shows a lot of lines highlighted in red, with the symbols
> >> [RST,ACK]. They do not appear when running Chrome, and they do not
> >> appear when running Python3.7.
> >
> > Interesting. RST means "Reset" and is sent when a connection is torn
> > down abruptly. Which direction were these packets sent (Edge to Python or
> > Python to Edge)? You can tell by the source and destination ports -
> > one of them is going to be the port Python is listening on (eg 80 or
> > 443), so if the destination port is 80, it's being sent *to* Python,
> > and if the source port is 80, it's being sent *from* Python.
> >
>
> They are all being sent *from* Python *to* Edge.

Very interesting indeed. What that *might* mean is that Python is
misinterpreting something and then believing that the connection has
been closed, so it responds "Okay, I'm closing the connection". Or
possibly it sees some sort of error condition.

> >> I have another data point. I tried putting an asyncio.sleep() after
> >> sending each file. A value of 0.01 made no difference, but a value of
> >> 0.1 makes the problem go away.
> >
> > Interesting also.
> >
> > Can you recreate the problem without Edge? It sounds like something's
> > going on with concurrent transfers, so it'd be awesome if you can
> > replace Edge with another Python program, and then post both programs.
> >
>
> Do you mean write a program that emulates a browser - make a connection,
> receive the HTML page, send a GET request for each file, and receive the
> results?
>
> I will give it a go!

Yes - although the HTML page is most likely irrelevant. You could just
make a connection and then spam requests. Or make multiple
connections.

Actually, that's another thing worth checking. Is Edge using a single
connection for all requests, or separate connections for each request,
or something in between (eg a pool of four connections and spreading
requests between them)? You'll be able to recognize different
connections by the port numbers Edge uses, which are guaranteed unique
among concurrent connections, and are unlikely to be reused for a
while even after a connection is closed.
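
If Wireshark is too noisy, the server can report the same thing. This
is only a rough sketch, not your actual server: the handler, the
Counter, the canned "ok" response and port 8080 are all made up for
illustration, and it only copes with requests that have no body.

```python
import asyncio
from collections import Counter

requests_per_port = Counter()  # client port -> number of requests seen

async def handle(reader, writer):
    # For TCP, peername starts with (host, port). The client port tells
    # us which of the browser's connections this handler is serving.
    port = writer.get_extra_info("peername")[1]
    while True:
        try:
            head = await reader.readuntil(b"\r\n\r\n")
        except (asyncio.IncompleteReadError, ConnectionResetError):
            break  # the client closed (or reset) the connection
        requests_per_port[port] += 1
        request_line = head.split(b"\r\n", 1)[0].decode("latin-1")
        print(f"port {port}: request #{requests_per_port[port]}: {request_line}")
        writer.write(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n"
                     b"Connection: keep-alive\r\n\r\nok")
        await writer.drain()
    writer.close()

async def main(host="127.0.0.1", port=8080):
    server = await asyncio.start_server(handle, host, port)
    async with server:
        await server.serve_forever()

# asyncio.run(main())  # then point the browser at http://127.0.0.1:8080/
```

One port with many requests means a single reused connection; many
ports with one request each means a fresh connection per request.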

> > Also of interest: Does the problem go away if you change "Connection:
> > Keep-Alive" to "Connection: Close" in your headers?
> >
>
> Yes, the problem does go away.
>

This makes me think that the answer to the previous question is going
to involve some connection reuse.

If you can recreate the problem with a single socket and multiple
requests, that would be extremely helpful. I also think it's highly
likely that this is the case.
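
Something along these lines might do as a starting point. It is a
minimal sketch under a few assumptions: spam_requests is a made-up
name, the requests go out one after another rather than pipelined, and
the server is assumed to send a Content-Length header on every
response (no chunked encoding).

```python
import asyncio

async def spam_requests(host, port, paths):
    """Send one GET per path over a single keep-alive connection.

    Returns the body size of each response, in order.
    """
    reader, writer = await asyncio.open_connection(host, port)
    sizes = []
    try:
        for path in paths:
            request = (
                f"GET {path} HTTP/1.1\r\n"
                f"Host: {host}\r\n"
                "Connection: keep-alive\r\n\r\n"
            )
            writer.write(request.encode("ascii"))
            await writer.drain()
            # Read the status line and headers, then exactly the body.
            head = await reader.readuntil(b"\r\n\r\n")
            length = 0
            for line in head.split(b"\r\n")[1:]:
                name, _, value = line.partition(b":")
                if name.strip().lower() == b"content-length":
                    length = int(value.strip())
            body = await reader.readexactly(length)
            sizes.append(len(body))
    finally:
        writer.close()
        await writer.wait_closed()
    return sizes

# Example (hypothetical host, port and paths):
# asyncio.run(spam_requests("127.0.0.1", 8080, ["/a.js", "/b.js", "/c.css"]))
```

If the server resets the connection partway through, that should show
up here as an IncompleteReadError or ConnectionResetError instead of a
full list of sizes.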

My theory: Your program is sending a large amount of data to the lower
level API functions, which attempt to send a large amount of data to
the socket. At some point, something gets told "sorry, can't handle
all that data, hold some of it back" at a time when it's not prepared
to do so, and it misinterprets it as an error. This error results in
the connection being closed.
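
For reference, this is roughly how asyncio's streams API expects a
sender to cooperate with that "hold some of it back" signal. I have
not seen your sending code, so treat it as an illustration of the
mechanism only: send_body and the 64KB chunk size are invented here.

```python
import asyncio

CHUNK = 64 * 1024  # arbitrary chunk size, chosen for illustration

async def send_body(writer, data):
    """Write data in chunks, pausing whenever the write buffer is full."""
    for start in range(0, len(data), CHUNK):
        # write() never blocks; it only appends to the transport's buffer.
        writer.write(data[start:start + CHUNK])
        # drain() returns at once if there is nothing to wait for;
        # otherwise it waits until the buffer has emptied enough for
        # writing to resume.
        await writer.drain()
```

Without the drain() call, everything passed to write() simply piles up
in the buffer, however slowly the other end is reading.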

But that's just a theory.

ChrisA

