Discussion:
[issue25586] socket.sendall broken when a socket has a timeout
Jakub Stasiak
2015-11-08 21:47:35 UTC
Permalink
New submission from Jakub Stasiak:

It is my understanding that socket.sendall effectively calls the underlying socket.send's implementation in a retry loop, possibly multiple times.

It is also my understanding that each one of those low level send calls can timeout on its own if a socket timeout is set.

Considering the above I believe it's undesired to ever call socket.sendall with a socket that has a timeout set because if:

1. At least one send call succeeds
2. A send call after that times out

then a socket timeout error will be raised and the information about the sent data will be lost.

Granted, the documentation says that "On error, an exception is raised, and there is no way to determine how much data, if any, was successfully sent". I believe, however, that such API is very easy to misuse (I've seen it used with sockets with timeout set, because of small payload sizes and other circumstances it would appear to work fine most of the time and only fail every N hours or days).

Possible improvements I see:

1. Explicitly mention interaction between socket's timeout and sendall's behavior in the documentation
2. Make sendall raise an error when the socket it's called with has a timeout set (I can see the backwards compatibility being a concern here but I can't think of any reason to remain compatible with behavior I believe to be broken anyway)
3. Both of the above

I'm happy to procure an appropriate patch for any of the above. Please correct me if my understanding of this is off.

----------
assignee: ***@python
components: Documentation, IO, Library (Lib)
messages: 254357
nosy: ***@python, jstasiak
priority: normal
severity: normal
status: open
title: socket.sendall broken when a socket has a timeout
type: behavior
versions: Python 2.7, Python 3.2, Python 3.3, Python 3.4, Python 3.5, Python 3.6

_______________________________________
Python tracker <***@bugs.python.org>
<http://bugs.python.org/issue25586>
_______________________________________
STINNER Victor
2015-11-08 22:06:37 UTC
Permalink
STINNER Victor added the comment:

FYI the behaviour of socket.socket.sendall() regarding timeout has been modified in Python 3.5:
https://docs.python.org/dev/library/socket.html#socket.socket.sendall

"Changed in version 3.5: The socket timeout is no more reset each time data is sent successfuly. The socket timeout is now the maximum total duration to send all data."

----------
nosy: +haypo

_______________________________________
Python tracker <***@bugs.python.org>
<http://bugs.python.org/issue25586>
_______________________________________
Martin Panter
2015-11-08 22:33:20 UTC
Permalink
Martin Panter added the comment:

I don’t like the sound of improvement 2. I think it would break existing use cases, e.g. HTTPConnection(timeout=...), urlopen(timeout=...). If you want a HTTP request to abort if it takes a long time, how is that behaviour broken?

Regarding improvement 1, I think Victor’s 3.5 documentation explains how the timeout works fairly well. Perhaps you meant for this bug to be about pre-3.5 versions?

Also, see Issue 7322 for discussion of how to handle exceptions when reading from sockets and other streams. E.g. what should readline() do when it has read half a line and then gets an exception, and should you be allowed to read again after handling an exception? In my mind the readline() and sendall() cases are somewhat complementary.

----------
nosy: +martin.panter

_______________________________________
Python tracker <***@bugs.python.org>
<http://bugs.python.org/issue25586>
_______________________________________
Jakub Stasiak
2015-11-09 01:19:16 UTC
Permalink
Jakub Stasiak added the comment:

Martin: While I'd consider timeout in HTTPConnection(timeout=...) or urlopen(timeout=...) to be the timeout for the entire operation, just just for the data sending part and HTTPConnection/urlopen can achieve the timeout behavior using just send I concede there may be valid use cases for "either sendall succeeds or we don't care about what we've sent anyway" and in this light my second suggestion is problematic.

Victor: The behavior change in 3.5 does't affect my concern from what I see. The concern is sendall timing out after some data has already been sent which can create some subtle issues. I've seen code like this:

def x(data, sock):
while True:
# some code here
try:
sock.sendall(data)
return
except timeout:
pass


Now I'll agree the code is at fault for ever attempting to retry sendall but I also think the API is easy to misuse like this. And it many cases it'll just work most of the time because sendall won't timeout.

Maybe explicitly mentioning sendall's behavior concerning sockets with timeouts could improve this? I'm honestly not sure anymore, technically "On error, an exception is raised, and there is no way to determine how much data, if any, was successfully sent." should be enough.

----------

_______________________________________
Python tracker <***@bugs.python.org>
<http://bugs.python.org/issue25586>
_______________________________________
Martin Panter
2015-11-09 01:45:25 UTC
Permalink
Martin Panter added the comment:

Maybe it would be reasonable to expand on that “On error” sentence. I imagine the main errors that would cause this situation are a timeout as you say, and also a blocking exception. Also, you could point out that if you have lost record of how much has already been sent, it may not make sense to send any more data with the same socket.

----------
versions: -Python 3.2, Python 3.3

_______________________________________
Python tracker <***@bugs.python.org>
<http://bugs.python.org/issue25586>
_______________________________________
Jakub Stasiak
2015-11-09 09:17:11 UTC
Permalink
Jakub Stasiak added the comment:

This is exactly what I'm thinking. Do you think it's sensible to move that sentence + some additional information (following your suggestion) into a "Warning" block?

----------

_______________________________________
Python tracker <***@bugs.python.org>
<http://bugs.python.org/issue25586>
_______________________________________
Martin Panter
2015-11-09 21:16:44 UTC
Permalink
Martin Panter added the comment:

Personally I would avoid big red warning boxes in 90% of the cases people suggest them, including this one. But maybe we can see what others think.

----------

_______________________________________
Python tracker <***@bugs.python.org>
<http://bugs.python.org/issue25586>
_______________________________________
STINNER Victor
2015-11-09 21:24:08 UTC
Permalink
STINNER Victor added the comment:

I don't really understand why do even you consider the behaviour has a bug or a trap. On timeout, the most common behaviour is to give up on the whole client. I don't know code trying to resend the same data in case of timeout.

Describing the behaviour on the documentation helps, but no warning is need.

When you *read* data, it's different. It can be interesting to get the partial read.

--

For example, the asyncio.StreamReader.readexactly() coroutine raises an asyncio.IncompleteReadError exception which contains the partial data:

https://docs.python.org/dev/library/asyncio-stream.html#asyncio.StreamReader.readexactly

The HTTP client has a similar exception.

I proposed to add similar method to socket.socket which also raises a similar exception to return partial data: issue #1103213.

----------

_______________________________________
Python tracker <***@bugs.python.org>
<http://bugs.python.org/issue25586>
_______________________________________
Jakub Stasiak
2015-11-10 09:25:53 UTC
Permalink
Jakub Stasiak added the comment:

That's fair and thanks for the links.

Please find a quick patch attached, feel free to use that or any modification of it. While I believe the documentation is technically correct right now it won't hurt to clarify this I think.

----------
keywords: +patch
Added file: http://bugs.python.org/file40994/socket-docs.patch

_______________________________________
Python tracker <***@bugs.python.org>
<http://bugs.python.org/issue25586>
_______________________________________
Martin Panter
2015-11-11 02:26:43 UTC
Permalink
Martin Panter added the comment:

That was kind of what I had in mind. The only change I would make is to restore the comma to “On error (including socket timeout), an exception . . .”. I’ll try to commit this in a few days if nobody has anything else to say.

----------
stage: -> patch review

_______________________________________
Python tracker <***@bugs.python.org>
<http://bugs.python.org/issue25586>
_______________________________________

Loading...