-
Notifications
You must be signed in to change notification settings - Fork 308
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: distinguish server timeouts from transport timeouts #43
Conversation
A transport layer timeout is made independent of the query timeout, i.e. the maximum time to wait for the query to complete. The query timeout is used by the blocking poll so that the backend does not block for too long when polling for job completion, but the transport can have different timeout requirements, and we do not want it to be raising sometimes unnecessary timeout errors.
As job methods do not split the timeout anymore between all requests a method might make, the Client methods are adjusted in the same way.
@tswast Ping. (BTW, should I still be requesting BigQuery reviews from you by default? Or should I pick, say, @shollyman instead?) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with one question.
Yes, please direct future PRs to @shollyman in the future. He can pull me in when he needs more context.
@plamut Hi, I like a new version will be released. |
@sonots I will have to check, but since quite a few fixes and additions have been made since the last release, I think there is a decent chance a new version will be released "soon" (say, by the end of the month). It's just my personal opinion, however, not an official answer, but I will try to make it happen sooner rather than later. |
@sonots FYI, a new version of the library has been released, you can now try it out. |
Thanks. Could you provide an example when I want to insert a CSV file into bigquery allowing longer timeout limitation? Thanks |
Fixes #40.
This PR fixes the problem with timeouts sometimes occurring too early by making a distinction between the server timeout (the
timeoutMs
API parameter) and the timeout for the transport layer. These two are now independent from each other.How to reproduce
Seems like the ticket description provides a reasonable way to do it (repeating a non-trivial query a lot of times). I had quite some trouble reproducing it consistently on my network, but manually disabling one's internet connection can also be used.
See the rest of the notes for remarks/discussion.
Methods might block longer than
timeout
In the 1.24.0 release, a timeout parameter was added to public methods to prevent HTTP requests from hanging indefinitely. However, if the timeout is not provided (e.g. when polling for job completion), trying to estimate it from the server-side "is job done" timeout can lead to random timeout errors due to random network delays, etc.
Since
timeout
is now directly passed as the timeout to the underlyingrequests
lib, it means that the actual duration of a function call can be considerably longer than a wall clock timeout would be.As this negates the wall clock timeout approximation, the logic dealing with that has also been removed in the methods that might send multiple requests. The transport timeout now applies to each individual request.
Timeout errors are not retried
If a transport timeout error is raised, it is currently not retried by the default
retry
object. We might actually not want to change that, as timeout errors are generally used by the core futures to signal that retrying a request took too long.In any case, the default behavior is now again the same as in pre-1.24.0 versions, meaning that the user code that worked with, say, version 1.19.0 should behave the same.
PR checklist