mirror of
https://github.com/golang/go.git
synced 2025-11-02 17:50:56 +00:00
net/http: scale rstAvoidanceDelay to reduce test flakiness
As far as I can tell, some flakiness is unavoidable in tests that race a large client request write against a server's response when the server doesn't read the full request. It does not appear to be possible to simultaneously ensure that well-behaved clients see EOF instead of ECONNRESET and also prevent misbehaving clients from consuming arbitrary server resources. (See RFC 7230 §6.6 for more detail.) Since there doesn't appear to be a way to cleanly eliminate this source of flakiness, we can instead work around it: we can allow the test to adjust the hard-coded delay if it sees a plausibly-related failure, so that the test can retry with a longer delay. As a nice side benefit, this also allows the tests to run more quickly in the typical case: since the test will retry in case of spurious failures, we can start with an aggressively short delay, and only back off to a longer one if it is really needed on the specific machine running the test. Fixes #57084. Fixes #51104. For #58398. Change-Id: Ia4050679f0777e5eeba7670307a77d93cfce856f Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-longtest-race,gotip-linux-amd64-race,gotip-windows-amd64-race Reviewed-on: https://go-review.googlesource.com/c/go/+/527196 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Damien Neil <dneil@google.com> Auto-Submit: Bryan Mills <bcmills@google.com>
This commit is contained in:
parent
545e4f38e0
commit
6760f20ef5
4 changed files with 419 additions and 289 deletions
|
|
@ -1750,8 +1750,12 @@ func (c *conn) close() {
|
|||
// and processes its final data before they process the subsequent RST
|
||||
// from closing a connection with known unread data.
|
||||
// This RST seems to occur mostly on BSD systems. (And Windows?)
|
||||
// This timeout is somewhat arbitrary (~latency around the planet).
|
||||
const rstAvoidanceDelay = 500 * time.Millisecond
|
||||
// This timeout is somewhat arbitrary (~latency around the planet),
|
||||
// and may be modified by tests.
|
||||
//
|
||||
// TODO(bcmills): This should arguably be a server configuration parameter,
|
||||
// not a hard-coded value.
|
||||
var rstAvoidanceDelay = 500 * time.Millisecond
|
||||
|
||||
type closeWriter interface {
|
||||
CloseWrite() error
|
||||
|
|
@ -1770,6 +1774,27 @@ func (c *conn) closeWriteAndWait() {
|
|||
if tcp, ok := c.rwc.(closeWriter); ok {
|
||||
tcp.CloseWrite()
|
||||
}
|
||||
|
||||
// When we return from closeWriteAndWait, the caller will fully close the
|
||||
// connection. If client is still writing to the connection, this will cause
|
||||
// the write to fail with ECONNRESET or similar. Unfortunately, many TCP
|
||||
// implementations will also drop unread packets from the client's read buffer
|
||||
// when a write fails, causing our final response to be truncated away too.
|
||||
//
|
||||
// As a result, https://www.rfc-editor.org/rfc/rfc7230#section-6.6 recommends
|
||||
// that “[t]he server … continues to read from the connection until it
|
||||
// receives a corresponding close by the client, or until the server is
|
||||
// reasonably certain that its own TCP stack has received the client's
|
||||
// acknowledgement of the packet(s) containing the server's last response.”
|
||||
//
|
||||
// Unfortunately, we have no straightforward way to be “reasonably certain”
|
||||
// that we have received the client's ACK, and at any rate we don't want to
|
||||
// allow a misbehaving client to soak up server connections indefinitely by
|
||||
// withholding an ACK, nor do we want to go through the complexity or overhead
|
||||
// of using low-level APIs to figure out when a TCP round-trip has completed.
|
||||
//
|
||||
// Instead, we declare that we are “reasonably certain” that we received the
|
||||
// ACK if maxRSTAvoidanceDelay has elapsed.
|
||||
time.Sleep(rstAvoidanceDelay)
|
||||
}
|
||||
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue