net: add support for splice(2) in (*TCPConn).ReadFrom on Linux
This change adds support for the splice system call on Linux,
for the purpose of optimizing (*TCPConn).ReadFrom by reducing
copies of data from and to userspace. It does so by creating a
temporary pipe and splicing data from the source connection to the
pipe, then from the pipe to the destination connection. The pipe
serves as an in-kernel buffer for the data transfer.
No new API is added to package net, but a new Splice function is
added to package internal/poll, because using splice requires help
from the network poller. Users of the net package should benefit
from the change transparently.
This change only enables the optimization if the Reader in ReadFrom
is a TCP connection. Since splice is a more general interface, it
could, in theory, also be enabled if the Reader were a unix socket,
or the read half of a pipe.
However, benchmarks show that enabling it for unix sockets is most
likely not a net performance gain. The tcp <- unix case is also
fairly unlikely to be used very much by users of package net.
Enabling the optimization for pipes is also problematic from an
implementation perspective, since package net cannot easily get at
the *poll.FD of an *os.File. A possible solution to this would be
to dup the pipe file descriptor, register the duped descriptor with
the network poller, and work on that *poll.FD instead of the original.
However, this seems too intrusive, so it has not been done. If there
was a clean way to do it, it would probably be worth doing, since
splicing from a pipe to a socket can be done directly.
Therefore, this patch only enables the optimization for what is likely
the most common use case: tcp <- tcp.
The following benchmark compares the performance of the previous
userspace genericReadFrom code path to the new optimized code path.
The sub-benchmarks represent chunk sizes used by the writer on the
other end of the Reader passed to ReadFrom.
benchmark old ns/op new ns/op delta
BenchmarkTCPReadFrom/1024-4 4727 4954 +4.80%
BenchmarkTCPReadFrom/2048-4 4389 4301 -2.01%
BenchmarkTCPReadFrom/4096-4 4606 4534 -1.56%
BenchmarkTCPReadFrom/8192-4 5219 4779 -8.43%
BenchmarkTCPReadFrom/16384-4 8708 8008 -8.04%
BenchmarkTCPReadFrom/32768-4 16349 14973 -8.42%
BenchmarkTCPReadFrom/65536-4 35246 27406 -22.24%
BenchmarkTCPReadFrom/131072-4 72920 52382 -28.17%
BenchmarkTCPReadFrom/262144-4 149311 95094 -36.31%
BenchmarkTCPReadFrom/524288-4 306704 181856 -40.71%
BenchmarkTCPReadFrom/1048576-4 674174 357406 -46.99%
benchmark old MB/s new MB/s speedup
BenchmarkTCPReadFrom/1024-4 216.62 206.69 0.95x
BenchmarkTCPReadFrom/2048-4 466.61 476.08 1.02x
BenchmarkTCPReadFrom/4096-4 889.09 903.31 1.02x
BenchmarkTCPReadFrom/8192-4 1569.40 1714.06 1.09x
BenchmarkTCPReadFrom/16384-4 1881.42 2045.84 1.09x
BenchmarkTCPReadFrom/32768-4 2004.18 2188.41 1.09x
BenchmarkTCPReadFrom/65536-4 1859.38 2391.25 1.29x
BenchmarkTCPReadFrom/131072-4 1797.46 2502.21 1.39x
BenchmarkTCPReadFrom/262144-4 1755.69 2756.68 1.57x
BenchmarkTCPReadFrom/524288-4 1709.42 2882.98 1.69x
BenchmarkTCPReadFrom/1048576-4 1555.35 2933.84 1.89x
Fixes #10948
Change-Id: I3ce27f21f7adda8b696afdc48a91149998ae16a5
Reviewed-on: https://go-review.googlesource.com/107715
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-04-18 10:56:06 +02:00
|
|
|
// Copyright 2018 The Go Authors. All rights reserved.
|
|
|
|
|
// Use of this source code is governed by a BSD-style
|
|
|
|
|
// license that can be found in the LICENSE file.
|
|
|
|
|
|
|
|
|
|
package poll
|
|
|
|
|
|
2018-05-21 19:28:19 -07:00
|
|
|
import (
|
2019-12-24 01:24:43 +01:00
|
|
|
"internal/syscall/unix"
|
2018-05-21 19:28:19 -07:00
|
|
|
"sync/atomic"
|
|
|
|
|
"syscall"
|
|
|
|
|
"unsafe"
|
|
|
|
|
)
|
net: add support for splice(2) in (*TCPConn).ReadFrom on Linux
This change adds support for the splice system call on Linux,
for the purpose of optimizing (*TCPConn).ReadFrom by reducing
copies of data from and to userspace. It does so by creating a
temporary pipe and splicing data from the source connection to the
pipe, then from the pipe to the destination connection. The pipe
serves as an in-kernel buffer for the data transfer.
No new API is added to package net, but a new Splice function is
added to package internal/poll, because using splice requires help
from the network poller. Users of the net package should benefit
from the change transparently.
This change only enables the optimization if the Reader in ReadFrom
is a TCP connection. Since splice is a more general interface, it
could, in theory, also be enabled if the Reader were a unix socket,
or the read half of a pipe.
However, benchmarks show that enabling it for unix sockets is most
likely not a net performance gain. The tcp <- unix case is also
fairly unlikely to be used very much by users of package net.
Enabling the optimization for pipes is also problematic from an
implementation perspective, since package net cannot easily get at
the *poll.FD of an *os.File. A possible solution to this would be
to dup the pipe file descriptor, register the duped descriptor with
the network poller, and work on that *poll.FD instead of the original.
However, this seems too intrusive, so it has not been done. If there
was a clean way to do it, it would probably be worth doing, since
splicing from a pipe to a socket can be done directly.
Therefore, this patch only enables the optimization for what is likely
the most common use case: tcp <- tcp.
The following benchmark compares the performance of the previous
userspace genericReadFrom code path to the new optimized code path.
The sub-benchmarks represent chunk sizes used by the writer on the
other end of the Reader passed to ReadFrom.
benchmark old ns/op new ns/op delta
BenchmarkTCPReadFrom/1024-4 4727 4954 +4.80%
BenchmarkTCPReadFrom/2048-4 4389 4301 -2.01%
BenchmarkTCPReadFrom/4096-4 4606 4534 -1.56%
BenchmarkTCPReadFrom/8192-4 5219 4779 -8.43%
BenchmarkTCPReadFrom/16384-4 8708 8008 -8.04%
BenchmarkTCPReadFrom/32768-4 16349 14973 -8.42%
BenchmarkTCPReadFrom/65536-4 35246 27406 -22.24%
BenchmarkTCPReadFrom/131072-4 72920 52382 -28.17%
BenchmarkTCPReadFrom/262144-4 149311 95094 -36.31%
BenchmarkTCPReadFrom/524288-4 306704 181856 -40.71%
BenchmarkTCPReadFrom/1048576-4 674174 357406 -46.99%
benchmark old MB/s new MB/s speedup
BenchmarkTCPReadFrom/1024-4 216.62 206.69 0.95x
BenchmarkTCPReadFrom/2048-4 466.61 476.08 1.02x
BenchmarkTCPReadFrom/4096-4 889.09 903.31 1.02x
BenchmarkTCPReadFrom/8192-4 1569.40 1714.06 1.09x
BenchmarkTCPReadFrom/16384-4 1881.42 2045.84 1.09x
BenchmarkTCPReadFrom/32768-4 2004.18 2188.41 1.09x
BenchmarkTCPReadFrom/65536-4 1859.38 2391.25 1.29x
BenchmarkTCPReadFrom/131072-4 1797.46 2502.21 1.39x
BenchmarkTCPReadFrom/262144-4 1755.69 2756.68 1.57x
BenchmarkTCPReadFrom/524288-4 1709.42 2882.98 1.69x
BenchmarkTCPReadFrom/1048576-4 1555.35 2933.84 1.89x
Fixes #10948
Change-Id: I3ce27f21f7adda8b696afdc48a91149998ae16a5
Reviewed-on: https://go-review.googlesource.com/107715
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-04-18 10:56:06 +02:00
|
|
|
|
|
|
|
|
const (
|
|
|
|
|
// spliceNonblock makes calls to splice(2) non-blocking.
|
|
|
|
|
spliceNonblock = 0x2
|
|
|
|
|
|
|
|
|
|
// maxSpliceSize is the maximum amount of data Splice asks
|
|
|
|
|
// the kernel to move in a single call to splice(2).
|
|
|
|
|
maxSpliceSize = 4 << 20
|
|
|
|
|
)
|
|
|
|
|
|
|
|
|
|
// Splice transfers at most remain bytes of data from src to dst, using the
|
|
|
|
|
// splice system call to minimize copies of data from and to userspace.
|
|
|
|
|
//
|
|
|
|
|
// Splice creates a temporary pipe, to serve as a buffer for the data transfer.
|
|
|
|
|
// src and dst must both be stream-oriented sockets.
|
|
|
|
|
//
|
|
|
|
|
// If err != nil, sc is the system call which caused the error.
|
|
|
|
|
func Splice(dst, src *FD, remain int64) (written int64, handled bool, sc string, err error) {
|
|
|
|
|
prfd, pwfd, sc, err := newTempPipe()
|
|
|
|
|
if err != nil {
|
|
|
|
|
return 0, false, sc, err
|
|
|
|
|
}
|
|
|
|
|
defer destroyTempPipe(prfd, pwfd)
|
|
|
|
|
var inPipe, n int
|
|
|
|
|
for err == nil && remain > 0 {
|
|
|
|
|
max := maxSpliceSize
|
|
|
|
|
if int64(max) > remain {
|
|
|
|
|
max = int(remain)
|
|
|
|
|
}
|
|
|
|
|
inPipe, err = spliceDrain(pwfd, src, max)
|
2018-09-15 18:31:49 +02:00
|
|
|
// The operation is considered handled if splice returns no
|
|
|
|
|
// error, or an error other than EINVAL. An EINVAL means the
|
|
|
|
|
// kernel does not support splice for the socket type of src.
|
|
|
|
|
// The failed syscall does not consume any data so it is safe
|
|
|
|
|
// to fall back to a generic copy.
|
|
|
|
|
//
|
net: add support for splice(2) in (*TCPConn).ReadFrom on Linux
This change adds support for the splice system call on Linux,
for the purpose of optimizing (*TCPConn).ReadFrom by reducing
copies of data from and to userspace. It does so by creating a
temporary pipe and splicing data from the source connection to the
pipe, then from the pipe to the destination connection. The pipe
serves as an in-kernel buffer for the data transfer.
No new API is added to package net, but a new Splice function is
added to package internal/poll, because using splice requires help
from the network poller. Users of the net package should benefit
from the change transparently.
This change only enables the optimization if the Reader in ReadFrom
is a TCP connection. Since splice is a more general interface, it
could, in theory, also be enabled if the Reader were a unix socket,
or the read half of a pipe.
However, benchmarks show that enabling it for unix sockets is most
likely not a net performance gain. The tcp <- unix case is also
fairly unlikely to be used very much by users of package net.
Enabling the optimization for pipes is also problematic from an
implementation perspective, since package net cannot easily get at
the *poll.FD of an *os.File. A possible solution to this would be
to dup the pipe file descriptor, register the duped descriptor with
the network poller, and work on that *poll.FD instead of the original.
However, this seems too intrusive, so it has not been done. If there
was a clean way to do it, it would probably be worth doing, since
splicing from a pipe to a socket can be done directly.
Therefore, this patch only enables the optimization for what is likely
the most common use case: tcp <- tcp.
The following benchmark compares the performance of the previous
userspace genericReadFrom code path to the new optimized code path.
The sub-benchmarks represent chunk sizes used by the writer on the
other end of the Reader passed to ReadFrom.
benchmark old ns/op new ns/op delta
BenchmarkTCPReadFrom/1024-4 4727 4954 +4.80%
BenchmarkTCPReadFrom/2048-4 4389 4301 -2.01%
BenchmarkTCPReadFrom/4096-4 4606 4534 -1.56%
BenchmarkTCPReadFrom/8192-4 5219 4779 -8.43%
BenchmarkTCPReadFrom/16384-4 8708 8008 -8.04%
BenchmarkTCPReadFrom/32768-4 16349 14973 -8.42%
BenchmarkTCPReadFrom/65536-4 35246 27406 -22.24%
BenchmarkTCPReadFrom/131072-4 72920 52382 -28.17%
BenchmarkTCPReadFrom/262144-4 149311 95094 -36.31%
BenchmarkTCPReadFrom/524288-4 306704 181856 -40.71%
BenchmarkTCPReadFrom/1048576-4 674174 357406 -46.99%
benchmark old MB/s new MB/s speedup
BenchmarkTCPReadFrom/1024-4 216.62 206.69 0.95x
BenchmarkTCPReadFrom/2048-4 466.61 476.08 1.02x
BenchmarkTCPReadFrom/4096-4 889.09 903.31 1.02x
BenchmarkTCPReadFrom/8192-4 1569.40 1714.06 1.09x
BenchmarkTCPReadFrom/16384-4 1881.42 2045.84 1.09x
BenchmarkTCPReadFrom/32768-4 2004.18 2188.41 1.09x
BenchmarkTCPReadFrom/65536-4 1859.38 2391.25 1.29x
BenchmarkTCPReadFrom/131072-4 1797.46 2502.21 1.39x
BenchmarkTCPReadFrom/262144-4 1755.69 2756.68 1.57x
BenchmarkTCPReadFrom/524288-4 1709.42 2882.98 1.69x
BenchmarkTCPReadFrom/1048576-4 1555.35 2933.84 1.89x
Fixes #10948
Change-Id: I3ce27f21f7adda8b696afdc48a91149998ae16a5
Reviewed-on: https://go-review.googlesource.com/107715
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-04-18 10:56:06 +02:00
|
|
|
// spliceDrain should never return EAGAIN, so if err != nil,
|
2018-09-15 18:31:49 +02:00
|
|
|
// Splice cannot continue.
|
|
|
|
|
//
|
|
|
|
|
// If inPipe == 0 && err == nil, src is at EOF, and the
|
|
|
|
|
// transfer is complete.
|
|
|
|
|
handled = handled || (err != syscall.EINVAL)
|
net: add support for splice(2) in (*TCPConn).ReadFrom on Linux
This change adds support for the splice system call on Linux,
for the purpose of optimizing (*TCPConn).ReadFrom by reducing
copies of data from and to userspace. It does so by creating a
temporary pipe and splicing data from the source connection to the
pipe, then from the pipe to the destination connection. The pipe
serves as an in-kernel buffer for the data transfer.
No new API is added to package net, but a new Splice function is
added to package internal/poll, because using splice requires help
from the network poller. Users of the net package should benefit
from the change transparently.
This change only enables the optimization if the Reader in ReadFrom
is a TCP connection. Since splice is a more general interface, it
could, in theory, also be enabled if the Reader were a unix socket,
or the read half of a pipe.
However, benchmarks show that enabling it for unix sockets is most
likely not a net performance gain. The tcp <- unix case is also
fairly unlikely to be used very much by users of package net.
Enabling the optimization for pipes is also problematic from an
implementation perspective, since package net cannot easily get at
the *poll.FD of an *os.File. A possible solution to this would be
to dup the pipe file descriptor, register the duped descriptor with
the network poller, and work on that *poll.FD instead of the original.
However, this seems too intrusive, so it has not been done. If there
was a clean way to do it, it would probably be worth doing, since
splicing from a pipe to a socket can be done directly.
Therefore, this patch only enables the optimization for what is likely
the most common use case: tcp <- tcp.
The following benchmark compares the performance of the previous
userspace genericReadFrom code path to the new optimized code path.
The sub-benchmarks represent chunk sizes used by the writer on the
other end of the Reader passed to ReadFrom.
benchmark old ns/op new ns/op delta
BenchmarkTCPReadFrom/1024-4 4727 4954 +4.80%
BenchmarkTCPReadFrom/2048-4 4389 4301 -2.01%
BenchmarkTCPReadFrom/4096-4 4606 4534 -1.56%
BenchmarkTCPReadFrom/8192-4 5219 4779 -8.43%
BenchmarkTCPReadFrom/16384-4 8708 8008 -8.04%
BenchmarkTCPReadFrom/32768-4 16349 14973 -8.42%
BenchmarkTCPReadFrom/65536-4 35246 27406 -22.24%
BenchmarkTCPReadFrom/131072-4 72920 52382 -28.17%
BenchmarkTCPReadFrom/262144-4 149311 95094 -36.31%
BenchmarkTCPReadFrom/524288-4 306704 181856 -40.71%
BenchmarkTCPReadFrom/1048576-4 674174 357406 -46.99%
benchmark old MB/s new MB/s speedup
BenchmarkTCPReadFrom/1024-4 216.62 206.69 0.95x
BenchmarkTCPReadFrom/2048-4 466.61 476.08 1.02x
BenchmarkTCPReadFrom/4096-4 889.09 903.31 1.02x
BenchmarkTCPReadFrom/8192-4 1569.40 1714.06 1.09x
BenchmarkTCPReadFrom/16384-4 1881.42 2045.84 1.09x
BenchmarkTCPReadFrom/32768-4 2004.18 2188.41 1.09x
BenchmarkTCPReadFrom/65536-4 1859.38 2391.25 1.29x
BenchmarkTCPReadFrom/131072-4 1797.46 2502.21 1.39x
BenchmarkTCPReadFrom/262144-4 1755.69 2756.68 1.57x
BenchmarkTCPReadFrom/524288-4 1709.42 2882.98 1.69x
BenchmarkTCPReadFrom/1048576-4 1555.35 2933.84 1.89x
Fixes #10948
Change-Id: I3ce27f21f7adda8b696afdc48a91149998ae16a5
Reviewed-on: https://go-review.googlesource.com/107715
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-04-18 10:56:06 +02:00
|
|
|
if err != nil || (inPipe == 0 && err == nil) {
|
|
|
|
|
break
|
|
|
|
|
}
|
|
|
|
|
n, err = splicePump(dst, prfd, inPipe)
|
|
|
|
|
if n > 0 {
|
|
|
|
|
written += int64(n)
|
|
|
|
|
remain -= int64(n)
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
if err != nil {
|
2018-09-05 10:20:26 -07:00
|
|
|
return written, handled, "splice", err
|
net: add support for splice(2) in (*TCPConn).ReadFrom on Linux
This change adds support for the splice system call on Linux,
for the purpose of optimizing (*TCPConn).ReadFrom by reducing
copies of data from and to userspace. It does so by creating a
temporary pipe and splicing data from the source connection to the
pipe, then from the pipe to the destination connection. The pipe
serves as an in-kernel buffer for the data transfer.
No new API is added to package net, but a new Splice function is
added to package internal/poll, because using splice requires help
from the network poller. Users of the net package should benefit
from the change transparently.
This change only enables the optimization if the Reader in ReadFrom
is a TCP connection. Since splice is a more general interface, it
could, in theory, also be enabled if the Reader were a unix socket,
or the read half of a pipe.
However, benchmarks show that enabling it for unix sockets is most
likely not a net performance gain. The tcp <- unix case is also
fairly unlikely to be used very much by users of package net.
Enabling the optimization for pipes is also problematic from an
implementation perspective, since package net cannot easily get at
the *poll.FD of an *os.File. A possible solution to this would be
to dup the pipe file descriptor, register the duped descriptor with
the network poller, and work on that *poll.FD instead of the original.
However, this seems too intrusive, so it has not been done. If there
was a clean way to do it, it would probably be worth doing, since
splicing from a pipe to a socket can be done directly.
Therefore, this patch only enables the optimization for what is likely
the most common use case: tcp <- tcp.
The following benchmark compares the performance of the previous
userspace genericReadFrom code path to the new optimized code path.
The sub-benchmarks represent chunk sizes used by the writer on the
other end of the Reader passed to ReadFrom.
benchmark old ns/op new ns/op delta
BenchmarkTCPReadFrom/1024-4 4727 4954 +4.80%
BenchmarkTCPReadFrom/2048-4 4389 4301 -2.01%
BenchmarkTCPReadFrom/4096-4 4606 4534 -1.56%
BenchmarkTCPReadFrom/8192-4 5219 4779 -8.43%
BenchmarkTCPReadFrom/16384-4 8708 8008 -8.04%
BenchmarkTCPReadFrom/32768-4 16349 14973 -8.42%
BenchmarkTCPReadFrom/65536-4 35246 27406 -22.24%
BenchmarkTCPReadFrom/131072-4 72920 52382 -28.17%
BenchmarkTCPReadFrom/262144-4 149311 95094 -36.31%
BenchmarkTCPReadFrom/524288-4 306704 181856 -40.71%
BenchmarkTCPReadFrom/1048576-4 674174 357406 -46.99%
benchmark old MB/s new MB/s speedup
BenchmarkTCPReadFrom/1024-4 216.62 206.69 0.95x
BenchmarkTCPReadFrom/2048-4 466.61 476.08 1.02x
BenchmarkTCPReadFrom/4096-4 889.09 903.31 1.02x
BenchmarkTCPReadFrom/8192-4 1569.40 1714.06 1.09x
BenchmarkTCPReadFrom/16384-4 1881.42 2045.84 1.09x
BenchmarkTCPReadFrom/32768-4 2004.18 2188.41 1.09x
BenchmarkTCPReadFrom/65536-4 1859.38 2391.25 1.29x
BenchmarkTCPReadFrom/131072-4 1797.46 2502.21 1.39x
BenchmarkTCPReadFrom/262144-4 1755.69 2756.68 1.57x
BenchmarkTCPReadFrom/524288-4 1709.42 2882.98 1.69x
BenchmarkTCPReadFrom/1048576-4 1555.35 2933.84 1.89x
Fixes #10948
Change-Id: I3ce27f21f7adda8b696afdc48a91149998ae16a5
Reviewed-on: https://go-review.googlesource.com/107715
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-04-18 10:56:06 +02:00
|
|
|
}
|
|
|
|
|
return written, true, "", nil
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
// spliceDrain moves data from a socket to a pipe.
|
|
|
|
|
//
|
|
|
|
|
// Invariant: when entering spliceDrain, the pipe is empty. It is either in its
|
|
|
|
|
// initial state, or splicePump has emptied it previously.
|
|
|
|
|
//
|
|
|
|
|
// Given this, spliceDrain can reasonably assume that the pipe is ready for
|
|
|
|
|
// writing, so if splice returns EAGAIN, it must be because the socket is not
|
|
|
|
|
// ready for reading.
|
|
|
|
|
//
|
|
|
|
|
// If spliceDrain returns (0, nil), src is at EOF.
|
|
|
|
|
func spliceDrain(pipefd int, sock *FD, max int) (int, error) {
|
2018-06-21 19:11:34 +02:00
|
|
|
if err := sock.readLock(); err != nil {
|
|
|
|
|
return 0, err
|
|
|
|
|
}
|
|
|
|
|
defer sock.readUnlock()
|
|
|
|
|
if err := sock.pd.prepareRead(sock.isFile); err != nil {
|
|
|
|
|
return 0, err
|
|
|
|
|
}
|
net: add support for splice(2) in (*TCPConn).ReadFrom on Linux
This change adds support for the splice system call on Linux,
for the purpose of optimizing (*TCPConn).ReadFrom by reducing
copies of data from and to userspace. It does so by creating a
temporary pipe and splicing data from the source connection to the
pipe, then from the pipe to the destination connection. The pipe
serves as an in-kernel buffer for the data transfer.
No new API is added to package net, but a new Splice function is
added to package internal/poll, because using splice requires help
from the network poller. Users of the net package should benefit
from the change transparently.
This change only enables the optimization if the Reader in ReadFrom
is a TCP connection. Since splice is a more general interface, it
could, in theory, also be enabled if the Reader were a unix socket,
or the read half of a pipe.
However, benchmarks show that enabling it for unix sockets is most
likely not a net performance gain. The tcp <- unix case is also
fairly unlikely to be used very much by users of package net.
Enabling the optimization for pipes is also problematic from an
implementation perspective, since package net cannot easily get at
the *poll.FD of an *os.File. A possible solution to this would be
to dup the pipe file descriptor, register the duped descriptor with
the network poller, and work on that *poll.FD instead of the original.
However, this seems too intrusive, so it has not been done. If there
was a clean way to do it, it would probably be worth doing, since
splicing from a pipe to a socket can be done directly.
Therefore, this patch only enables the optimization for what is likely
the most common use case: tcp <- tcp.
The following benchmark compares the performance of the previous
userspace genericReadFrom code path to the new optimized code path.
The sub-benchmarks represent chunk sizes used by the writer on the
other end of the Reader passed to ReadFrom.
benchmark old ns/op new ns/op delta
BenchmarkTCPReadFrom/1024-4 4727 4954 +4.80%
BenchmarkTCPReadFrom/2048-4 4389 4301 -2.01%
BenchmarkTCPReadFrom/4096-4 4606 4534 -1.56%
BenchmarkTCPReadFrom/8192-4 5219 4779 -8.43%
BenchmarkTCPReadFrom/16384-4 8708 8008 -8.04%
BenchmarkTCPReadFrom/32768-4 16349 14973 -8.42%
BenchmarkTCPReadFrom/65536-4 35246 27406 -22.24%
BenchmarkTCPReadFrom/131072-4 72920 52382 -28.17%
BenchmarkTCPReadFrom/262144-4 149311 95094 -36.31%
BenchmarkTCPReadFrom/524288-4 306704 181856 -40.71%
BenchmarkTCPReadFrom/1048576-4 674174 357406 -46.99%
benchmark old MB/s new MB/s speedup
BenchmarkTCPReadFrom/1024-4 216.62 206.69 0.95x
BenchmarkTCPReadFrom/2048-4 466.61 476.08 1.02x
BenchmarkTCPReadFrom/4096-4 889.09 903.31 1.02x
BenchmarkTCPReadFrom/8192-4 1569.40 1714.06 1.09x
BenchmarkTCPReadFrom/16384-4 1881.42 2045.84 1.09x
BenchmarkTCPReadFrom/32768-4 2004.18 2188.41 1.09x
BenchmarkTCPReadFrom/65536-4 1859.38 2391.25 1.29x
BenchmarkTCPReadFrom/131072-4 1797.46 2502.21 1.39x
BenchmarkTCPReadFrom/262144-4 1755.69 2756.68 1.57x
BenchmarkTCPReadFrom/524288-4 1709.42 2882.98 1.69x
BenchmarkTCPReadFrom/1048576-4 1555.35 2933.84 1.89x
Fixes #10948
Change-Id: I3ce27f21f7adda8b696afdc48a91149998ae16a5
Reviewed-on: https://go-review.googlesource.com/107715
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-04-18 10:56:06 +02:00
|
|
|
for {
|
|
|
|
|
n, err := splice(pipefd, sock.Sysfd, max, spliceNonblock)
|
internal/poll, os: loop on EINTR
Historically we've assumed that we can install all signal handlers
with the SA_RESTART flag set, and let the system restart slow functions
if a signal is received. Therefore, we don't have to worry about EINTR.
This is only partially true, and we've added EINTR checks already for
connect, and open/read on Darwin, and sendfile on Solaris.
Other cases have turned up in #36644, #38033, and #38836.
Also, #20400 points out that when Go code is included in a C program,
the C program may install its own signal handlers without SA_RESTART.
In that case, Go code will see EINTR no matter what it does.
So, go ahead and check for EINTR. We don't check in the syscall package;
people using syscalls directly may want to check for EINTR themselves.
But we do check for EINTR in the higher level APIs in os and net,
and retry the system call if we see it.
This change looks safe, but of course we may be missing some cases
where we need to check for EINTR. As such cases turn up, we can add
tests to runtime/testdata/testprogcgo/eintr.go, and fix the code.
If there are any such cases, their handling after this change will be
no worse than it is today.
For #22838
Fixes #20400
Fixes #36644
Fixes #38033
Fixes #38836
Change-Id: I7e46ca8cafed0429c7a2386cc9edc9d9d47a6896
Reviewed-on: https://go-review.googlesource.com/c/go/+/232862
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
2020-05-07 21:34:54 -07:00
|
|
|
if err == syscall.EINTR {
|
|
|
|
|
continue
|
|
|
|
|
}
|
net: add support for splice(2) in (*TCPConn).ReadFrom on Linux
This change adds support for the splice system call on Linux,
for the purpose of optimizing (*TCPConn).ReadFrom by reducing
copies of data from and to userspace. It does so by creating a
temporary pipe and splicing data from the source connection to the
pipe, then from the pipe to the destination connection. The pipe
serves as an in-kernel buffer for the data transfer.
No new API is added to package net, but a new Splice function is
added to package internal/poll, because using splice requires help
from the network poller. Users of the net package should benefit
from the change transparently.
This change only enables the optimization if the Reader in ReadFrom
is a TCP connection. Since splice is a more general interface, it
could, in theory, also be enabled if the Reader were a unix socket,
or the read half of a pipe.
However, benchmarks show that enabling it for unix sockets is most
likely not a net performance gain. The tcp <- unix case is also
fairly unlikely to be used very much by users of package net.
Enabling the optimization for pipes is also problematic from an
implementation perspective, since package net cannot easily get at
the *poll.FD of an *os.File. A possible solution to this would be
to dup the pipe file descriptor, register the duped descriptor with
the network poller, and work on that *poll.FD instead of the original.
However, this seems too intrusive, so it has not been done. If there
was a clean way to do it, it would probably be worth doing, since
splicing from a pipe to a socket can be done directly.
Therefore, this patch only enables the optimization for what is likely
the most common use case: tcp <- tcp.
The following benchmark compares the performance of the previous
userspace genericReadFrom code path to the new optimized code path.
The sub-benchmarks represent chunk sizes used by the writer on the
other end of the Reader passed to ReadFrom.
benchmark old ns/op new ns/op delta
BenchmarkTCPReadFrom/1024-4 4727 4954 +4.80%
BenchmarkTCPReadFrom/2048-4 4389 4301 -2.01%
BenchmarkTCPReadFrom/4096-4 4606 4534 -1.56%
BenchmarkTCPReadFrom/8192-4 5219 4779 -8.43%
BenchmarkTCPReadFrom/16384-4 8708 8008 -8.04%
BenchmarkTCPReadFrom/32768-4 16349 14973 -8.42%
BenchmarkTCPReadFrom/65536-4 35246 27406 -22.24%
BenchmarkTCPReadFrom/131072-4 72920 52382 -28.17%
BenchmarkTCPReadFrom/262144-4 149311 95094 -36.31%
BenchmarkTCPReadFrom/524288-4 306704 181856 -40.71%
BenchmarkTCPReadFrom/1048576-4 674174 357406 -46.99%
benchmark old MB/s new MB/s speedup
BenchmarkTCPReadFrom/1024-4 216.62 206.69 0.95x
BenchmarkTCPReadFrom/2048-4 466.61 476.08 1.02x
BenchmarkTCPReadFrom/4096-4 889.09 903.31 1.02x
BenchmarkTCPReadFrom/8192-4 1569.40 1714.06 1.09x
BenchmarkTCPReadFrom/16384-4 1881.42 2045.84 1.09x
BenchmarkTCPReadFrom/32768-4 2004.18 2188.41 1.09x
BenchmarkTCPReadFrom/65536-4 1859.38 2391.25 1.29x
BenchmarkTCPReadFrom/131072-4 1797.46 2502.21 1.39x
BenchmarkTCPReadFrom/262144-4 1755.69 2756.68 1.57x
BenchmarkTCPReadFrom/524288-4 1709.42 2882.98 1.69x
BenchmarkTCPReadFrom/1048576-4 1555.35 2933.84 1.89x
Fixes #10948
Change-Id: I3ce27f21f7adda8b696afdc48a91149998ae16a5
Reviewed-on: https://go-review.googlesource.com/107715
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-04-18 10:56:06 +02:00
|
|
|
if err != syscall.EAGAIN {
|
|
|
|
|
return n, err
|
|
|
|
|
}
|
|
|
|
|
if err := sock.pd.waitRead(sock.isFile); err != nil {
|
|
|
|
|
return n, err
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
// splicePump moves all the buffered data from a pipe to a socket.
|
|
|
|
|
//
|
|
|
|
|
// Invariant: when entering splicePump, there are exactly inPipe
|
|
|
|
|
// bytes of data in the pipe, from a previous call to spliceDrain.
|
|
|
|
|
//
|
|
|
|
|
// By analogy to the condition from spliceDrain, splicePump
|
|
|
|
|
// only needs to poll the socket for readiness, if splice returns
|
|
|
|
|
// EAGAIN.
|
|
|
|
|
//
|
|
|
|
|
// If splicePump cannot move all the data in a single call to
|
|
|
|
|
// splice(2), it loops over the buffered data until it has written
|
|
|
|
|
// all of it to the socket. This behavior is similar to the Write
|
|
|
|
|
// step of an io.Copy in userspace.
|
|
|
|
|
func splicePump(sock *FD, pipefd int, inPipe int) (int, error) {
|
2018-06-21 19:11:34 +02:00
|
|
|
if err := sock.writeLock(); err != nil {
|
|
|
|
|
return 0, err
|
|
|
|
|
}
|
|
|
|
|
defer sock.writeUnlock()
|
|
|
|
|
if err := sock.pd.prepareWrite(sock.isFile); err != nil {
|
|
|
|
|
return 0, err
|
|
|
|
|
}
|
net: add support for splice(2) in (*TCPConn).ReadFrom on Linux
This change adds support for the splice system call on Linux,
for the purpose of optimizing (*TCPConn).ReadFrom by reducing
copies of data from and to userspace. It does so by creating a
temporary pipe and splicing data from the source connection to the
pipe, then from the pipe to the destination connection. The pipe
serves as an in-kernel buffer for the data transfer.
No new API is added to package net, but a new Splice function is
added to package internal/poll, because using splice requires help
from the network poller. Users of the net package should benefit
from the change transparently.
This change only enables the optimization if the Reader in ReadFrom
is a TCP connection. Since splice is a more general interface, it
could, in theory, also be enabled if the Reader were a unix socket,
or the read half of a pipe.
However, benchmarks show that enabling it for unix sockets is most
likely not a net performance gain. The tcp <- unix case is also
fairly unlikely to be used very much by users of package net.
Enabling the optimization for pipes is also problematic from an
implementation perspective, since package net cannot easily get at
the *poll.FD of an *os.File. A possible solution to this would be
to dup the pipe file descriptor, register the duped descriptor with
the network poller, and work on that *poll.FD instead of the original.
However, this seems too intrusive, so it has not been done. If there
was a clean way to do it, it would probably be worth doing, since
splicing from a pipe to a socket can be done directly.
Therefore, this patch only enables the optimization for what is likely
the most common use case: tcp <- tcp.
The following benchmark compares the performance of the previous
userspace genericReadFrom code path to the new optimized code path.
The sub-benchmarks represent chunk sizes used by the writer on the
other end of the Reader passed to ReadFrom.
benchmark old ns/op new ns/op delta
BenchmarkTCPReadFrom/1024-4 4727 4954 +4.80%
BenchmarkTCPReadFrom/2048-4 4389 4301 -2.01%
BenchmarkTCPReadFrom/4096-4 4606 4534 -1.56%
BenchmarkTCPReadFrom/8192-4 5219 4779 -8.43%
BenchmarkTCPReadFrom/16384-4 8708 8008 -8.04%
BenchmarkTCPReadFrom/32768-4 16349 14973 -8.42%
BenchmarkTCPReadFrom/65536-4 35246 27406 -22.24%
BenchmarkTCPReadFrom/131072-4 72920 52382 -28.17%
BenchmarkTCPReadFrom/262144-4 149311 95094 -36.31%
BenchmarkTCPReadFrom/524288-4 306704 181856 -40.71%
BenchmarkTCPReadFrom/1048576-4 674174 357406 -46.99%
benchmark old MB/s new MB/s speedup
BenchmarkTCPReadFrom/1024-4 216.62 206.69 0.95x
BenchmarkTCPReadFrom/2048-4 466.61 476.08 1.02x
BenchmarkTCPReadFrom/4096-4 889.09 903.31 1.02x
BenchmarkTCPReadFrom/8192-4 1569.40 1714.06 1.09x
BenchmarkTCPReadFrom/16384-4 1881.42 2045.84 1.09x
BenchmarkTCPReadFrom/32768-4 2004.18 2188.41 1.09x
BenchmarkTCPReadFrom/65536-4 1859.38 2391.25 1.29x
BenchmarkTCPReadFrom/131072-4 1797.46 2502.21 1.39x
BenchmarkTCPReadFrom/262144-4 1755.69 2756.68 1.57x
BenchmarkTCPReadFrom/524288-4 1709.42 2882.98 1.69x
BenchmarkTCPReadFrom/1048576-4 1555.35 2933.84 1.89x
Fixes #10948
Change-Id: I3ce27f21f7adda8b696afdc48a91149998ae16a5
Reviewed-on: https://go-review.googlesource.com/107715
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-04-18 10:56:06 +02:00
|
|
|
written := 0
|
|
|
|
|
for inPipe > 0 {
|
|
|
|
|
n, err := splice(sock.Sysfd, pipefd, inPipe, spliceNonblock)
|
|
|
|
|
// Here, the condition n == 0 && err == nil should never be
|
|
|
|
|
// observed, since Splice controls the write side of the pipe.
|
|
|
|
|
if n > 0 {
|
|
|
|
|
inPipe -= n
|
|
|
|
|
written += n
|
|
|
|
|
continue
|
|
|
|
|
}
|
|
|
|
|
if err != syscall.EAGAIN {
|
|
|
|
|
return written, err
|
|
|
|
|
}
|
|
|
|
|
if err := sock.pd.waitWrite(sock.isFile); err != nil {
|
|
|
|
|
return written, err
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
return written, nil
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
// splice wraps the splice system call. Since the current implementation
|
|
|
|
|
// only uses splice on sockets and pipes, the offset arguments are unused.
|
|
|
|
|
// splice returns int instead of int64, because callers never ask it to
|
|
|
|
|
// move more data in a single call than can fit in an int32.
|
|
|
|
|
func splice(out int, in int, max int, flags int) (int, error) {
|
|
|
|
|
n, err := syscall.Splice(in, nil, out, nil, max, flags)
|
|
|
|
|
return int(n), err
|
|
|
|
|
}
|
|
|
|
|
|
2018-05-21 19:28:19 -07:00
|
|
|
var disableSplice unsafe.Pointer
|
|
|
|
|
|
net: add support for splice(2) in (*TCPConn).ReadFrom on Linux
This change adds support for the splice system call on Linux,
for the purpose of optimizing (*TCPConn).ReadFrom by reducing
copies of data from and to userspace. It does so by creating a
temporary pipe and splicing data from the source connection to the
pipe, then from the pipe to the destination connection. The pipe
serves as an in-kernel buffer for the data transfer.
No new API is added to package net, but a new Splice function is
added to package internal/poll, because using splice requires help
from the network poller. Users of the net package should benefit
from the change transparently.
This change only enables the optimization if the Reader in ReadFrom
is a TCP connection. Since splice is a more general interface, it
could, in theory, also be enabled if the Reader were a unix socket,
or the read half of a pipe.
However, benchmarks show that enabling it for unix sockets is most
likely not a net performance gain. The tcp <- unix case is also
fairly unlikely to be used very much by users of package net.
Enabling the optimization for pipes is also problematic from an
implementation perspective, since package net cannot easily get at
the *poll.FD of an *os.File. A possible solution to this would be
to dup the pipe file descriptor, register the duped descriptor with
the network poller, and work on that *poll.FD instead of the original.
However, this seems too intrusive, so it has not been done. If there
was a clean way to do it, it would probably be worth doing, since
splicing from a pipe to a socket can be done directly.
Therefore, this patch only enables the optimization for what is likely
the most common use case: tcp <- tcp.
The following benchmark compares the performance of the previous
userspace genericReadFrom code path to the new optimized code path.
The sub-benchmarks represent chunk sizes used by the writer on the
other end of the Reader passed to ReadFrom.
benchmark old ns/op new ns/op delta
BenchmarkTCPReadFrom/1024-4 4727 4954 +4.80%
BenchmarkTCPReadFrom/2048-4 4389 4301 -2.01%
BenchmarkTCPReadFrom/4096-4 4606 4534 -1.56%
BenchmarkTCPReadFrom/8192-4 5219 4779 -8.43%
BenchmarkTCPReadFrom/16384-4 8708 8008 -8.04%
BenchmarkTCPReadFrom/32768-4 16349 14973 -8.42%
BenchmarkTCPReadFrom/65536-4 35246 27406 -22.24%
BenchmarkTCPReadFrom/131072-4 72920 52382 -28.17%
BenchmarkTCPReadFrom/262144-4 149311 95094 -36.31%
BenchmarkTCPReadFrom/524288-4 306704 181856 -40.71%
BenchmarkTCPReadFrom/1048576-4 674174 357406 -46.99%
benchmark old MB/s new MB/s speedup
BenchmarkTCPReadFrom/1024-4 216.62 206.69 0.95x
BenchmarkTCPReadFrom/2048-4 466.61 476.08 1.02x
BenchmarkTCPReadFrom/4096-4 889.09 903.31 1.02x
BenchmarkTCPReadFrom/8192-4 1569.40 1714.06 1.09x
BenchmarkTCPReadFrom/16384-4 1881.42 2045.84 1.09x
BenchmarkTCPReadFrom/32768-4 2004.18 2188.41 1.09x
BenchmarkTCPReadFrom/65536-4 1859.38 2391.25 1.29x
BenchmarkTCPReadFrom/131072-4 1797.46 2502.21 1.39x
BenchmarkTCPReadFrom/262144-4 1755.69 2756.68 1.57x
BenchmarkTCPReadFrom/524288-4 1709.42 2882.98 1.69x
BenchmarkTCPReadFrom/1048576-4 1555.35 2933.84 1.89x
Fixes #10948
Change-Id: I3ce27f21f7adda8b696afdc48a91149998ae16a5
Reviewed-on: https://go-review.googlesource.com/107715
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-04-18 10:56:06 +02:00
|
|
|
// newTempPipe sets up a temporary pipe for a splice operation.
|
|
|
|
|
func newTempPipe() (prfd, pwfd int, sc string, err error) {
|
2018-05-21 19:28:19 -07:00
|
|
|
p := (*bool)(atomic.LoadPointer(&disableSplice))
|
|
|
|
|
if p != nil && *p {
|
|
|
|
|
return -1, -1, "splice", syscall.EINVAL
|
|
|
|
|
}
|
|
|
|
|
|
net: add support for splice(2) in (*TCPConn).ReadFrom on Linux
This change adds support for the splice system call on Linux,
for the purpose of optimizing (*TCPConn).ReadFrom by reducing
copies of data from and to userspace. It does so by creating a
temporary pipe and splicing data from the source connection to the
pipe, then from the pipe to the destination connection. The pipe
serves as an in-kernel buffer for the data transfer.
No new API is added to package net, but a new Splice function is
added to package internal/poll, because using splice requires help
from the network poller. Users of the net package should benefit
from the change transparently.
This change only enables the optimization if the Reader in ReadFrom
is a TCP connection. Since splice is a more general interface, it
could, in theory, also be enabled if the Reader were a unix socket,
or the read half of a pipe.
However, benchmarks show that enabling it for unix sockets is most
likely not a net performance gain. The tcp <- unix case is also
fairly unlikely to be used very much by users of package net.
Enabling the optimization for pipes is also problematic from an
implementation perspective, since package net cannot easily get at
the *poll.FD of an *os.File. A possible solution to this would be
to dup the pipe file descriptor, register the duped descriptor with
the network poller, and work on that *poll.FD instead of the original.
However, this seems too intrusive, so it has not been done. If there
was a clean way to do it, it would probably be worth doing, since
splicing from a pipe to a socket can be done directly.
Therefore, this patch only enables the optimization for what is likely
the most common use case: tcp <- tcp.
The following benchmark compares the performance of the previous
userspace genericReadFrom code path to the new optimized code path.
The sub-benchmarks represent chunk sizes used by the writer on the
other end of the Reader passed to ReadFrom.
benchmark old ns/op new ns/op delta
BenchmarkTCPReadFrom/1024-4 4727 4954 +4.80%
BenchmarkTCPReadFrom/2048-4 4389 4301 -2.01%
BenchmarkTCPReadFrom/4096-4 4606 4534 -1.56%
BenchmarkTCPReadFrom/8192-4 5219 4779 -8.43%
BenchmarkTCPReadFrom/16384-4 8708 8008 -8.04%
BenchmarkTCPReadFrom/32768-4 16349 14973 -8.42%
BenchmarkTCPReadFrom/65536-4 35246 27406 -22.24%
BenchmarkTCPReadFrom/131072-4 72920 52382 -28.17%
BenchmarkTCPReadFrom/262144-4 149311 95094 -36.31%
BenchmarkTCPReadFrom/524288-4 306704 181856 -40.71%
BenchmarkTCPReadFrom/1048576-4 674174 357406 -46.99%
benchmark old MB/s new MB/s speedup
BenchmarkTCPReadFrom/1024-4 216.62 206.69 0.95x
BenchmarkTCPReadFrom/2048-4 466.61 476.08 1.02x
BenchmarkTCPReadFrom/4096-4 889.09 903.31 1.02x
BenchmarkTCPReadFrom/8192-4 1569.40 1714.06 1.09x
BenchmarkTCPReadFrom/16384-4 1881.42 2045.84 1.09x
BenchmarkTCPReadFrom/32768-4 2004.18 2188.41 1.09x
BenchmarkTCPReadFrom/65536-4 1859.38 2391.25 1.29x
BenchmarkTCPReadFrom/131072-4 1797.46 2502.21 1.39x
BenchmarkTCPReadFrom/262144-4 1755.69 2756.68 1.57x
BenchmarkTCPReadFrom/524288-4 1709.42 2882.98 1.69x
BenchmarkTCPReadFrom/1048576-4 1555.35 2933.84 1.89x
Fixes #10948
Change-Id: I3ce27f21f7adda8b696afdc48a91149998ae16a5
Reviewed-on: https://go-review.googlesource.com/107715
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-04-18 10:56:06 +02:00
|
|
|
var fds [2]int
|
2018-05-21 19:28:19 -07:00
|
|
|
// pipe2 was added in 2.6.27 and our minimum requirement is 2.6.23, so it
|
|
|
|
|
// might not be implemented. Falling back to pipe is possible, but prior to
|
|
|
|
|
// 2.6.29 splice returns -EAGAIN instead of 0 when the connection is
|
|
|
|
|
// closed.
|
net: add support for splice(2) in (*TCPConn).ReadFrom on Linux
This change adds support for the splice system call on Linux,
for the purpose of optimizing (*TCPConn).ReadFrom by reducing
copies of data from and to userspace. It does so by creating a
temporary pipe and splicing data from the source connection to the
pipe, then from the pipe to the destination connection. The pipe
serves as an in-kernel buffer for the data transfer.
No new API is added to package net, but a new Splice function is
added to package internal/poll, because using splice requires help
from the network poller. Users of the net package should benefit
from the change transparently.
This change only enables the optimization if the Reader in ReadFrom
is a TCP connection. Since splice is a more general interface, it
could, in theory, also be enabled if the Reader were a unix socket,
or the read half of a pipe.
However, benchmarks show that enabling it for unix sockets is most
likely not a net performance gain. The tcp <- unix case is also
fairly unlikely to be used very much by users of package net.
Enabling the optimization for pipes is also problematic from an
implementation perspective, since package net cannot easily get at
the *poll.FD of an *os.File. A possible solution to this would be
to dup the pipe file descriptor, register the duped descriptor with
the network poller, and work on that *poll.FD instead of the original.
However, this seems too intrusive, so it has not been done. If there
was a clean way to do it, it would probably be worth doing, since
splicing from a pipe to a socket can be done directly.
Therefore, this patch only enables the optimization for what is likely
the most common use case: tcp <- tcp.
The following benchmark compares the performance of the previous
userspace genericReadFrom code path to the new optimized code path.
The sub-benchmarks represent chunk sizes used by the writer on the
other end of the Reader passed to ReadFrom.
benchmark old ns/op new ns/op delta
BenchmarkTCPReadFrom/1024-4 4727 4954 +4.80%
BenchmarkTCPReadFrom/2048-4 4389 4301 -2.01%
BenchmarkTCPReadFrom/4096-4 4606 4534 -1.56%
BenchmarkTCPReadFrom/8192-4 5219 4779 -8.43%
BenchmarkTCPReadFrom/16384-4 8708 8008 -8.04%
BenchmarkTCPReadFrom/32768-4 16349 14973 -8.42%
BenchmarkTCPReadFrom/65536-4 35246 27406 -22.24%
BenchmarkTCPReadFrom/131072-4 72920 52382 -28.17%
BenchmarkTCPReadFrom/262144-4 149311 95094 -36.31%
BenchmarkTCPReadFrom/524288-4 306704 181856 -40.71%
BenchmarkTCPReadFrom/1048576-4 674174 357406 -46.99%
benchmark old MB/s new MB/s speedup
BenchmarkTCPReadFrom/1024-4 216.62 206.69 0.95x
BenchmarkTCPReadFrom/2048-4 466.61 476.08 1.02x
BenchmarkTCPReadFrom/4096-4 889.09 903.31 1.02x
BenchmarkTCPReadFrom/8192-4 1569.40 1714.06 1.09x
BenchmarkTCPReadFrom/16384-4 1881.42 2045.84 1.09x
BenchmarkTCPReadFrom/32768-4 2004.18 2188.41 1.09x
BenchmarkTCPReadFrom/65536-4 1859.38 2391.25 1.29x
BenchmarkTCPReadFrom/131072-4 1797.46 2502.21 1.39x
BenchmarkTCPReadFrom/262144-4 1755.69 2756.68 1.57x
BenchmarkTCPReadFrom/524288-4 1709.42 2882.98 1.69x
BenchmarkTCPReadFrom/1048576-4 1555.35 2933.84 1.89x
Fixes #10948
Change-Id: I3ce27f21f7adda8b696afdc48a91149998ae16a5
Reviewed-on: https://go-review.googlesource.com/107715
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-04-18 10:56:06 +02:00
|
|
|
const flags = syscall.O_CLOEXEC | syscall.O_NONBLOCK
|
|
|
|
|
if err := syscall.Pipe2(fds[:], flags); err != nil {
|
|
|
|
|
return -1, -1, "pipe2", err
|
|
|
|
|
}
|
|
|
|
|
|
2018-05-21 19:28:19 -07:00
|
|
|
if p == nil {
|
|
|
|
|
p = new(bool)
|
|
|
|
|
defer atomic.StorePointer(&disableSplice, unsafe.Pointer(p))
|
|
|
|
|
|
|
|
|
|
// F_GETPIPE_SZ was added in 2.6.35, which does not have the -EAGAIN bug.
|
2019-12-24 01:24:43 +01:00
|
|
|
if _, _, errno := syscall.Syscall(unix.FcntlSyscall, uintptr(fds[0]), syscall.F_GETPIPE_SZ, 0); errno != 0 {
|
2018-05-21 19:28:19 -07:00
|
|
|
*p = true
|
|
|
|
|
destroyTempPipe(fds[0], fds[1])
|
|
|
|
|
return -1, -1, "fcntl", errno
|
|
|
|
|
}
|
net: add support for splice(2) in (*TCPConn).ReadFrom on Linux
This change adds support for the splice system call on Linux,
for the purpose of optimizing (*TCPConn).ReadFrom by reducing
copies of data from and to userspace. It does so by creating a
temporary pipe and splicing data from the source connection to the
pipe, then from the pipe to the destination connection. The pipe
serves as an in-kernel buffer for the data transfer.
No new API is added to package net, but a new Splice function is
added to package internal/poll, because using splice requires help
from the network poller. Users of the net package should benefit
from the change transparently.
This change only enables the optimization if the Reader in ReadFrom
is a TCP connection. Since splice is a more general interface, it
could, in theory, also be enabled if the Reader were a unix socket,
or the read half of a pipe.
However, benchmarks show that enabling it for unix sockets is most
likely not a net performance gain. The tcp <- unix case is also
fairly unlikely to be used very much by users of package net.
Enabling the optimization for pipes is also problematic from an
implementation perspective, since package net cannot easily get at
the *poll.FD of an *os.File. A possible solution to this would be
to dup the pipe file descriptor, register the duped descriptor with
the network poller, and work on that *poll.FD instead of the original.
However, this seems too intrusive, so it has not been done. If there
was a clean way to do it, it would probably be worth doing, since
splicing from a pipe to a socket can be done directly.
Therefore, this patch only enables the optimization for what is likely
the most common use case: tcp <- tcp.
The following benchmark compares the performance of the previous
userspace genericReadFrom code path to the new optimized code path.
The sub-benchmarks represent chunk sizes used by the writer on the
other end of the Reader passed to ReadFrom.
benchmark old ns/op new ns/op delta
BenchmarkTCPReadFrom/1024-4 4727 4954 +4.80%
BenchmarkTCPReadFrom/2048-4 4389 4301 -2.01%
BenchmarkTCPReadFrom/4096-4 4606 4534 -1.56%
BenchmarkTCPReadFrom/8192-4 5219 4779 -8.43%
BenchmarkTCPReadFrom/16384-4 8708 8008 -8.04%
BenchmarkTCPReadFrom/32768-4 16349 14973 -8.42%
BenchmarkTCPReadFrom/65536-4 35246 27406 -22.24%
BenchmarkTCPReadFrom/131072-4 72920 52382 -28.17%
BenchmarkTCPReadFrom/262144-4 149311 95094 -36.31%
BenchmarkTCPReadFrom/524288-4 306704 181856 -40.71%
BenchmarkTCPReadFrom/1048576-4 674174 357406 -46.99%
benchmark old MB/s new MB/s speedup
BenchmarkTCPReadFrom/1024-4 216.62 206.69 0.95x
BenchmarkTCPReadFrom/2048-4 466.61 476.08 1.02x
BenchmarkTCPReadFrom/4096-4 889.09 903.31 1.02x
BenchmarkTCPReadFrom/8192-4 1569.40 1714.06 1.09x
BenchmarkTCPReadFrom/16384-4 1881.42 2045.84 1.09x
BenchmarkTCPReadFrom/32768-4 2004.18 2188.41 1.09x
BenchmarkTCPReadFrom/65536-4 1859.38 2391.25 1.29x
BenchmarkTCPReadFrom/131072-4 1797.46 2502.21 1.39x
BenchmarkTCPReadFrom/262144-4 1755.69 2756.68 1.57x
BenchmarkTCPReadFrom/524288-4 1709.42 2882.98 1.69x
BenchmarkTCPReadFrom/1048576-4 1555.35 2933.84 1.89x
Fixes #10948
Change-Id: I3ce27f21f7adda8b696afdc48a91149998ae16a5
Reviewed-on: https://go-review.googlesource.com/107715
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-04-18 10:56:06 +02:00
|
|
|
}
|
2018-05-21 19:28:19 -07:00
|
|
|
|
|
|
|
|
return fds[0], fds[1], "", nil
|
net: add support for splice(2) in (*TCPConn).ReadFrom on Linux
This change adds support for the splice system call on Linux,
for the purpose of optimizing (*TCPConn).ReadFrom by reducing
copies of data from and to userspace. It does so by creating a
temporary pipe and splicing data from the source connection to the
pipe, then from the pipe to the destination connection. The pipe
serves as an in-kernel buffer for the data transfer.
No new API is added to package net, but a new Splice function is
added to package internal/poll, because using splice requires help
from the network poller. Users of the net package should benefit
from the change transparently.
This change only enables the optimization if the Reader in ReadFrom
is a TCP connection. Since splice is a more general interface, it
could, in theory, also be enabled if the Reader were a unix socket,
or the read half of a pipe.
However, benchmarks show that enabling it for unix sockets is most
likely not a net performance gain. The tcp <- unix case is also
fairly unlikely to be used very much by users of package net.
Enabling the optimization for pipes is also problematic from an
implementation perspective, since package net cannot easily get at
the *poll.FD of an *os.File. A possible solution to this would be
to dup the pipe file descriptor, register the duped descriptor with
the network poller, and work on that *poll.FD instead of the original.
However, this seems too intrusive, so it has not been done. If there
was a clean way to do it, it would probably be worth doing, since
splicing from a pipe to a socket can be done directly.
Therefore, this patch only enables the optimization for what is likely
the most common use case: tcp <- tcp.
The following benchmark compares the performance of the previous
userspace genericReadFrom code path to the new optimized code path.
The sub-benchmarks represent chunk sizes used by the writer on the
other end of the Reader passed to ReadFrom.
benchmark old ns/op new ns/op delta
BenchmarkTCPReadFrom/1024-4 4727 4954 +4.80%
BenchmarkTCPReadFrom/2048-4 4389 4301 -2.01%
BenchmarkTCPReadFrom/4096-4 4606 4534 -1.56%
BenchmarkTCPReadFrom/8192-4 5219 4779 -8.43%
BenchmarkTCPReadFrom/16384-4 8708 8008 -8.04%
BenchmarkTCPReadFrom/32768-4 16349 14973 -8.42%
BenchmarkTCPReadFrom/65536-4 35246 27406 -22.24%
BenchmarkTCPReadFrom/131072-4 72920 52382 -28.17%
BenchmarkTCPReadFrom/262144-4 149311 95094 -36.31%
BenchmarkTCPReadFrom/524288-4 306704 181856 -40.71%
BenchmarkTCPReadFrom/1048576-4 674174 357406 -46.99%
benchmark old MB/s new MB/s speedup
BenchmarkTCPReadFrom/1024-4 216.62 206.69 0.95x
BenchmarkTCPReadFrom/2048-4 466.61 476.08 1.02x
BenchmarkTCPReadFrom/4096-4 889.09 903.31 1.02x
BenchmarkTCPReadFrom/8192-4 1569.40 1714.06 1.09x
BenchmarkTCPReadFrom/16384-4 1881.42 2045.84 1.09x
BenchmarkTCPReadFrom/32768-4 2004.18 2188.41 1.09x
BenchmarkTCPReadFrom/65536-4 1859.38 2391.25 1.29x
BenchmarkTCPReadFrom/131072-4 1797.46 2502.21 1.39x
BenchmarkTCPReadFrom/262144-4 1755.69 2756.68 1.57x
BenchmarkTCPReadFrom/524288-4 1709.42 2882.98 1.69x
BenchmarkTCPReadFrom/1048576-4 1555.35 2933.84 1.89x
Fixes #10948
Change-Id: I3ce27f21f7adda8b696afdc48a91149998ae16a5
Reviewed-on: https://go-review.googlesource.com/107715
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-04-18 10:56:06 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
// destroyTempPipe destroys a temporary pipe.
|
|
|
|
|
func destroyTempPipe(prfd, pwfd int) error {
|
|
|
|
|
err := CloseFunc(prfd)
|
|
|
|
|
err1 := CloseFunc(pwfd)
|
|
|
|
|
if err == nil {
|
|
|
|
|
return err1
|
|
|
|
|
}
|
|
|
|
|
return err
|
|
|
|
|
}
|