go/src/internal/fuzz/coverage.go

109 lines
2.7 KiB
Go
Raw Normal View History

// Copyright 2021 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package fuzz
import (
"internal/unsafeheader"
"math/bits"
"unsafe"
)
// coverage returns a []byte containing unique 8-bit counters for each edge of
// the instrumented source code. This coverage data will only be generated if
// `-d=libfuzzer` is set at build time. This can be used to understand the code
// coverage of a test execution.
func coverage() []byte {
addr := unsafe.Pointer(&_counters)
size := uintptr(unsafe.Pointer(&_ecounters)) - uintptr(addr)
var res []byte
*(*unsafeheader.Slice)(unsafe.Pointer(&res)) = unsafeheader.Slice{
Data: addr,
Len: int(size),
Cap: int(size),
}
return res
}
[dev.fuzz] internal/fuzz: move coverage capture closer to function When instrumented packages intersect with the packages used by the testing or internal/fuzz packages the coverage counters become noisier, as counters will be triggered by non-fuzzed harness code. Ideally counters would be deterministic, as there are many advanced fuzzing strategies that require mutating the input while maintaining static coverage. The simplest way to mitigate this noise is to capture the coverage counters as closely as possible to the invocation of the fuzz target in the testing package. In order to do this add a new function which captures the current values of the counters, SnapshotCoverage. This function copies the current counters into a static buffer, coverageSnapshot, which workerServer.fuzz can then inspect when it comes time to check if new coverage has been found. This method is not foolproof. As the fuzz target is called in a goroutine, harness code can still cause counters to be incremented while the target is being executed. Despite this we do see significant reduction in churn via this approach. For example, running a basic target that causes strconv to be instrumented for 500,000 iterations causes ~800 unique sets of coverage counters, whereas by capturing the counters closer to the target we get ~40 unique sets. It may be possible to make counters completely deterministic, but likely this would require rewriting testing/F.Fuzz to not use tRunner in a goroutine, and instead use it in a blocking manner (which I couldn't figure out an obvious way to do), or by doing something even more complex. Change-Id: I95c2f3b1d7089c3e6885fc7628a0d3a8ac1a99cf Reviewed-on: https://go-review.googlesource.com/c/go/+/320329 Trust: Roland Shoemaker <roland@golang.org> Trust: Katie Hockman <katie@golang.org> Reviewed-by: Jay Conrod <jayconrod@google.com> Reviewed-by: Katie Hockman <katie@golang.org>
2021-05-15 18:46:05 -07:00
// ResetCovereage sets all of the counters for each edge of the instrumented
[dev.fuzz] internal/fuzz: use coverage instrumentation while fuzzing This change updates the go command behavior when fuzzing to instrument the binary for code coverage, and uses this coverage in the fuzzing engine to determine if an input is interesting. Unfortunately, we can't store and use the coverage data for a given run of `go test` and re-use it the next time we fuzz, since the edges could have changed between builds. Instead, every entry in the seed corpus and the on-disk corpus is run by the workers before fuzzing begins, so that the coordinator can get the baseline coverage for what the fuzzing engine has already found (or what the developers have already provided). Users should run `go clean -fuzzcache` before using this change, to clear out any existing "interesting" values that were in the cache. Previously, every single non-crashing input was written to the on-disk corpus. Now, only inputs that actually expand coverage are written. This change includes a small hack in cmd/go/internal/load/pkg.go which ensures that the Gcflags that were explicitly set in cmd/go/internal/test/test.go don't get cleared out. Tests will be added in a follow-up change, since they will be a bit more involved. Change-Id: Ie659222d44475c6d68fa4a35d37c37cab3619d71 Reviewed-on: https://go-review.googlesource.com/c/go/+/312009 Trust: Katie Hockman <katie@golang.org> Run-TryBot: Katie Hockman <katie@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Jay Conrod <jayconrod@google.com>
2021-04-20 16:11:13 -04:00
// source code to 0.
[dev.fuzz] internal/fuzz: move coverage capture closer to function When instrumented packages intersect with the packages used by the testing or internal/fuzz packages the coverage counters become noisier, as counters will be triggered by non-fuzzed harness code. Ideally counters would be deterministic, as there are many advanced fuzzing strategies that require mutating the input while maintaining static coverage. The simplest way to mitigate this noise is to capture the coverage counters as closely as possible to the invocation of the fuzz target in the testing package. In order to do this add a new function which captures the current values of the counters, SnapshotCoverage. This function copies the current counters into a static buffer, coverageSnapshot, which workerServer.fuzz can then inspect when it comes time to check if new coverage has been found. This method is not foolproof. As the fuzz target is called in a goroutine, harness code can still cause counters to be incremented while the target is being executed. Despite this we do see significant reduction in churn via this approach. For example, running a basic target that causes strconv to be instrumented for 500,000 iterations causes ~800 unique sets of coverage counters, whereas by capturing the counters closer to the target we get ~40 unique sets. It may be possible to make counters completely deterministic, but likely this would require rewriting testing/F.Fuzz to not use tRunner in a goroutine, and instead use it in a blocking manner (which I couldn't figure out an obvious way to do), or by doing something even more complex. Change-Id: I95c2f3b1d7089c3e6885fc7628a0d3a8ac1a99cf Reviewed-on: https://go-review.googlesource.com/c/go/+/320329 Trust: Roland Shoemaker <roland@golang.org> Trust: Katie Hockman <katie@golang.org> Reviewed-by: Jay Conrod <jayconrod@google.com> Reviewed-by: Katie Hockman <katie@golang.org>
2021-05-15 18:46:05 -07:00
func ResetCoverage() {
[dev.fuzz] internal/fuzz: use coverage instrumentation while fuzzing This change updates the go command behavior when fuzzing to instrument the binary for code coverage, and uses this coverage in the fuzzing engine to determine if an input is interesting. Unfortunately, we can't store and use the coverage data for a given run of `go test` and re-use it the next time we fuzz, since the edges could have changed between builds. Instead, every entry in the seed corpus and the on-disk corpus is run by the workers before fuzzing begins, so that the coordinator can get the baseline coverage for what the fuzzing engine has already found (or what the developers have already provided). Users should run `go clean -fuzzcache` before using this change, to clear out any existing "interesting" values that were in the cache. Previously, every single non-crashing input was written to the on-disk corpus. Now, only inputs that actually expand coverage are written. This change includes a small hack in cmd/go/internal/load/pkg.go which ensures that the Gcflags that were explicitly set in cmd/go/internal/test/test.go don't get cleared out. Tests will be added in a follow-up change, since they will be a bit more involved. Change-Id: Ie659222d44475c6d68fa4a35d37c37cab3619d71 Reviewed-on: https://go-review.googlesource.com/c/go/+/312009 Trust: Katie Hockman <katie@golang.org> Run-TryBot: Katie Hockman <katie@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Jay Conrod <jayconrod@google.com>
2021-04-20 16:11:13 -04:00
cov := coverage()
for i := range cov {
cov[i] = 0
}
}
[dev.fuzz] internal/fuzz: move coverage capture closer to function When instrumented packages intersect with the packages used by the testing or internal/fuzz packages the coverage counters become noisier, as counters will be triggered by non-fuzzed harness code. Ideally counters would be deterministic, as there are many advanced fuzzing strategies that require mutating the input while maintaining static coverage. The simplest way to mitigate this noise is to capture the coverage counters as closely as possible to the invocation of the fuzz target in the testing package. In order to do this add a new function which captures the current values of the counters, SnapshotCoverage. This function copies the current counters into a static buffer, coverageSnapshot, which workerServer.fuzz can then inspect when it comes time to check if new coverage has been found. This method is not foolproof. As the fuzz target is called in a goroutine, harness code can still cause counters to be incremented while the target is being executed. Despite this we do see significant reduction in churn via this approach. For example, running a basic target that causes strconv to be instrumented for 500,000 iterations causes ~800 unique sets of coverage counters, whereas by capturing the counters closer to the target we get ~40 unique sets. It may be possible to make counters completely deterministic, but likely this would require rewriting testing/F.Fuzz to not use tRunner in a goroutine, and instead use it in a blocking manner (which I couldn't figure out an obvious way to do), or by doing something even more complex. Change-Id: I95c2f3b1d7089c3e6885fc7628a0d3a8ac1a99cf Reviewed-on: https://go-review.googlesource.com/c/go/+/320329 Trust: Roland Shoemaker <roland@golang.org> Trust: Katie Hockman <katie@golang.org> Reviewed-by: Jay Conrod <jayconrod@google.com> Reviewed-by: Katie Hockman <katie@golang.org>
2021-05-15 18:46:05 -07:00
// SnapshotCoverage copies the current counter values into coverageSnapshot,
// preserving them for later inspection. SnapshotCoverage also rounds each
// counter down to the nearest power of two. This lets the coordinator store
// multiple values for each counter by OR'ing them together.
[dev.fuzz] internal/fuzz: move coverage capture closer to function When instrumented packages intersect with the packages used by the testing or internal/fuzz packages the coverage counters become noisier, as counters will be triggered by non-fuzzed harness code. Ideally counters would be deterministic, as there are many advanced fuzzing strategies that require mutating the input while maintaining static coverage. The simplest way to mitigate this noise is to capture the coverage counters as closely as possible to the invocation of the fuzz target in the testing package. In order to do this add a new function which captures the current values of the counters, SnapshotCoverage. This function copies the current counters into a static buffer, coverageSnapshot, which workerServer.fuzz can then inspect when it comes time to check if new coverage has been found. This method is not foolproof. As the fuzz target is called in a goroutine, harness code can still cause counters to be incremented while the target is being executed. Despite this we do see significant reduction in churn via this approach. For example, running a basic target that causes strconv to be instrumented for 500,000 iterations causes ~800 unique sets of coverage counters, whereas by capturing the counters closer to the target we get ~40 unique sets. It may be possible to make counters completely deterministic, but likely this would require rewriting testing/F.Fuzz to not use tRunner in a goroutine, and instead use it in a blocking manner (which I couldn't figure out an obvious way to do), or by doing something even more complex. Change-Id: I95c2f3b1d7089c3e6885fc7628a0d3a8ac1a99cf Reviewed-on: https://go-review.googlesource.com/c/go/+/320329 Trust: Roland Shoemaker <roland@golang.org> Trust: Katie Hockman <katie@golang.org> Reviewed-by: Jay Conrod <jayconrod@google.com> Reviewed-by: Katie Hockman <katie@golang.org>
2021-05-15 18:46:05 -07:00
func SnapshotCoverage() {
cov := coverage()
for i, b := range cov {
b |= b >> 1
b |= b >> 2
b |= b >> 4
b -= b >> 1
coverageSnapshot[i] = b
[dev.fuzz] internal/fuzz: move coverage capture closer to function When instrumented packages intersect with the packages used by the testing or internal/fuzz packages the coverage counters become noisier, as counters will be triggered by non-fuzzed harness code. Ideally counters would be deterministic, as there are many advanced fuzzing strategies that require mutating the input while maintaining static coverage. The simplest way to mitigate this noise is to capture the coverage counters as closely as possible to the invocation of the fuzz target in the testing package. In order to do this add a new function which captures the current values of the counters, SnapshotCoverage. This function copies the current counters into a static buffer, coverageSnapshot, which workerServer.fuzz can then inspect when it comes time to check if new coverage has been found. This method is not foolproof. As the fuzz target is called in a goroutine, harness code can still cause counters to be incremented while the target is being executed. Despite this we do see significant reduction in churn via this approach. For example, running a basic target that causes strconv to be instrumented for 500,000 iterations causes ~800 unique sets of coverage counters, whereas by capturing the counters closer to the target we get ~40 unique sets. It may be possible to make counters completely deterministic, but likely this would require rewriting testing/F.Fuzz to not use tRunner in a goroutine, and instead use it in a blocking manner (which I couldn't figure out an obvious way to do), or by doing something even more complex. Change-Id: I95c2f3b1d7089c3e6885fc7628a0d3a8ac1a99cf Reviewed-on: https://go-review.googlesource.com/c/go/+/320329 Trust: Roland Shoemaker <roland@golang.org> Trust: Katie Hockman <katie@golang.org> Reviewed-by: Jay Conrod <jayconrod@google.com> Reviewed-by: Katie Hockman <katie@golang.org>
2021-05-15 18:46:05 -07:00
}
}
// diffCoverage returns a set of bits set in snapshot but not in base.
// If there are no new bits set, diffCoverage returns nil.
func diffCoverage(base, snapshot []byte) []byte {
found := false
for i := range snapshot {
if snapshot[i]&^base[i] != 0 {
found = true
break
}
}
if !found {
return nil
}
diff := make([]byte, len(snapshot))
for i := range diff {
diff[i] = snapshot[i] &^ base[i]
}
return diff
}
// countNewCoverageBits returns the number of bits set in snapshot that are not
// set in base.
func countNewCoverageBits(base, snapshot []byte) int {
n := 0
for i := range snapshot {
n += bits.OnesCount8(snapshot[i] &^ base[i])
}
return n
}
// hasCoverageBit returns true if snapshot has at least one bit set that is
// also set in base.
func hasCoverageBit(base, snapshot []byte) bool {
for i := range snapshot {
if snapshot[i]&base[i] != 0 {
return true
}
}
return false
}
func countBits(cov []byte) int {
n := 0
for _, c := range cov {
n += bits.OnesCount8(c)
}
return n
}
var coverageSnapshot = make([]byte, len(coverage()))
[dev.fuzz] internal/fuzz: move coverage capture closer to function When instrumented packages intersect with the packages used by the testing or internal/fuzz packages the coverage counters become noisier, as counters will be triggered by non-fuzzed harness code. Ideally counters would be deterministic, as there are many advanced fuzzing strategies that require mutating the input while maintaining static coverage. The simplest way to mitigate this noise is to capture the coverage counters as closely as possible to the invocation of the fuzz target in the testing package. In order to do this add a new function which captures the current values of the counters, SnapshotCoverage. This function copies the current counters into a static buffer, coverageSnapshot, which workerServer.fuzz can then inspect when it comes time to check if new coverage has been found. This method is not foolproof. As the fuzz target is called in a goroutine, harness code can still cause counters to be incremented while the target is being executed. Despite this we do see significant reduction in churn via this approach. For example, running a basic target that causes strconv to be instrumented for 500,000 iterations causes ~800 unique sets of coverage counters, whereas by capturing the counters closer to the target we get ~40 unique sets. It may be possible to make counters completely deterministic, but likely this would require rewriting testing/F.Fuzz to not use tRunner in a goroutine, and instead use it in a blocking manner (which I couldn't figure out an obvious way to do), or by doing something even more complex. Change-Id: I95c2f3b1d7089c3e6885fc7628a0d3a8ac1a99cf Reviewed-on: https://go-review.googlesource.com/c/go/+/320329 Trust: Roland Shoemaker <roland@golang.org> Trust: Katie Hockman <katie@golang.org> Reviewed-by: Jay Conrod <jayconrod@google.com> Reviewed-by: Katie Hockman <katie@golang.org>
2021-05-15 18:46:05 -07:00
// _counters and _ecounters mark the start and end, respectively, of where
// the 8-bit coverage counters reside in memory. They're known to cmd/link,
// which specially assigns their addresses for this purpose.
var _counters, _ecounters [0]byte