go/src/cmd/internal/dwarf/dwarf.go

1515 lines
39 KiB
Go
Raw Normal View History

// Copyright 2016 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// Package dwarf generates DWARF debugging information.
// DWARF generation is split between the compiler and the linker,
// this package contains the shared code.
package dwarf
cmd/internal/dwarf: remove global encbuf The global encbuf helped avoid allocations. It is incompatible with a concurrent backend. To avoid a performance regression while removing it, introduce two optimizations. First, re-use a buffer in dwarf.PutFunc. Second, avoid a buffer entirely when the int being encoded fits in seven bits, which is about 75% of the time. Passes toolstash-check. Updates #15756 name old alloc/op new alloc/op delta Template 40.6MB ± 0% 40.6MB ± 0% -0.08% (p=0.001 n=8+9) Unicode 29.9MB ± 0% 29.9MB ± 0% ~ (p=0.068 n=8+10) GoTypes 116MB ± 0% 116MB ± 0% +0.05% (p=0.043 n=10+9) SSA 864MB ± 0% 864MB ± 0% +0.01% (p=0.010 n=10+9) Flate 25.8MB ± 0% 25.8MB ± 0% ~ (p=0.353 n=10+10) GoParser 32.2MB ± 0% 32.2MB ± 0% ~ (p=0.353 n=10+10) Reflect 80.2MB ± 0% 80.2MB ± 0% ~ (p=0.165 n=10+10) Tar 27.0MB ± 0% 26.9MB ± 0% ~ (p=0.143 n=10+10) XML 42.8MB ± 0% 42.8MB ± 0% ~ (p=0.400 n=10+9) name old allocs/op new allocs/op delta Template 398k ± 0% 397k ± 0% -0.20% (p=0.002 n=8+9) Unicode 320k ± 0% 321k ± 1% ~ (p=0.122 n=8+10) GoTypes 1.16M ± 0% 1.17M ± 0% ~ (p=0.053 n=10+9) SSA 7.65M ± 0% 7.65M ± 0% ~ (p=0.122 n=10+8) Flate 240k ± 1% 240k ± 1% ~ (p=0.243 n=10+9) GoParser 322k ± 1% 322k ± 1% ~ (p=0.481 n=10+10) Reflect 1.00M ± 0% 1.00M ± 0% ~ (p=0.211 n=9+10) Tar 256k ± 0% 255k ± 1% ~ (p=0.052 n=10+10) XML 400k ± 1% 400k ± 0% ~ (p=0.631 n=10+10) Change-Id: Ia39d9de09232fdbfc9c9cec14587bbf6939c9492 Reviewed-on: https://go-review.googlesource.com/38713 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-26 15:16:11 -07:00
import (
"errors"
cmd/internal/dwarf: remove global encbuf The global encbuf helped avoid allocations. It is incompatible with a concurrent backend. To avoid a performance regression while removing it, introduce two optimizations. First, re-use a buffer in dwarf.PutFunc. Second, avoid a buffer entirely when the int being encoded fits in seven bits, which is about 75% of the time. Passes toolstash-check. Updates #15756 name old alloc/op new alloc/op delta Template 40.6MB ± 0% 40.6MB ± 0% -0.08% (p=0.001 n=8+9) Unicode 29.9MB ± 0% 29.9MB ± 0% ~ (p=0.068 n=8+10) GoTypes 116MB ± 0% 116MB ± 0% +0.05% (p=0.043 n=10+9) SSA 864MB ± 0% 864MB ± 0% +0.01% (p=0.010 n=10+9) Flate 25.8MB ± 0% 25.8MB ± 0% ~ (p=0.353 n=10+10) GoParser 32.2MB ± 0% 32.2MB ± 0% ~ (p=0.353 n=10+10) Reflect 80.2MB ± 0% 80.2MB ± 0% ~ (p=0.165 n=10+10) Tar 27.0MB ± 0% 26.9MB ± 0% ~ (p=0.143 n=10+10) XML 42.8MB ± 0% 42.8MB ± 0% ~ (p=0.400 n=10+9) name old allocs/op new allocs/op delta Template 398k ± 0% 397k ± 0% -0.20% (p=0.002 n=8+9) Unicode 320k ± 0% 321k ± 1% ~ (p=0.122 n=8+10) GoTypes 1.16M ± 0% 1.17M ± 0% ~ (p=0.053 n=10+9) SSA 7.65M ± 0% 7.65M ± 0% ~ (p=0.122 n=10+8) Flate 240k ± 1% 240k ± 1% ~ (p=0.243 n=10+9) GoParser 322k ± 1% 322k ± 1% ~ (p=0.481 n=10+10) Reflect 1.00M ± 0% 1.00M ± 0% ~ (p=0.211 n=9+10) Tar 256k ± 0% 255k ± 1% ~ (p=0.052 n=10+10) XML 400k ± 1% 400k ± 0% ~ (p=0.631 n=10+10) Change-Id: Ia39d9de09232fdbfc9c9cec14587bbf6939c9492 Reviewed-on: https://go-review.googlesource.com/38713 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-26 15:16:11 -07:00
"fmt"
"sort"
"strings"
cmd/internal/dwarf: remove global encbuf The global encbuf helped avoid allocations. It is incompatible with a concurrent backend. To avoid a performance regression while removing it, introduce two optimizations. First, re-use a buffer in dwarf.PutFunc. Second, avoid a buffer entirely when the int being encoded fits in seven bits, which is about 75% of the time. Passes toolstash-check. Updates #15756 name old alloc/op new alloc/op delta Template 40.6MB ± 0% 40.6MB ± 0% -0.08% (p=0.001 n=8+9) Unicode 29.9MB ± 0% 29.9MB ± 0% ~ (p=0.068 n=8+10) GoTypes 116MB ± 0% 116MB ± 0% +0.05% (p=0.043 n=10+9) SSA 864MB ± 0% 864MB ± 0% +0.01% (p=0.010 n=10+9) Flate 25.8MB ± 0% 25.8MB ± 0% ~ (p=0.353 n=10+10) GoParser 32.2MB ± 0% 32.2MB ± 0% ~ (p=0.353 n=10+10) Reflect 80.2MB ± 0% 80.2MB ± 0% ~ (p=0.165 n=10+10) Tar 27.0MB ± 0% 26.9MB ± 0% ~ (p=0.143 n=10+10) XML 42.8MB ± 0% 42.8MB ± 0% ~ (p=0.400 n=10+9) name old allocs/op new allocs/op delta Template 398k ± 0% 397k ± 0% -0.20% (p=0.002 n=8+9) Unicode 320k ± 0% 321k ± 1% ~ (p=0.122 n=8+10) GoTypes 1.16M ± 0% 1.17M ± 0% ~ (p=0.053 n=10+9) SSA 7.65M ± 0% 7.65M ± 0% ~ (p=0.122 n=10+8) Flate 240k ± 1% 240k ± 1% ~ (p=0.243 n=10+9) GoParser 322k ± 1% 322k ± 1% ~ (p=0.481 n=10+10) Reflect 1.00M ± 0% 1.00M ± 0% ~ (p=0.211 n=9+10) Tar 256k ± 0% 255k ± 1% ~ (p=0.052 n=10+10) XML 400k ± 1% 400k ± 0% ~ (p=0.631 n=10+10) Change-Id: Ia39d9de09232fdbfc9c9cec14587bbf6939c9492 Reviewed-on: https://go-review.googlesource.com/38713 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-26 15:16:11 -07:00
)
// InfoPrefix is the prefix for all the symbols containing DWARF info entries.
const InfoPrefix = "go.info."
[dev.debug] cmd/compile: better DWARF with optimizations on Debuggers use DWARF information to find local variables on the stack and in registers. Prior to this CL, the DWARF information for functions claimed that all variables were on the stack at all times. That's incorrect when optimizations are enabled, and results in debuggers showing data that is out of date or complete gibberish. After this CL, the compiler is capable of representing variable locations more accurately, and attempts to do so. Due to limitations of the SSA backend, it's not possible to be completely correct. There are a number of problems in the current design. One of the easier to understand is that variable names currently must be attached to an SSA value, but not all assignments in the source code actually result in machine code. For example: type myint int var a int b := myint(int) and b := (*uint64)(unsafe.Pointer(a)) don't generate machine code because the underlying representation is the same, so the correct value of b will not be set when the user would expect. Generating the more precise debug information is behind a flag, dwarflocationlists. Because of the issues described above, setting the flag may not make the debugging experience much better, and may actually make it worse in cases where the variable actually is on the stack and the more complicated analysis doesn't realize it. A number of changes are included: - Add a new pseudo-instruction, RegKill, which indicates that the value in the register has been clobbered. - Adjust regalloc to emit RegKills in the right places. Significantly, this means that phis are mixed with StoreReg and RegKills after regalloc. - Track variable decomposition in ssa.LocalSlots. - After the SSA backend is done, analyze the result and build location lists for each LocalSlot. - After assembly is done, update the location lists with the assembled PC offsets, recompose variables, and build DWARF location lists. Emit the list as a new linker symbol, one per function. - In the linker, aggregate the location lists into a .debug_loc section. TODO: - currently disabled for non-X86/AMD64 because there are no data tables. go build -toolexec 'toolstash -cmp' -a std succeeds. With -dwarflocationlists false: before: f02812195637909ff675782c0b46836a8ff01976 after: 06f61e8112a42ac34fb80e0c818b3cdb84a5e7ec benchstat -geomean /tmp/220352263 /tmp/621364410 completed 15 of 15, estimated time remaining 0s (eta 3:52PM) name old time/op new time/op delta Template 199ms ± 3% 198ms ± 2% ~ (p=0.400 n=15+14) Unicode 96.6ms ± 5% 96.4ms ± 5% ~ (p=0.838 n=15+15) GoTypes 653ms ± 2% 647ms ± 2% ~ (p=0.102 n=15+14) Flate 133ms ± 6% 129ms ± 3% -2.62% (p=0.041 n=15+15) GoParser 164ms ± 5% 159ms ± 3% -3.05% (p=0.000 n=15+15) Reflect 428ms ± 4% 422ms ± 3% ~ (p=0.156 n=15+13) Tar 123ms ±10% 124ms ± 8% ~ (p=0.461 n=15+15) XML 228ms ± 3% 224ms ± 3% -1.57% (p=0.045 n=15+15) [Geo mean] 206ms 377ms +82.86% name old user-time/op new user-time/op delta Template 292ms ±10% 301ms ±12% ~ (p=0.189 n=15+15) Unicode 166ms ±37% 158ms ±14% ~ (p=0.418 n=15+14) GoTypes 962ms ± 6% 963ms ± 7% ~ (p=0.976 n=15+15) Flate 207ms ±19% 200ms ±14% ~ (p=0.345 n=14+15) GoParser 246ms ±22% 240ms ±15% ~ (p=0.587 n=15+15) Reflect 611ms ±13% 587ms ±14% ~ (p=0.085 n=15+13) Tar 211ms ±12% 217ms ±14% ~ (p=0.355 n=14+15) XML 335ms ±15% 320ms ±18% ~ (p=0.169 n=15+15) [Geo mean] 317ms 583ms +83.72% name old alloc/op new alloc/op delta Template 40.2MB ± 0% 40.2MB ± 0% -0.15% (p=0.000 n=14+15) Unicode 29.2MB ± 0% 29.3MB ± 0% ~ (p=0.624 n=15+15) GoTypes 114MB ± 0% 114MB ± 0% -0.15% (p=0.000 n=15+14) Flate 25.7MB ± 0% 25.6MB ± 0% -0.18% (p=0.000 n=13+15) GoParser 32.2MB ± 0% 32.2MB ± 0% -0.14% (p=0.003 n=15+15) Reflect 77.8MB ± 0% 77.9MB ± 0% ~ (p=0.061 n=15+15) Tar 27.1MB ± 0% 27.0MB ± 0% -0.11% (p=0.029 n=15+15) XML 42.7MB ± 0% 42.5MB ± 0% -0.29% (p=0.000 n=15+15) [Geo mean] 42.1MB 75.0MB +78.05% name old allocs/op new allocs/op delta Template 402k ± 1% 398k ± 0% -0.91% (p=0.000 n=15+15) Unicode 344k ± 1% 344k ± 0% ~ (p=0.715 n=15+14) GoTypes 1.18M ± 0% 1.17M ± 0% -0.91% (p=0.000 n=15+14) Flate 243k ± 0% 240k ± 1% -1.05% (p=0.000 n=13+15) GoParser 327k ± 1% 324k ± 1% -0.96% (p=0.000 n=15+15) Reflect 984k ± 1% 982k ± 0% ~ (p=0.050 n=15+15) Tar 261k ± 1% 259k ± 1% -0.77% (p=0.000 n=15+15) XML 411k ± 0% 404k ± 1% -1.55% (p=0.000 n=15+15) [Geo mean] 439k 755k +72.01% name old text-bytes new text-bytes delta HelloSize 694kB ± 0% 694kB ± 0% -0.00% (p=0.000 n=15+15) name old data-bytes new data-bytes delta HelloSize 5.55kB ± 0% 5.55kB ± 0% ~ (all equal) name old bss-bytes new bss-bytes delta HelloSize 133kB ± 0% 133kB ± 0% ~ (all equal) name old exe-bytes new exe-bytes delta HelloSize 1.04MB ± 0% 1.04MB ± 0% ~ (all equal) Change-Id: I991fc553ef175db46bb23b2128317bbd48de70d8 Reviewed-on: https://go-review.googlesource.com/41770 Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-07-21 18:30:19 -04:00
// RangePrefix is the prefix for all the symbols containing DWARF location lists.
const LocPrefix = "go.loc."
// RangePrefix is the prefix for all the symbols containing DWARF range lists.
const RangePrefix = "go.range."
// IsStmtPrefix is the prefix for all the symbols containing DWARF is_stmt info for the line number table.
const IsStmtPrefix = "go.isstmt."
// ConstInfoPrefix is the prefix for all symbols containing DWARF info
// entries that contain constants.
const ConstInfoPrefix = "go.constinfo."
// CUInfoPrefix is the prefix for symbols containing information to
// populate the DWARF compilation unit info entries.
const CUInfoPrefix = "go.cuinfo."
// Used to form the symbol name assigned to the DWARF 'abstract subprogram"
// info entry for a function
const AbstractFuncSuffix = "$abstract"
// Controls logging/debugging for selected aspects of DWARF subprogram
// generation (functions, scopes).
var logDwarf bool
// Sym represents a symbol.
type Sym interface {
Len() int64
}
// A Var represents a local variable or a function parameter.
type Var struct {
Name string
Abbrev int // Either DW_ABRV_AUTO[_LOCLIST] or DW_ABRV_PARAM[_LOCLIST]
IsReturnValue bool
IsInlFormal bool
StackOffset int32
cmd/compile: reimplement location list generation Completely redesign and reimplement location list generation to be more efficient, and hopefully not too hard to understand. RegKills are gone. Instead of using the regalloc's liveness calculations, redo them using the Ops' clobber information. Besides saving a lot of Values, this avoids adding RegKills to blocks that would be empty otherwise, which was messing up optimizations. This does mean that it's much harder to tell whether the generation process is buggy (there's nothing to cross-check it with), and there may be disagreements with GC liveness. But the performance gain is significant, and it's nice not to be messing with earlier compiler phases. The intermediate representations are gone. Instead of producing ssa.BlockDebugs, then dwarf.LocationLists, and then finally real location lists, go directly from the SSA to a (mostly) real location list. Because the SSA analysis happens before assembly, it stores encoded block/value IDs where PCs would normally go. It would be easier to do the SSA analysis after assembly, but I didn't want to retain the SSA just for that. Generation proceeds in two phases: first, it traverses the function in CFG order, storing the state of the block at the beginning and end. End states are used to produce the start states of the successor blocks. In the second phase, it traverses in program text order and produces the location lists. The processing in the second phase is redundant, but much cheaper than storing the intermediate representation. It might be possible to combine the two phases somewhat to take advantage of cases where the CFG matches the block layout, but I haven't tried. Location lists are finalized by adding a base address selection entry, translating each encoded block/value ID to a real PC, and adding the terminating zero entry. This probably won't work on OSX, where dsymutil will choke on the base address selection. I tried emitting CU-relative relocations for each address, and it was *very* bad for performance -- it uses more memory storing all the relocations than it does for the actual location list bytes. I think I'm going to end up synthesizing the relocations in the linker only on OSX, but TBD. TestNexting needs updating: with more optimizations working, the debugger doesn't stop on the continue (line 88) any more, and the test's duplicate suppression kicks in. Also, dx and dy live a little longer now, but they have the correct values. Change-Id: Ie772dfe23a4e389ca573624fac4d05401ae32307 Reviewed-on: https://go-review.googlesource.com/89356 Run-TryBot: Heschi Kreinick <heschi@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
2017-10-26 15:40:17 -04:00
// This package can't use the ssa package, so it can't mention ssa.FuncDebug,
// so indirect through a closure.
PutLocationList func(listSym, startPC Sym)
Scope int32
Type Sym
DeclFile string
DeclLine uint
DeclCol uint
InlIndex int32 // subtract 1 to form real index into InlTree
ChildIndex int32 // child DIE index in abstract function
IsInAbstract bool // variable exists in abstract function
}
// A Scope represents a lexical scope. All variables declared within a
// scope will only be visible to instructions covered by the scope.
// Lexical scopes are contiguous in source files but can end up being
// compiled to discontiguous blocks of instructions in the executable.
// The Ranges field lists all the blocks of instructions that belong
// in this scope.
type Scope struct {
Parent int32
Ranges []Range
Vars []*Var
}
// A Range represents a half-open interval [Start, End).
type Range struct {
Start, End int64
}
// This container is used by the PutFunc* variants below when
// creating the DWARF subprogram DIE(s) for a function.
type FnState struct {
Name string
Importpath string
Info Sym
Filesym Sym
Loc Sym
Ranges Sym
Absfn Sym
StartPC Sym
Size int64
External bool
Scopes []Scope
InlCalls InlCalls
}
func EnableLogging(doit bool) {
logDwarf = doit
}
// UnifyRanges merges the list of ranges of c into the list of ranges of s
func (s *Scope) UnifyRanges(c *Scope) {
out := make([]Range, 0, len(s.Ranges)+len(c.Ranges))
i, j := 0, 0
for {
var cur Range
if i < len(s.Ranges) && j < len(c.Ranges) {
if s.Ranges[i].Start < c.Ranges[j].Start {
cur = s.Ranges[i]
i++
} else {
cur = c.Ranges[j]
j++
}
} else if i < len(s.Ranges) {
cur = s.Ranges[i]
i++
} else if j < len(c.Ranges) {
cur = c.Ranges[j]
j++
} else {
break
}
if n := len(out); n > 0 && cur.Start <= out[n-1].End {
out[n-1].End = cur.End
} else {
out = append(out, cur)
}
}
s.Ranges = out
}
type InlCalls struct {
Calls []InlCall
}
type InlCall struct {
// index into ctx.InlTree describing the call inlined here
InlIndex int
// Symbol of file containing inlined call site (really *obj.LSym).
CallFile Sym
// Line number of inlined call site.
CallLine uint32
// Dwarf abstract subroutine symbol (really *obj.LSym).
AbsFunSym Sym
// Indices of child inlines within Calls array above.
Children []int
// entries in this list are PAUTO's created by the inliner to
// capture the promoted formals and locals of the inlined callee.
InlVars []*Var
// PC ranges for this inlined call.
Ranges []Range
// Root call (not a child of some other call).
Root bool
}
// A Context specifies how to add data to a Sym.
type Context interface {
PtrSize() int
AddInt(s Sym, size int, i int64)
AddBytes(s Sym, b []byte)
AddAddress(s Sym, t interface{}, ofs int64)
AddSectionOffset(s Sym, size int, t interface{}, ofs int64)
AddDWARFSectionOffset(s Sym, size int, t interface{}, ofs int64)
CurrentOffset(s Sym) int64
RecordDclReference(from Sym, to Sym, dclIdx int, inlIndex int)
RecordChildDieOffsets(s Sym, vars []*Var, offsets []int32)
AddString(s Sym, v string)
AddFileRef(s Sym, f interface{})
Logf(format string, args ...interface{})
}
// AppendUleb128 appends v to b using DWARF's unsigned LEB128 encoding.
func AppendUleb128(b []byte, v uint64) []byte {
for {
c := uint8(v & 0x7f)
v >>= 7
if v != 0 {
c |= 0x80
}
b = append(b, c)
if c&0x80 == 0 {
break
}
}
return b
}
// AppendSleb128 appends v to b using DWARF's signed LEB128 encoding.
func AppendSleb128(b []byte, v int64) []byte {
for {
c := uint8(v & 0x7f)
s := uint8(v & 0x40)
v >>= 7
if (v != -1 || s == 0) && (v != 0 || s != 0) {
c |= 0x80
}
b = append(b, c)
if c&0x80 == 0 {
break
}
}
return b
}
cmd/internal/dwarf: remove global encbuf The global encbuf helped avoid allocations. It is incompatible with a concurrent backend. To avoid a performance regression while removing it, introduce two optimizations. First, re-use a buffer in dwarf.PutFunc. Second, avoid a buffer entirely when the int being encoded fits in seven bits, which is about 75% of the time. Passes toolstash-check. Updates #15756 name old alloc/op new alloc/op delta Template 40.6MB ± 0% 40.6MB ± 0% -0.08% (p=0.001 n=8+9) Unicode 29.9MB ± 0% 29.9MB ± 0% ~ (p=0.068 n=8+10) GoTypes 116MB ± 0% 116MB ± 0% +0.05% (p=0.043 n=10+9) SSA 864MB ± 0% 864MB ± 0% +0.01% (p=0.010 n=10+9) Flate 25.8MB ± 0% 25.8MB ± 0% ~ (p=0.353 n=10+10) GoParser 32.2MB ± 0% 32.2MB ± 0% ~ (p=0.353 n=10+10) Reflect 80.2MB ± 0% 80.2MB ± 0% ~ (p=0.165 n=10+10) Tar 27.0MB ± 0% 26.9MB ± 0% ~ (p=0.143 n=10+10) XML 42.8MB ± 0% 42.8MB ± 0% ~ (p=0.400 n=10+9) name old allocs/op new allocs/op delta Template 398k ± 0% 397k ± 0% -0.20% (p=0.002 n=8+9) Unicode 320k ± 0% 321k ± 1% ~ (p=0.122 n=8+10) GoTypes 1.16M ± 0% 1.17M ± 0% ~ (p=0.053 n=10+9) SSA 7.65M ± 0% 7.65M ± 0% ~ (p=0.122 n=10+8) Flate 240k ± 1% 240k ± 1% ~ (p=0.243 n=10+9) GoParser 322k ± 1% 322k ± 1% ~ (p=0.481 n=10+10) Reflect 1.00M ± 0% 1.00M ± 0% ~ (p=0.211 n=9+10) Tar 256k ± 0% 255k ± 1% ~ (p=0.052 n=10+10) XML 400k ± 1% 400k ± 0% ~ (p=0.631 n=10+10) Change-Id: Ia39d9de09232fdbfc9c9cec14587bbf6939c9492 Reviewed-on: https://go-review.googlesource.com/38713 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-26 15:16:11 -07:00
// sevenbits contains all unsigned seven bit numbers, indexed by their value.
var sevenbits = [...]byte{
0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f,
0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17, 0x18, 0x19, 0x1a, 0x1b, 0x1c, 0x1d, 0x1e, 0x1f,
0x20, 0x21, 0x22, 0x23, 0x24, 0x25, 0x26, 0x27, 0x28, 0x29, 0x2a, 0x2b, 0x2c, 0x2d, 0x2e, 0x2f,
0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37, 0x38, 0x39, 0x3a, 0x3b, 0x3c, 0x3d, 0x3e, 0x3f,
0x40, 0x41, 0x42, 0x43, 0x44, 0x45, 0x46, 0x47, 0x48, 0x49, 0x4a, 0x4b, 0x4c, 0x4d, 0x4e, 0x4f,
0x50, 0x51, 0x52, 0x53, 0x54, 0x55, 0x56, 0x57, 0x58, 0x59, 0x5a, 0x5b, 0x5c, 0x5d, 0x5e, 0x5f,
0x60, 0x61, 0x62, 0x63, 0x64, 0x65, 0x66, 0x67, 0x68, 0x69, 0x6a, 0x6b, 0x6c, 0x6d, 0x6e, 0x6f,
0x70, 0x71, 0x72, 0x73, 0x74, 0x75, 0x76, 0x77, 0x78, 0x79, 0x7a, 0x7b, 0x7c, 0x7d, 0x7e, 0x7f,
}
// sevenBitU returns the unsigned LEB128 encoding of v if v is seven bits and nil otherwise.
// The contents of the returned slice must not be modified.
func sevenBitU(v int64) []byte {
if uint64(v) < uint64(len(sevenbits)) {
return sevenbits[v : v+1]
}
return nil
}
cmd/internal/dwarf: remove global encbuf The global encbuf helped avoid allocations. It is incompatible with a concurrent backend. To avoid a performance regression while removing it, introduce two optimizations. First, re-use a buffer in dwarf.PutFunc. Second, avoid a buffer entirely when the int being encoded fits in seven bits, which is about 75% of the time. Passes toolstash-check. Updates #15756 name old alloc/op new alloc/op delta Template 40.6MB ± 0% 40.6MB ± 0% -0.08% (p=0.001 n=8+9) Unicode 29.9MB ± 0% 29.9MB ± 0% ~ (p=0.068 n=8+10) GoTypes 116MB ± 0% 116MB ± 0% +0.05% (p=0.043 n=10+9) SSA 864MB ± 0% 864MB ± 0% +0.01% (p=0.010 n=10+9) Flate 25.8MB ± 0% 25.8MB ± 0% ~ (p=0.353 n=10+10) GoParser 32.2MB ± 0% 32.2MB ± 0% ~ (p=0.353 n=10+10) Reflect 80.2MB ± 0% 80.2MB ± 0% ~ (p=0.165 n=10+10) Tar 27.0MB ± 0% 26.9MB ± 0% ~ (p=0.143 n=10+10) XML 42.8MB ± 0% 42.8MB ± 0% ~ (p=0.400 n=10+9) name old allocs/op new allocs/op delta Template 398k ± 0% 397k ± 0% -0.20% (p=0.002 n=8+9) Unicode 320k ± 0% 321k ± 1% ~ (p=0.122 n=8+10) GoTypes 1.16M ± 0% 1.17M ± 0% ~ (p=0.053 n=10+9) SSA 7.65M ± 0% 7.65M ± 0% ~ (p=0.122 n=10+8) Flate 240k ± 1% 240k ± 1% ~ (p=0.243 n=10+9) GoParser 322k ± 1% 322k ± 1% ~ (p=0.481 n=10+10) Reflect 1.00M ± 0% 1.00M ± 0% ~ (p=0.211 n=9+10) Tar 256k ± 0% 255k ± 1% ~ (p=0.052 n=10+10) XML 400k ± 1% 400k ± 0% ~ (p=0.631 n=10+10) Change-Id: Ia39d9de09232fdbfc9c9cec14587bbf6939c9492 Reviewed-on: https://go-review.googlesource.com/38713 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-26 15:16:11 -07:00
// sevenBitS returns the signed LEB128 encoding of v if v is seven bits and nil otherwise.
// The contents of the returned slice must not be modified.
func sevenBitS(v int64) []byte {
if uint64(v) <= 63 {
return sevenbits[v : v+1]
}
if uint64(-v) <= 64 {
return sevenbits[128+v : 128+v+1]
}
return nil
}
// Uleb128put appends v to s using DWARF's unsigned LEB128 encoding.
func Uleb128put(ctxt Context, s Sym, v int64) {
cmd/internal/dwarf: remove global encbuf The global encbuf helped avoid allocations. It is incompatible with a concurrent backend. To avoid a performance regression while removing it, introduce two optimizations. First, re-use a buffer in dwarf.PutFunc. Second, avoid a buffer entirely when the int being encoded fits in seven bits, which is about 75% of the time. Passes toolstash-check. Updates #15756 name old alloc/op new alloc/op delta Template 40.6MB ± 0% 40.6MB ± 0% -0.08% (p=0.001 n=8+9) Unicode 29.9MB ± 0% 29.9MB ± 0% ~ (p=0.068 n=8+10) GoTypes 116MB ± 0% 116MB ± 0% +0.05% (p=0.043 n=10+9) SSA 864MB ± 0% 864MB ± 0% +0.01% (p=0.010 n=10+9) Flate 25.8MB ± 0% 25.8MB ± 0% ~ (p=0.353 n=10+10) GoParser 32.2MB ± 0% 32.2MB ± 0% ~ (p=0.353 n=10+10) Reflect 80.2MB ± 0% 80.2MB ± 0% ~ (p=0.165 n=10+10) Tar 27.0MB ± 0% 26.9MB ± 0% ~ (p=0.143 n=10+10) XML 42.8MB ± 0% 42.8MB ± 0% ~ (p=0.400 n=10+9) name old allocs/op new allocs/op delta Template 398k ± 0% 397k ± 0% -0.20% (p=0.002 n=8+9) Unicode 320k ± 0% 321k ± 1% ~ (p=0.122 n=8+10) GoTypes 1.16M ± 0% 1.17M ± 0% ~ (p=0.053 n=10+9) SSA 7.65M ± 0% 7.65M ± 0% ~ (p=0.122 n=10+8) Flate 240k ± 1% 240k ± 1% ~ (p=0.243 n=10+9) GoParser 322k ± 1% 322k ± 1% ~ (p=0.481 n=10+10) Reflect 1.00M ± 0% 1.00M ± 0% ~ (p=0.211 n=9+10) Tar 256k ± 0% 255k ± 1% ~ (p=0.052 n=10+10) XML 400k ± 1% 400k ± 0% ~ (p=0.631 n=10+10) Change-Id: Ia39d9de09232fdbfc9c9cec14587bbf6939c9492 Reviewed-on: https://go-review.googlesource.com/38713 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-26 15:16:11 -07:00
b := sevenBitU(v)
if b == nil {
var encbuf [20]byte
b = AppendUleb128(encbuf[:0], uint64(v))
}
ctxt.AddBytes(s, b)
}
cmd/internal/dwarf: remove global encbuf The global encbuf helped avoid allocations. It is incompatible with a concurrent backend. To avoid a performance regression while removing it, introduce two optimizations. First, re-use a buffer in dwarf.PutFunc. Second, avoid a buffer entirely when the int being encoded fits in seven bits, which is about 75% of the time. Passes toolstash-check. Updates #15756 name old alloc/op new alloc/op delta Template 40.6MB ± 0% 40.6MB ± 0% -0.08% (p=0.001 n=8+9) Unicode 29.9MB ± 0% 29.9MB ± 0% ~ (p=0.068 n=8+10) GoTypes 116MB ± 0% 116MB ± 0% +0.05% (p=0.043 n=10+9) SSA 864MB ± 0% 864MB ± 0% +0.01% (p=0.010 n=10+9) Flate 25.8MB ± 0% 25.8MB ± 0% ~ (p=0.353 n=10+10) GoParser 32.2MB ± 0% 32.2MB ± 0% ~ (p=0.353 n=10+10) Reflect 80.2MB ± 0% 80.2MB ± 0% ~ (p=0.165 n=10+10) Tar 27.0MB ± 0% 26.9MB ± 0% ~ (p=0.143 n=10+10) XML 42.8MB ± 0% 42.8MB ± 0% ~ (p=0.400 n=10+9) name old allocs/op new allocs/op delta Template 398k ± 0% 397k ± 0% -0.20% (p=0.002 n=8+9) Unicode 320k ± 0% 321k ± 1% ~ (p=0.122 n=8+10) GoTypes 1.16M ± 0% 1.17M ± 0% ~ (p=0.053 n=10+9) SSA 7.65M ± 0% 7.65M ± 0% ~ (p=0.122 n=10+8) Flate 240k ± 1% 240k ± 1% ~ (p=0.243 n=10+9) GoParser 322k ± 1% 322k ± 1% ~ (p=0.481 n=10+10) Reflect 1.00M ± 0% 1.00M ± 0% ~ (p=0.211 n=9+10) Tar 256k ± 0% 255k ± 1% ~ (p=0.052 n=10+10) XML 400k ± 1% 400k ± 0% ~ (p=0.631 n=10+10) Change-Id: Ia39d9de09232fdbfc9c9cec14587bbf6939c9492 Reviewed-on: https://go-review.googlesource.com/38713 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-26 15:16:11 -07:00
// Sleb128put appends v to s using DWARF's signed LEB128 encoding.
func Sleb128put(ctxt Context, s Sym, v int64) {
cmd/internal/dwarf: remove global encbuf The global encbuf helped avoid allocations. It is incompatible with a concurrent backend. To avoid a performance regression while removing it, introduce two optimizations. First, re-use a buffer in dwarf.PutFunc. Second, avoid a buffer entirely when the int being encoded fits in seven bits, which is about 75% of the time. Passes toolstash-check. Updates #15756 name old alloc/op new alloc/op delta Template 40.6MB ± 0% 40.6MB ± 0% -0.08% (p=0.001 n=8+9) Unicode 29.9MB ± 0% 29.9MB ± 0% ~ (p=0.068 n=8+10) GoTypes 116MB ± 0% 116MB ± 0% +0.05% (p=0.043 n=10+9) SSA 864MB ± 0% 864MB ± 0% +0.01% (p=0.010 n=10+9) Flate 25.8MB ± 0% 25.8MB ± 0% ~ (p=0.353 n=10+10) GoParser 32.2MB ± 0% 32.2MB ± 0% ~ (p=0.353 n=10+10) Reflect 80.2MB ± 0% 80.2MB ± 0% ~ (p=0.165 n=10+10) Tar 27.0MB ± 0% 26.9MB ± 0% ~ (p=0.143 n=10+10) XML 42.8MB ± 0% 42.8MB ± 0% ~ (p=0.400 n=10+9) name old allocs/op new allocs/op delta Template 398k ± 0% 397k ± 0% -0.20% (p=0.002 n=8+9) Unicode 320k ± 0% 321k ± 1% ~ (p=0.122 n=8+10) GoTypes 1.16M ± 0% 1.17M ± 0% ~ (p=0.053 n=10+9) SSA 7.65M ± 0% 7.65M ± 0% ~ (p=0.122 n=10+8) Flate 240k ± 1% 240k ± 1% ~ (p=0.243 n=10+9) GoParser 322k ± 1% 322k ± 1% ~ (p=0.481 n=10+10) Reflect 1.00M ± 0% 1.00M ± 0% ~ (p=0.211 n=9+10) Tar 256k ± 0% 255k ± 1% ~ (p=0.052 n=10+10) XML 400k ± 1% 400k ± 0% ~ (p=0.631 n=10+10) Change-Id: Ia39d9de09232fdbfc9c9cec14587bbf6939c9492 Reviewed-on: https://go-review.googlesource.com/38713 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-26 15:16:11 -07:00
b := sevenBitS(v)
if b == nil {
var encbuf [20]byte
b = AppendSleb128(encbuf[:0], v)
}
ctxt.AddBytes(s, b)
}
/*
* Defining Abbrevs. This is hardcoded, and there will be
* only a handful of them. The DWARF spec places no restriction on
* the ordering of attributes in the Abbrevs and DIEs, and we will
* always write them out in the order of declaration in the abbrev.
*/
type dwAttrForm struct {
attr uint16
form uint8
}
// Go-specific type attributes.
const (
DW_AT_go_kind = 0x2900
DW_AT_go_key = 0x2901
DW_AT_go_elem = 0x2902
// Attribute for DW_TAG_member of a struct type.
// Nonzero value indicates the struct field is an embedded field.
DW_AT_go_embedded_field = 0x2903
DW_AT_go_runtime_type = 0x2904
DW_AT_internal_location = 253 // params and locals; not emitted
)
// Index into the abbrevs table below.
[dev.debug] cmd/compile: better DWARF with optimizations on Debuggers use DWARF information to find local variables on the stack and in registers. Prior to this CL, the DWARF information for functions claimed that all variables were on the stack at all times. That's incorrect when optimizations are enabled, and results in debuggers showing data that is out of date or complete gibberish. After this CL, the compiler is capable of representing variable locations more accurately, and attempts to do so. Due to limitations of the SSA backend, it's not possible to be completely correct. There are a number of problems in the current design. One of the easier to understand is that variable names currently must be attached to an SSA value, but not all assignments in the source code actually result in machine code. For example: type myint int var a int b := myint(int) and b := (*uint64)(unsafe.Pointer(a)) don't generate machine code because the underlying representation is the same, so the correct value of b will not be set when the user would expect. Generating the more precise debug information is behind a flag, dwarflocationlists. Because of the issues described above, setting the flag may not make the debugging experience much better, and may actually make it worse in cases where the variable actually is on the stack and the more complicated analysis doesn't realize it. A number of changes are included: - Add a new pseudo-instruction, RegKill, which indicates that the value in the register has been clobbered. - Adjust regalloc to emit RegKills in the right places. Significantly, this means that phis are mixed with StoreReg and RegKills after regalloc. - Track variable decomposition in ssa.LocalSlots. - After the SSA backend is done, analyze the result and build location lists for each LocalSlot. - After assembly is done, update the location lists with the assembled PC offsets, recompose variables, and build DWARF location lists. Emit the list as a new linker symbol, one per function. - In the linker, aggregate the location lists into a .debug_loc section. TODO: - currently disabled for non-X86/AMD64 because there are no data tables. go build -toolexec 'toolstash -cmp' -a std succeeds. With -dwarflocationlists false: before: f02812195637909ff675782c0b46836a8ff01976 after: 06f61e8112a42ac34fb80e0c818b3cdb84a5e7ec benchstat -geomean /tmp/220352263 /tmp/621364410 completed 15 of 15, estimated time remaining 0s (eta 3:52PM) name old time/op new time/op delta Template 199ms ± 3% 198ms ± 2% ~ (p=0.400 n=15+14) Unicode 96.6ms ± 5% 96.4ms ± 5% ~ (p=0.838 n=15+15) GoTypes 653ms ± 2% 647ms ± 2% ~ (p=0.102 n=15+14) Flate 133ms ± 6% 129ms ± 3% -2.62% (p=0.041 n=15+15) GoParser 164ms ± 5% 159ms ± 3% -3.05% (p=0.000 n=15+15) Reflect 428ms ± 4% 422ms ± 3% ~ (p=0.156 n=15+13) Tar 123ms ±10% 124ms ± 8% ~ (p=0.461 n=15+15) XML 228ms ± 3% 224ms ± 3% -1.57% (p=0.045 n=15+15) [Geo mean] 206ms 377ms +82.86% name old user-time/op new user-time/op delta Template 292ms ±10% 301ms ±12% ~ (p=0.189 n=15+15) Unicode 166ms ±37% 158ms ±14% ~ (p=0.418 n=15+14) GoTypes 962ms ± 6% 963ms ± 7% ~ (p=0.976 n=15+15) Flate 207ms ±19% 200ms ±14% ~ (p=0.345 n=14+15) GoParser 246ms ±22% 240ms ±15% ~ (p=0.587 n=15+15) Reflect 611ms ±13% 587ms ±14% ~ (p=0.085 n=15+13) Tar 211ms ±12% 217ms ±14% ~ (p=0.355 n=14+15) XML 335ms ±15% 320ms ±18% ~ (p=0.169 n=15+15) [Geo mean] 317ms 583ms +83.72% name old alloc/op new alloc/op delta Template 40.2MB ± 0% 40.2MB ± 0% -0.15% (p=0.000 n=14+15) Unicode 29.2MB ± 0% 29.3MB ± 0% ~ (p=0.624 n=15+15) GoTypes 114MB ± 0% 114MB ± 0% -0.15% (p=0.000 n=15+14) Flate 25.7MB ± 0% 25.6MB ± 0% -0.18% (p=0.000 n=13+15) GoParser 32.2MB ± 0% 32.2MB ± 0% -0.14% (p=0.003 n=15+15) Reflect 77.8MB ± 0% 77.9MB ± 0% ~ (p=0.061 n=15+15) Tar 27.1MB ± 0% 27.0MB ± 0% -0.11% (p=0.029 n=15+15) XML 42.7MB ± 0% 42.5MB ± 0% -0.29% (p=0.000 n=15+15) [Geo mean] 42.1MB 75.0MB +78.05% name old allocs/op new allocs/op delta Template 402k ± 1% 398k ± 0% -0.91% (p=0.000 n=15+15) Unicode 344k ± 1% 344k ± 0% ~ (p=0.715 n=15+14) GoTypes 1.18M ± 0% 1.17M ± 0% -0.91% (p=0.000 n=15+14) Flate 243k ± 0% 240k ± 1% -1.05% (p=0.000 n=13+15) GoParser 327k ± 1% 324k ± 1% -0.96% (p=0.000 n=15+15) Reflect 984k ± 1% 982k ± 0% ~ (p=0.050 n=15+15) Tar 261k ± 1% 259k ± 1% -0.77% (p=0.000 n=15+15) XML 411k ± 0% 404k ± 1% -1.55% (p=0.000 n=15+15) [Geo mean] 439k 755k +72.01% name old text-bytes new text-bytes delta HelloSize 694kB ± 0% 694kB ± 0% -0.00% (p=0.000 n=15+15) name old data-bytes new data-bytes delta HelloSize 5.55kB ± 0% 5.55kB ± 0% ~ (all equal) name old bss-bytes new bss-bytes delta HelloSize 133kB ± 0% 133kB ± 0% ~ (all equal) name old exe-bytes new exe-bytes delta HelloSize 1.04MB ± 0% 1.04MB ± 0% ~ (all equal) Change-Id: I991fc553ef175db46bb23b2128317bbd48de70d8 Reviewed-on: https://go-review.googlesource.com/41770 Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-07-21 18:30:19 -04:00
// Keep in sync with ispubname() and ispubtype() in ld/dwarf.go.
// ispubtype considers >= NULLTYPE public
const (
DW_ABRV_NULL = iota
DW_ABRV_COMPUNIT
DW_ABRV_FUNCTION
DW_ABRV_FUNCTION_ABSTRACT
DW_ABRV_FUNCTION_CONCRETE
DW_ABRV_INLINED_SUBROUTINE
DW_ABRV_INLINED_SUBROUTINE_RANGES
DW_ABRV_VARIABLE
DW_ABRV_INT_CONSTANT
DW_ABRV_AUTO
DW_ABRV_AUTO_LOCLIST
DW_ABRV_AUTO_ABSTRACT
DW_ABRV_AUTO_CONCRETE
DW_ABRV_AUTO_CONCRETE_LOCLIST
DW_ABRV_PARAM
DW_ABRV_PARAM_LOCLIST
DW_ABRV_PARAM_ABSTRACT
DW_ABRV_PARAM_CONCRETE
DW_ABRV_PARAM_CONCRETE_LOCLIST
DW_ABRV_LEXICAL_BLOCK_RANGES
DW_ABRV_LEXICAL_BLOCK_SIMPLE
DW_ABRV_STRUCTFIELD
DW_ABRV_FUNCTYPEPARAM
DW_ABRV_DOTDOTDOT
DW_ABRV_ARRAYRANGE
DW_ABRV_NULLTYPE
DW_ABRV_BASETYPE
DW_ABRV_ARRAYTYPE
DW_ABRV_CHANTYPE
DW_ABRV_FUNCTYPE
DW_ABRV_IFACETYPE
DW_ABRV_MAPTYPE
DW_ABRV_PTRTYPE
DW_ABRV_BARE_PTRTYPE // only for void*, no DW_AT_type attr to please gdb 6.
DW_ABRV_SLICETYPE
DW_ABRV_STRINGTYPE
DW_ABRV_STRUCTTYPE
DW_ABRV_TYPEDECL
DW_NABRV
)
type dwAbbrev struct {
tag uint8
children uint8
attr []dwAttrForm
}
var abbrevs = [DW_NABRV]dwAbbrev{
/* The mandatory DW_ABRV_NULL entry. */
{0, 0, []dwAttrForm{}},
/* COMPUNIT */
{
DW_TAG_compile_unit,
DW_CHILDREN_yes,
[]dwAttrForm{
{DW_AT_name, DW_FORM_string},
{DW_AT_language, DW_FORM_data1},
{DW_AT_stmt_list, DW_FORM_sec_offset},
{DW_AT_low_pc, DW_FORM_addr},
{DW_AT_ranges, DW_FORM_sec_offset},
{DW_AT_comp_dir, DW_FORM_string},
{DW_AT_producer, DW_FORM_string},
},
},
/* FUNCTION */
{
DW_TAG_subprogram,
DW_CHILDREN_yes,
[]dwAttrForm{
{DW_AT_name, DW_FORM_string},
{DW_AT_low_pc, DW_FORM_addr},
{DW_AT_high_pc, DW_FORM_addr},
{DW_AT_frame_base, DW_FORM_block1},
{DW_AT_decl_file, DW_FORM_data4},
{DW_AT_external, DW_FORM_flag},
},
},
/* FUNCTION_ABSTRACT */
{
DW_TAG_subprogram,
DW_CHILDREN_yes,
[]dwAttrForm{
{DW_AT_name, DW_FORM_string},
{DW_AT_inline, DW_FORM_data1},
{DW_AT_external, DW_FORM_flag},
},
},
/* FUNCTION_CONCRETE */
{
DW_TAG_subprogram,
DW_CHILDREN_yes,
[]dwAttrForm{
{DW_AT_abstract_origin, DW_FORM_ref_addr},
{DW_AT_low_pc, DW_FORM_addr},
{DW_AT_high_pc, DW_FORM_addr},
{DW_AT_frame_base, DW_FORM_block1},
},
},
/* INLINED_SUBROUTINE */
{
DW_TAG_inlined_subroutine,
DW_CHILDREN_yes,
[]dwAttrForm{
{DW_AT_abstract_origin, DW_FORM_ref_addr},
{DW_AT_low_pc, DW_FORM_addr},
{DW_AT_high_pc, DW_FORM_addr},
{DW_AT_call_file, DW_FORM_data4},
{DW_AT_call_line, DW_FORM_udata},
},
},
/* INLINED_SUBROUTINE_RANGES */
{
DW_TAG_inlined_subroutine,
DW_CHILDREN_yes,
[]dwAttrForm{
{DW_AT_abstract_origin, DW_FORM_ref_addr},
{DW_AT_ranges, DW_FORM_sec_offset},
{DW_AT_call_file, DW_FORM_data4},
{DW_AT_call_line, DW_FORM_udata},
},
},
/* VARIABLE */
{
DW_TAG_variable,
DW_CHILDREN_no,
[]dwAttrForm{
{DW_AT_name, DW_FORM_string},
{DW_AT_location, DW_FORM_block1},
{DW_AT_type, DW_FORM_ref_addr},
{DW_AT_external, DW_FORM_flag},
},
},
/* INT CONSTANT */
{
DW_TAG_constant,
DW_CHILDREN_no,
[]dwAttrForm{
{DW_AT_name, DW_FORM_string},
{DW_AT_type, DW_FORM_ref_addr},
{DW_AT_const_value, DW_FORM_sdata},
},
},
/* AUTO */
{
DW_TAG_variable,
DW_CHILDREN_no,
[]dwAttrForm{
{DW_AT_name, DW_FORM_string},
{DW_AT_decl_line, DW_FORM_udata},
{DW_AT_type, DW_FORM_ref_addr},
{DW_AT_location, DW_FORM_block1},
},
},
/* AUTO_LOCLIST */
{
DW_TAG_variable,
DW_CHILDREN_no,
[]dwAttrForm{
{DW_AT_name, DW_FORM_string},
{DW_AT_decl_line, DW_FORM_udata},
{DW_AT_type, DW_FORM_ref_addr},
{DW_AT_location, DW_FORM_sec_offset},
},
},
/* AUTO_ABSTRACT */
{
DW_TAG_variable,
DW_CHILDREN_no,
[]dwAttrForm{
{DW_AT_name, DW_FORM_string},
{DW_AT_decl_line, DW_FORM_udata},
{DW_AT_type, DW_FORM_ref_addr},
},
},
/* AUTO_CONCRETE */
{
DW_TAG_variable,
DW_CHILDREN_no,
[]dwAttrForm{
{DW_AT_abstract_origin, DW_FORM_ref_addr},
{DW_AT_location, DW_FORM_block1},
},
},
/* AUTO_CONCRETE_LOCLIST */
{
DW_TAG_variable,
DW_CHILDREN_no,
[]dwAttrForm{
{DW_AT_abstract_origin, DW_FORM_ref_addr},
{DW_AT_location, DW_FORM_sec_offset},
},
},
/* PARAM */
{
DW_TAG_formal_parameter,
DW_CHILDREN_no,
[]dwAttrForm{
{DW_AT_name, DW_FORM_string},
{DW_AT_variable_parameter, DW_FORM_flag},
{DW_AT_decl_line, DW_FORM_udata},
{DW_AT_type, DW_FORM_ref_addr},
{DW_AT_location, DW_FORM_block1},
},
},
/* PARAM_LOCLIST */
{
DW_TAG_formal_parameter,
DW_CHILDREN_no,
[]dwAttrForm{
{DW_AT_name, DW_FORM_string},
{DW_AT_variable_parameter, DW_FORM_flag},
{DW_AT_decl_line, DW_FORM_udata},
{DW_AT_type, DW_FORM_ref_addr},
{DW_AT_location, DW_FORM_sec_offset},
},
},
/* PARAM_ABSTRACT */
{
DW_TAG_formal_parameter,
DW_CHILDREN_no,
[]dwAttrForm{
{DW_AT_name, DW_FORM_string},
{DW_AT_variable_parameter, DW_FORM_flag},
{DW_AT_type, DW_FORM_ref_addr},
},
},
/* PARAM_CONCRETE */
{
DW_TAG_formal_parameter,
DW_CHILDREN_no,
[]dwAttrForm{
{DW_AT_abstract_origin, DW_FORM_ref_addr},
{DW_AT_location, DW_FORM_block1},
},
},
/* PARAM_CONCRETE_LOCLIST */
{
DW_TAG_formal_parameter,
DW_CHILDREN_no,
[]dwAttrForm{
{DW_AT_abstract_origin, DW_FORM_ref_addr},
{DW_AT_location, DW_FORM_sec_offset},
},
},
/* LEXICAL_BLOCK_RANGES */
{
DW_TAG_lexical_block,
DW_CHILDREN_yes,
[]dwAttrForm{
{DW_AT_ranges, DW_FORM_sec_offset},
},
},
/* LEXICAL_BLOCK_SIMPLE */
{
DW_TAG_lexical_block,
DW_CHILDREN_yes,
[]dwAttrForm{
{DW_AT_low_pc, DW_FORM_addr},
{DW_AT_high_pc, DW_FORM_addr},
},
},
/* STRUCTFIELD */
{
DW_TAG_member,
DW_CHILDREN_no,
[]dwAttrForm{
{DW_AT_name, DW_FORM_string},
{DW_AT_data_member_location, DW_FORM_udata},
{DW_AT_type, DW_FORM_ref_addr},
{DW_AT_go_embedded_field, DW_FORM_flag},
},
},
/* FUNCTYPEPARAM */
{
DW_TAG_formal_parameter,
DW_CHILDREN_no,
// No name!
[]dwAttrForm{
{DW_AT_type, DW_FORM_ref_addr},
},
},
/* DOTDOTDOT */
{
DW_TAG_unspecified_parameters,
DW_CHILDREN_no,
[]dwAttrForm{},
},
/* ARRAYRANGE */
{
DW_TAG_subrange_type,
DW_CHILDREN_no,
// No name!
[]dwAttrForm{
{DW_AT_type, DW_FORM_ref_addr},
{DW_AT_count, DW_FORM_udata},
},
},
// Below here are the types considered public by ispubtype
/* NULLTYPE */
{
DW_TAG_unspecified_type,
DW_CHILDREN_no,
[]dwAttrForm{
{DW_AT_name, DW_FORM_string},
},
},
/* BASETYPE */
{
DW_TAG_base_type,
DW_CHILDREN_no,
[]dwAttrForm{
{DW_AT_name, DW_FORM_string},
{DW_AT_encoding, DW_FORM_data1},
{DW_AT_byte_size, DW_FORM_data1},
{DW_AT_go_kind, DW_FORM_data1},
{DW_AT_go_runtime_type, DW_FORM_addr},
},
},
/* ARRAYTYPE */
// child is subrange with upper bound
{
DW_TAG_array_type,
DW_CHILDREN_yes,
[]dwAttrForm{
{DW_AT_name, DW_FORM_string},
{DW_AT_type, DW_FORM_ref_addr},
{DW_AT_byte_size, DW_FORM_udata},
{DW_AT_go_kind, DW_FORM_data1},
{DW_AT_go_runtime_type, DW_FORM_addr},
},
},
/* CHANTYPE */
{
DW_TAG_typedef,
DW_CHILDREN_no,
[]dwAttrForm{
{DW_AT_name, DW_FORM_string},
{DW_AT_type, DW_FORM_ref_addr},
{DW_AT_go_kind, DW_FORM_data1},
{DW_AT_go_runtime_type, DW_FORM_addr},
{DW_AT_go_elem, DW_FORM_ref_addr},
},
},
/* FUNCTYPE */
{
DW_TAG_subroutine_type,
DW_CHILDREN_yes,
[]dwAttrForm{
{DW_AT_name, DW_FORM_string},
{DW_AT_byte_size, DW_FORM_udata},
{DW_AT_go_kind, DW_FORM_data1},
{DW_AT_go_runtime_type, DW_FORM_addr},
},
},
/* IFACETYPE */
{
DW_TAG_typedef,
DW_CHILDREN_yes,
[]dwAttrForm{
{DW_AT_name, DW_FORM_string},
{DW_AT_type, DW_FORM_ref_addr},
{DW_AT_go_kind, DW_FORM_data1},
{DW_AT_go_runtime_type, DW_FORM_addr},
},
},
/* MAPTYPE */
{
DW_TAG_typedef,
DW_CHILDREN_no,
[]dwAttrForm{
{DW_AT_name, DW_FORM_string},
{DW_AT_type, DW_FORM_ref_addr},
{DW_AT_go_kind, DW_FORM_data1},
{DW_AT_go_runtime_type, DW_FORM_addr},
{DW_AT_go_key, DW_FORM_ref_addr},
{DW_AT_go_elem, DW_FORM_ref_addr},
},
},
/* PTRTYPE */
{
DW_TAG_pointer_type,
DW_CHILDREN_no,
[]dwAttrForm{
{DW_AT_name, DW_FORM_string},
{DW_AT_type, DW_FORM_ref_addr},
{DW_AT_go_kind, DW_FORM_data1},
{DW_AT_go_runtime_type, DW_FORM_addr},
},
},
/* BARE_PTRTYPE */
{
DW_TAG_pointer_type,
DW_CHILDREN_no,
[]dwAttrForm{
{DW_AT_name, DW_FORM_string},
},
},
/* SLICETYPE */
{
DW_TAG_structure_type,
DW_CHILDREN_yes,
[]dwAttrForm{
{DW_AT_name, DW_FORM_string},
{DW_AT_byte_size, DW_FORM_udata},
{DW_AT_go_kind, DW_FORM_data1},
{DW_AT_go_runtime_type, DW_FORM_addr},
{DW_AT_go_elem, DW_FORM_ref_addr},
},
},
/* STRINGTYPE */
{
DW_TAG_structure_type,
DW_CHILDREN_yes,
[]dwAttrForm{
{DW_AT_name, DW_FORM_string},
{DW_AT_byte_size, DW_FORM_udata},
{DW_AT_go_kind, DW_FORM_data1},
{DW_AT_go_runtime_type, DW_FORM_addr},
},
},
/* STRUCTTYPE */
{
DW_TAG_structure_type,
DW_CHILDREN_yes,
[]dwAttrForm{
{DW_AT_name, DW_FORM_string},
{DW_AT_byte_size, DW_FORM_udata},
{DW_AT_go_kind, DW_FORM_data1},
{DW_AT_go_runtime_type, DW_FORM_addr},
},
},
/* TYPEDECL */
{
DW_TAG_typedef,
DW_CHILDREN_no,
[]dwAttrForm{
{DW_AT_name, DW_FORM_string},
{DW_AT_type, DW_FORM_ref_addr},
},
},
}
// GetAbbrev returns the contents of the .debug_abbrev section.
func GetAbbrev() []byte {
var buf []byte
for i := 1; i < DW_NABRV; i++ {
// See section 7.5.3
buf = AppendUleb128(buf, uint64(i))
buf = AppendUleb128(buf, uint64(abbrevs[i].tag))
buf = append(buf, abbrevs[i].children)
for _, f := range abbrevs[i].attr {
buf = AppendUleb128(buf, uint64(f.attr))
buf = AppendUleb128(buf, uint64(f.form))
}
buf = append(buf, 0, 0)
}
return append(buf, 0)
}
/*
* Debugging Information Entries and their attributes.
*/
// DWAttr represents an attribute of a DWDie.
//
// For DW_CLS_string and _block, value should contain the length, and
// data the data, for _reference, value is 0 and data is a DWDie* to
// the referenced instance, for all others, value is the whole thing
// and data is null.
type DWAttr struct {
Link *DWAttr
Atr uint16 // DW_AT_
Cls uint8 // DW_CLS_
Value int64
Data interface{}
}
// DWDie represents a DWARF debug info entry.
type DWDie struct {
Abbrev int
Link *DWDie
Child *DWDie
Attr *DWAttr
Sym Sym
}
func putattr(ctxt Context, s Sym, abbrev int, form int, cls int, value int64, data interface{}) error {
switch form {
case DW_FORM_addr: // address
// Allow nil addresses for DW_AT_go_runtime_type.
if data == nil && value == 0 {
ctxt.AddInt(s, ctxt.PtrSize(), 0)
break
}
if cls == DW_CLS_GO_TYPEREF {
ctxt.AddSectionOffset(s, ctxt.PtrSize(), data, value)
break
}
ctxt.AddAddress(s, data, value)
case DW_FORM_block1: // block
if cls == DW_CLS_ADDRESS {
ctxt.AddInt(s, 1, int64(1+ctxt.PtrSize()))
ctxt.AddInt(s, 1, DW_OP_addr)
ctxt.AddAddress(s, data, 0)
break
}
value &= 0xff
ctxt.AddInt(s, 1, value)
p := data.([]byte)[:value]
ctxt.AddBytes(s, p)
case DW_FORM_block2: // block
value &= 0xffff
ctxt.AddInt(s, 2, value)
p := data.([]byte)[:value]
ctxt.AddBytes(s, p)
case DW_FORM_block4: // block
value &= 0xffffffff
ctxt.AddInt(s, 4, value)
p := data.([]byte)[:value]
ctxt.AddBytes(s, p)
case DW_FORM_block: // block
Uleb128put(ctxt, s, value)
p := data.([]byte)[:value]
ctxt.AddBytes(s, p)
case DW_FORM_data1: // constant
ctxt.AddInt(s, 1, value)
case DW_FORM_data2: // constant
ctxt.AddInt(s, 2, value)
case DW_FORM_data4: // constant, {line,loclist,mac,rangelist}ptr
if cls == DW_CLS_PTR { // DW_AT_stmt_list and DW_AT_ranges
ctxt.AddDWARFSectionOffset(s, 4, data, value)
break
}
ctxt.AddInt(s, 4, value)
case DW_FORM_data8: // constant, {line,loclist,mac,rangelist}ptr
ctxt.AddInt(s, 8, value)
case DW_FORM_sdata: // constant
Sleb128put(ctxt, s, value)
case DW_FORM_udata: // constant
Uleb128put(ctxt, s, value)
case DW_FORM_string: // string
str := data.(string)
ctxt.AddString(s, str)
// TODO(ribrdb): verify padded strings are never used and remove this
for i := int64(len(str)); i < value; i++ {
ctxt.AddInt(s, 1, 0)
}
case DW_FORM_flag: // flag
if value != 0 {
ctxt.AddInt(s, 1, 1)
} else {
ctxt.AddInt(s, 1, 0)
}
// As of DWARF 3 the ref_addr is always 32 bits, unless emitting a large
// (> 4 GB of debug info aka "64-bit") unit, which we don't implement.
case DW_FORM_ref_addr: // reference to a DIE in the .info section
fallthrough
case DW_FORM_sec_offset: // offset into a DWARF section other than .info
if data == nil {
return fmt.Errorf("dwarf: null reference in %d", abbrev)
}
ctxt.AddDWARFSectionOffset(s, 4, data, value)
case DW_FORM_ref1, // reference within the compilation unit
DW_FORM_ref2, // reference
DW_FORM_ref4, // reference
DW_FORM_ref8, // reference
DW_FORM_ref_udata, // reference
DW_FORM_strp, // string
DW_FORM_indirect: // (see Section 7.5.3)
fallthrough
default:
return fmt.Errorf("dwarf: unsupported attribute form %d / class %d", form, cls)
}
return nil
}
// PutAttrs writes the attributes for a DIE to symbol 's'.
//
// Note that we can (and do) add arbitrary attributes to a DIE, but
// only the ones actually listed in the Abbrev will be written out.
func PutAttrs(ctxt Context, s Sym, abbrev int, attr *DWAttr) {
Outer:
for _, f := range abbrevs[abbrev].attr {
for ap := attr; ap != nil; ap = ap.Link {
if ap.Atr == f.attr {
putattr(ctxt, s, abbrev, int(f.form), int(ap.Cls), ap.Value, ap.Data)
continue Outer
}
}
putattr(ctxt, s, abbrev, int(f.form), 0, 0, nil)
}
}
// HasChildren returns true if 'die' uses an abbrev that supports children.
func HasChildren(die *DWDie) bool {
return abbrevs[die.Abbrev].children != 0
}
// PutIntConst writes a DIE for an integer constant
func PutIntConst(ctxt Context, info, typ Sym, name string, val int64) {
Uleb128put(ctxt, info, DW_ABRV_INT_CONSTANT)
putattr(ctxt, info, DW_ABRV_INT_CONSTANT, DW_FORM_string, DW_CLS_STRING, int64(len(name)), name)
putattr(ctxt, info, DW_ABRV_INT_CONSTANT, DW_FORM_ref_addr, DW_CLS_REFERENCE, 0, typ)
putattr(ctxt, info, DW_ABRV_INT_CONSTANT, DW_FORM_sdata, DW_CLS_CONSTANT, val, nil)
}
// PutRanges writes a range table to sym. All addresses in ranges are
// relative to some base address. If base is not nil, then they're
// relative to the start of base. If base is nil, then the caller must
// arrange a base address some other way (such as a DW_AT_low_pc
// attribute).
func PutRanges(ctxt Context, sym Sym, base Sym, ranges []Range) {
ps := ctxt.PtrSize()
cmd/link: fix up debug_range for dsymutil (revert CL 72371) Dsymutil, an utility used on macOS when externally linking executables, does not support base address selector entries in debug_ranges. CL 73271 worked around this problem by removing base address selectors and emitting CU-relative relocations for each list entry. This commit, as an optimization, reintroduces the base address selectors and changes the linker to remove them again, but only when it knows that it will have to invoke the external linker on macOS. Compilecmp comparing master with a branch that has scope tracking always enabled: completed 15 of 15, estimated time remaining 0s (eta 2:43PM) name old time/op new time/op delta Template 272ms ± 8% 257ms ± 5% -5.33% (p=0.000 n=15+14) Unicode 124ms ± 7% 122ms ± 5% ~ (p=0.210 n=14+14) GoTypes 873ms ± 3% 870ms ± 5% ~ (p=0.856 n=15+13) Compiler 4.49s ± 2% 4.49s ± 5% ~ (p=0.982 n=14+14) SSA 11.8s ± 4% 11.8s ± 3% ~ (p=0.653 n=15+15) Flate 163ms ± 6% 164ms ± 9% ~ (p=0.914 n=14+15) GoParser 203ms ± 6% 202ms ±10% ~ (p=0.571 n=14+14) Reflect 547ms ± 7% 542ms ± 4% ~ (p=0.914 n=15+14) Tar 244ms ± 7% 237ms ± 3% -2.80% (p=0.002 n=14+13) XML 289ms ± 6% 289ms ± 5% ~ (p=0.839 n=14+14) [Geo mean] 537ms 531ms -1.10% name old user-time/op new user-time/op delta Template 360ms ± 4% 341ms ± 7% -5.16% (p=0.000 n=14+14) Unicode 189ms ±11% 190ms ± 8% ~ (p=0.844 n=15+15) GoTypes 1.13s ± 4% 1.14s ± 7% ~ (p=0.582 n=15+14) Compiler 5.34s ± 2% 5.40s ± 4% +1.19% (p=0.036 n=11+13) SSA 14.7s ± 2% 14.7s ± 3% ~ (p=0.602 n=15+15) Flate 211ms ± 7% 214ms ± 8% ~ (p=0.252 n=14+14) GoParser 267ms ±12% 266ms ± 2% ~ (p=0.837 n=15+11) Reflect 706ms ± 4% 701ms ± 3% ~ (p=0.213 n=14+12) Tar 331ms ± 9% 320ms ± 5% -3.30% (p=0.025 n=15+14) XML 378ms ± 4% 373ms ± 6% ~ (p=0.253 n=14+15) [Geo mean] 704ms 700ms -0.58% name old alloc/op new alloc/op delta Template 38.0MB ± 0% 38.4MB ± 0% +1.12% (p=0.000 n=15+15) Unicode 28.8MB ± 0% 28.8MB ± 0% +0.17% (p=0.000 n=15+15) GoTypes 112MB ± 0% 114MB ± 0% +1.47% (p=0.000 n=15+15) Compiler 465MB ± 0% 473MB ± 0% +1.71% (p=0.000 n=15+15) SSA 1.48GB ± 0% 1.53GB ± 0% +3.07% (p=0.000 n=15+15) Flate 24.3MB ± 0% 24.7MB ± 0% +1.67% (p=0.000 n=15+15) GoParser 30.7MB ± 0% 31.0MB ± 0% +1.15% (p=0.000 n=12+15) Reflect 76.3MB ± 0% 77.1MB ± 0% +0.97% (p=0.000 n=15+15) Tar 39.2MB ± 0% 39.6MB ± 0% +0.91% (p=0.000 n=15+15) XML 41.5MB ± 0% 42.0MB ± 0% +1.29% (p=0.000 n=15+15) [Geo mean] 77.5MB 78.6MB +1.35% name old allocs/op new allocs/op delta Template 385k ± 0% 387k ± 0% +0.51% (p=0.000 n=15+15) Unicode 342k ± 0% 343k ± 0% +0.10% (p=0.000 n=14+15) GoTypes 1.19M ± 0% 1.19M ± 0% +0.62% (p=0.000 n=15+15) Compiler 4.51M ± 0% 4.54M ± 0% +0.50% (p=0.000 n=14+15) SSA 12.2M ± 0% 12.4M ± 0% +1.12% (p=0.000 n=14+15) Flate 234k ± 0% 236k ± 0% +0.60% (p=0.000 n=15+15) GoParser 318k ± 0% 320k ± 0% +0.60% (p=0.000 n=15+15) Reflect 974k ± 0% 977k ± 0% +0.27% (p=0.000 n=15+15) Tar 395k ± 0% 397k ± 0% +0.37% (p=0.000 n=14+15) XML 404k ± 0% 407k ± 0% +0.53% (p=0.000 n=15+15) [Geo mean] 794k 798k +0.52% name old text-bytes new text-bytes delta HelloSize 680kB ± 0% 680kB ± 0% ~ (all equal) name old data-bytes new data-bytes delta HelloSize 9.62kB ± 0% 9.62kB ± 0% ~ (all equal) name old bss-bytes new bss-bytes delta HelloSize 125kB ± 0% 125kB ± 0% ~ (all equal) name old exe-bytes new exe-bytes delta HelloSize 1.11MB ± 0% 1.13MB ± 0% +1.85% (p=0.000 n=15+15) Change-Id: I61c98ba0340cb798034b2bb55e3ab3a58ac1cf23 Reviewed-on: https://go-review.googlesource.com/98075 Reviewed-by: Heschi Kreinick <heschi@google.com>
2018-02-19 15:26:49 +01:00
// Write base address entry.
if base != nil {
ctxt.AddInt(sym, ps, -1)
ctxt.AddAddress(sym, base, 0)
}
// Write ranges.
for _, r := range ranges {
cmd/link: fix up debug_range for dsymutil (revert CL 72371) Dsymutil, an utility used on macOS when externally linking executables, does not support base address selector entries in debug_ranges. CL 73271 worked around this problem by removing base address selectors and emitting CU-relative relocations for each list entry. This commit, as an optimization, reintroduces the base address selectors and changes the linker to remove them again, but only when it knows that it will have to invoke the external linker on macOS. Compilecmp comparing master with a branch that has scope tracking always enabled: completed 15 of 15, estimated time remaining 0s (eta 2:43PM) name old time/op new time/op delta Template 272ms ± 8% 257ms ± 5% -5.33% (p=0.000 n=15+14) Unicode 124ms ± 7% 122ms ± 5% ~ (p=0.210 n=14+14) GoTypes 873ms ± 3% 870ms ± 5% ~ (p=0.856 n=15+13) Compiler 4.49s ± 2% 4.49s ± 5% ~ (p=0.982 n=14+14) SSA 11.8s ± 4% 11.8s ± 3% ~ (p=0.653 n=15+15) Flate 163ms ± 6% 164ms ± 9% ~ (p=0.914 n=14+15) GoParser 203ms ± 6% 202ms ±10% ~ (p=0.571 n=14+14) Reflect 547ms ± 7% 542ms ± 4% ~ (p=0.914 n=15+14) Tar 244ms ± 7% 237ms ± 3% -2.80% (p=0.002 n=14+13) XML 289ms ± 6% 289ms ± 5% ~ (p=0.839 n=14+14) [Geo mean] 537ms 531ms -1.10% name old user-time/op new user-time/op delta Template 360ms ± 4% 341ms ± 7% -5.16% (p=0.000 n=14+14) Unicode 189ms ±11% 190ms ± 8% ~ (p=0.844 n=15+15) GoTypes 1.13s ± 4% 1.14s ± 7% ~ (p=0.582 n=15+14) Compiler 5.34s ± 2% 5.40s ± 4% +1.19% (p=0.036 n=11+13) SSA 14.7s ± 2% 14.7s ± 3% ~ (p=0.602 n=15+15) Flate 211ms ± 7% 214ms ± 8% ~ (p=0.252 n=14+14) GoParser 267ms ±12% 266ms ± 2% ~ (p=0.837 n=15+11) Reflect 706ms ± 4% 701ms ± 3% ~ (p=0.213 n=14+12) Tar 331ms ± 9% 320ms ± 5% -3.30% (p=0.025 n=15+14) XML 378ms ± 4% 373ms ± 6% ~ (p=0.253 n=14+15) [Geo mean] 704ms 700ms -0.58% name old alloc/op new alloc/op delta Template 38.0MB ± 0% 38.4MB ± 0% +1.12% (p=0.000 n=15+15) Unicode 28.8MB ± 0% 28.8MB ± 0% +0.17% (p=0.000 n=15+15) GoTypes 112MB ± 0% 114MB ± 0% +1.47% (p=0.000 n=15+15) Compiler 465MB ± 0% 473MB ± 0% +1.71% (p=0.000 n=15+15) SSA 1.48GB ± 0% 1.53GB ± 0% +3.07% (p=0.000 n=15+15) Flate 24.3MB ± 0% 24.7MB ± 0% +1.67% (p=0.000 n=15+15) GoParser 30.7MB ± 0% 31.0MB ± 0% +1.15% (p=0.000 n=12+15) Reflect 76.3MB ± 0% 77.1MB ± 0% +0.97% (p=0.000 n=15+15) Tar 39.2MB ± 0% 39.6MB ± 0% +0.91% (p=0.000 n=15+15) XML 41.5MB ± 0% 42.0MB ± 0% +1.29% (p=0.000 n=15+15) [Geo mean] 77.5MB 78.6MB +1.35% name old allocs/op new allocs/op delta Template 385k ± 0% 387k ± 0% +0.51% (p=0.000 n=15+15) Unicode 342k ± 0% 343k ± 0% +0.10% (p=0.000 n=14+15) GoTypes 1.19M ± 0% 1.19M ± 0% +0.62% (p=0.000 n=15+15) Compiler 4.51M ± 0% 4.54M ± 0% +0.50% (p=0.000 n=14+15) SSA 12.2M ± 0% 12.4M ± 0% +1.12% (p=0.000 n=14+15) Flate 234k ± 0% 236k ± 0% +0.60% (p=0.000 n=15+15) GoParser 318k ± 0% 320k ± 0% +0.60% (p=0.000 n=15+15) Reflect 974k ± 0% 977k ± 0% +0.27% (p=0.000 n=15+15) Tar 395k ± 0% 397k ± 0% +0.37% (p=0.000 n=14+15) XML 404k ± 0% 407k ± 0% +0.53% (p=0.000 n=15+15) [Geo mean] 794k 798k +0.52% name old text-bytes new text-bytes delta HelloSize 680kB ± 0% 680kB ± 0% ~ (all equal) name old data-bytes new data-bytes delta HelloSize 9.62kB ± 0% 9.62kB ± 0% ~ (all equal) name old bss-bytes new bss-bytes delta HelloSize 125kB ± 0% 125kB ± 0% ~ (all equal) name old exe-bytes new exe-bytes delta HelloSize 1.11MB ± 0% 1.13MB ± 0% +1.85% (p=0.000 n=15+15) Change-Id: I61c98ba0340cb798034b2bb55e3ab3a58ac1cf23 Reviewed-on: https://go-review.googlesource.com/98075 Reviewed-by: Heschi Kreinick <heschi@google.com>
2018-02-19 15:26:49 +01:00
ctxt.AddInt(sym, ps, r.Start)
ctxt.AddInt(sym, ps, r.End)
}
// Write trailer.
ctxt.AddInt(sym, ps, 0)
ctxt.AddInt(sym, ps, 0)
}
// Return TRUE if the inlined call in the specified slot is empty,
// meaning it has a zero-length range (no instructions), and all
// of its children are empty.
func isEmptyInlinedCall(slot int, calls *InlCalls) bool {
ic := &calls.Calls[slot]
if ic.InlIndex == -2 {
return true
}
live := false
for _, k := range ic.Children {
if !isEmptyInlinedCall(k, calls) {
live = true
}
}
if len(ic.Ranges) > 0 {
live = true
}
if !live {
ic.InlIndex = -2
}
return !live
}
// Slot -1: return top-level inlines
// Slot >= 0: return children of that slot
func inlChildren(slot int, calls *InlCalls) []int {
var kids []int
if slot != -1 {
for _, k := range calls.Calls[slot].Children {
if !isEmptyInlinedCall(k, calls) {
kids = append(kids, k)
}
}
} else {
for k := 0; k < len(calls.Calls); k += 1 {
if calls.Calls[k].Root && !isEmptyInlinedCall(k, calls) {
kids = append(kids, k)
}
}
}
return kids
}
func inlinedVarTable(inlcalls *InlCalls) map[*Var]bool {
vars := make(map[*Var]bool)
for _, ic := range inlcalls.Calls {
for _, v := range ic.InlVars {
vars[v] = true
}
}
return vars
}
// The s.Scopes slice contains variables were originally part of the
// function being emitted, as well as variables that were imported
// from various callee functions during the inlining process. This
// function prunes out any variables from the latter category (since
// they will be emitted as part of DWARF inlined_subroutine DIEs) and
// then generates scopes for vars in the former category.
func putPrunedScopes(ctxt Context, s *FnState, fnabbrev int) error {
if len(s.Scopes) == 0 {
return nil
}
scopes := make([]Scope, len(s.Scopes), len(s.Scopes))
pvars := inlinedVarTable(&s.InlCalls)
for k, s := range s.Scopes {
var pruned Scope = Scope{Parent: s.Parent, Ranges: s.Ranges}
for i := 0; i < len(s.Vars); i++ {
_, found := pvars[s.Vars[i]]
if !found {
pruned.Vars = append(pruned.Vars, s.Vars[i])
}
}
sort.Sort(byChildIndex(pruned.Vars))
scopes[k] = pruned
}
var encbuf [20]byte
if putscope(ctxt, s, scopes, 0, fnabbrev, encbuf[:0]) < int32(len(scopes)) {
return errors.New("multiple toplevel scopes")
}
return nil
}
// Emit DWARF attributes and child DIEs for an 'abstract' subprogram.
// The abstract subprogram DIE for a function contains its
// location-independent attributes (name, type, etc). Other instances
// of the function (any inlined copy of it, or the single out-of-line
// 'concrete' instance) will contain a pointer back to this abstract
// DIE (as a space-saving measure, so that name/type etc doesn't have
// to be repeated for each inlined copy).
func PutAbstractFunc(ctxt Context, s *FnState) error {
if logDwarf {
ctxt.Logf("PutAbstractFunc(%v)\n", s.Absfn)
}
abbrev := DW_ABRV_FUNCTION_ABSTRACT
Uleb128put(ctxt, s.Absfn, int64(abbrev))
fullname := s.Name
if strings.HasPrefix(s.Name, "\"\".") {
// Generate a fully qualified name for the function in the
// abstract case. This is so as to avoid the need for the
// linker to process the DIE with patchDWARFName(); we can't
// allow the name attribute of an abstract subprogram DIE to
// be rewritten, since it would change the offsets of the
// child DIEs (which we're relying on in order for abstract
// origin references to work).
fullname = s.Importpath + "." + s.Name[3:]
}
putattr(ctxt, s.Absfn, abbrev, DW_FORM_string, DW_CLS_STRING, int64(len(fullname)), fullname)
// DW_AT_inlined value
putattr(ctxt, s.Absfn, abbrev, DW_FORM_data1, DW_CLS_CONSTANT, int64(DW_INL_inlined), nil)
var ev int64
if s.External {
ev = 1
}
putattr(ctxt, s.Absfn, abbrev, DW_FORM_flag, DW_CLS_FLAG, ev, 0)
// Child variables (may be empty)
var flattened []*Var
// This slice will hold the offset in bytes for each child var DIE
// with respect to the start of the parent subprogram DIE.
var offsets []int32
// Scopes/vars
if len(s.Scopes) > 0 {
// For abstract subprogram DIEs we want to flatten out scope info:
// lexical scope DIEs contain range and/or hi/lo PC attributes,
// which we explicitly don't want for the abstract subprogram DIE.
pvars := inlinedVarTable(&s.InlCalls)
for _, scope := range s.Scopes {
for i := 0; i < len(scope.Vars); i++ {
_, found := pvars[scope.Vars[i]]
if found || !scope.Vars[i].IsInAbstract {
continue
}
flattened = append(flattened, scope.Vars[i])
}
}
if len(flattened) > 0 {
sort.Sort(byChildIndex(flattened))
if logDwarf {
ctxt.Logf("putAbstractScope(%v): vars:", s.Info)
for i, v := range flattened {
ctxt.Logf(" %d:%s", i, v.Name)
}
ctxt.Logf("\n")
}
// This slice will hold the offset in bytes for each child
// variable DIE with respect to the start of the parent
// subprogram DIE.
for _, v := range flattened {
offsets = append(offsets, int32(ctxt.CurrentOffset(s.Absfn)))
putAbstractVar(ctxt, s.Absfn, v)
}
}
}
ctxt.RecordChildDieOffsets(s.Absfn, flattened, offsets)
Uleb128put(ctxt, s.Absfn, 0)
return nil
}
// Emit DWARF attributes and child DIEs for an inlined subroutine. The
// first attribute of an inlined subroutine DIE is a reference back to
// its corresponding 'abstract' DIE (containing location-independent
// attributes such as name, type, etc). Inlined subroutine DIEs can
// have other inlined subroutine DIEs as children.
func PutInlinedFunc(ctxt Context, s *FnState, callersym Sym, callIdx int) error {
ic := s.InlCalls.Calls[callIdx]
callee := ic.AbsFunSym
abbrev := DW_ABRV_INLINED_SUBROUTINE_RANGES
if len(ic.Ranges) == 1 {
abbrev = DW_ABRV_INLINED_SUBROUTINE
}
Uleb128put(ctxt, s.Info, int64(abbrev))
if logDwarf {
ctxt.Logf("PutInlinedFunc(caller=%v,callee=%v,abbrev=%d)\n", callersym, callee, abbrev)
}
// Abstract origin.
putattr(ctxt, s.Info, abbrev, DW_FORM_ref_addr, DW_CLS_REFERENCE, 0, callee)
if abbrev == DW_ABRV_INLINED_SUBROUTINE_RANGES {
putattr(ctxt, s.Info, abbrev, DW_FORM_sec_offset, DW_CLS_PTR, s.Ranges.Len(), s.Ranges)
PutRanges(ctxt, s.Ranges, s.StartPC, ic.Ranges)
} else {
st := ic.Ranges[0].Start
en := ic.Ranges[0].End
putattr(ctxt, s.Info, abbrev, DW_FORM_addr, DW_CLS_ADDRESS, st, s.StartPC)
putattr(ctxt, s.Info, abbrev, DW_FORM_addr, DW_CLS_ADDRESS, en, s.StartPC)
}
// Emit call file, line attrs.
ctxt.AddFileRef(s.Info, ic.CallFile)
putattr(ctxt, s.Info, abbrev, DW_FORM_udata, DW_CLS_CONSTANT, int64(ic.CallLine), nil)
// Variables associated with this inlined routine instance.
vars := ic.InlVars
sort.Sort(byChildIndex(vars))
inlIndex := ic.InlIndex
var encbuf [20]byte
for _, v := range vars {
if !v.IsInAbstract {
continue
}
putvar(ctxt, s, v, callee, abbrev, inlIndex, encbuf[:0])
}
// Children of this inline.
for _, sib := range inlChildren(callIdx, &s.InlCalls) {
absfn := s.InlCalls.Calls[sib].AbsFunSym
err := PutInlinedFunc(ctxt, s, absfn, sib)
if err != nil {
return err
}
}
Uleb128put(ctxt, s.Info, 0)
return nil
}
// Emit DWARF attributes and child DIEs for a 'concrete' subprogram,
// meaning the out-of-line copy of a function that was inlined at some
// point during the compilation of its containing package. The first
// attribute for a concrete DIE is a reference to the 'abstract' DIE
// for the function (which holds location-independent attributes such
// as name, type), then the remainder of the attributes are specific
// to this instance (location, frame base, etc).
func PutConcreteFunc(ctxt Context, s *FnState) error {
if logDwarf {
ctxt.Logf("PutConcreteFunc(%v)\n", s.Info)
}
abbrev := DW_ABRV_FUNCTION_CONCRETE
Uleb128put(ctxt, s.Info, int64(abbrev))
// Abstract origin.
putattr(ctxt, s.Info, abbrev, DW_FORM_ref_addr, DW_CLS_REFERENCE, 0, s.Absfn)
// Start/end PC.
putattr(ctxt, s.Info, abbrev, DW_FORM_addr, DW_CLS_ADDRESS, 0, s.StartPC)
putattr(ctxt, s.Info, abbrev, DW_FORM_addr, DW_CLS_ADDRESS, s.Size, s.StartPC)
// cfa / frame base
putattr(ctxt, s.Info, abbrev, DW_FORM_block1, DW_CLS_BLOCK, 1, []byte{DW_OP_call_frame_cfa})
// Scopes
if err := putPrunedScopes(ctxt, s, abbrev); err != nil {
return err
}
// Inlined subroutines.
for _, sib := range inlChildren(-1, &s.InlCalls) {
absfn := s.InlCalls.Calls[sib].AbsFunSym
err := PutInlinedFunc(ctxt, s, absfn, sib)
if err != nil {
return err
}
}
Uleb128put(ctxt, s.Info, 0)
return nil
}
// Emit DWARF attributes and child DIEs for a subprogram. Here
// 'default' implies that the function in question was not inlined
// when its containing package was compiled (hence there is no need to
// emit an abstract version for it to use as a base for inlined
// routine records).
func PutDefaultFunc(ctxt Context, s *FnState) error {
if logDwarf {
ctxt.Logf("PutDefaultFunc(%v)\n", s.Info)
}
abbrev := DW_ABRV_FUNCTION
Uleb128put(ctxt, s.Info, int64(abbrev))
putattr(ctxt, s.Info, DW_ABRV_FUNCTION, DW_FORM_string, DW_CLS_STRING, int64(len(s.Name)), s.Name)
putattr(ctxt, s.Info, abbrev, DW_FORM_addr, DW_CLS_ADDRESS, 0, s.StartPC)
putattr(ctxt, s.Info, abbrev, DW_FORM_addr, DW_CLS_ADDRESS, s.Size, s.StartPC)
putattr(ctxt, s.Info, abbrev, DW_FORM_block1, DW_CLS_BLOCK, 1, []byte{DW_OP_call_frame_cfa})
ctxt.AddFileRef(s.Info, s.Filesym)
var ev int64
if s.External {
ev = 1
}
putattr(ctxt, s.Info, abbrev, DW_FORM_flag, DW_CLS_FLAG, ev, 0)
// Scopes
if err := putPrunedScopes(ctxt, s, abbrev); err != nil {
return err
}
// Inlined subroutines.
for _, sib := range inlChildren(-1, &s.InlCalls) {
absfn := s.InlCalls.Calls[sib].AbsFunSym
err := PutInlinedFunc(ctxt, s, absfn, sib)
if err != nil {
return err
}
}
Uleb128put(ctxt, s.Info, 0)
return nil
}
func putscope(ctxt Context, s *FnState, scopes []Scope, curscope int32, fnabbrev int, encbuf []byte) int32 {
if logDwarf {
ctxt.Logf("putscope(%v,%d): vars:", s.Info, curscope)
for i, v := range scopes[curscope].Vars {
ctxt.Logf(" %d:%d:%s", i, v.ChildIndex, v.Name)
}
ctxt.Logf("\n")
}
for _, v := range scopes[curscope].Vars {
putvar(ctxt, s, v, s.Absfn, fnabbrev, -1, encbuf)
}
this := curscope
curscope++
for curscope < int32(len(scopes)) {
scope := scopes[curscope]
if scope.Parent != this {
return curscope
}
cmd/compile: optimize scope tracking 1. Detect and remove the markers of lexical scopes that don't contain any variables early in noder, instead of waiting until the end of DWARF generation. This saves memory by never allocating some of the markers and optimizes some of the algorithms that depend on the number of scopes. 2. Assign scopes to Progs by doing, for each Prog, a binary search over the markers array. This is faster, compared to sorting the Prog list because there are fewer markers than there are Progs. completed 15 of 15, estimated time remaining 0s (eta 2:30PM) name old time/op new time/op delta Template 274ms ± 5% 260ms ± 6% -4.91% (p=0.000 n=15+15) Unicode 126ms ± 5% 127ms ± 9% ~ (p=0.856 n=13+15) GoTypes 861ms ± 5% 857ms ± 4% ~ (p=0.595 n=15+15) Compiler 4.11s ± 4% 4.12s ± 5% ~ (p=1.000 n=15+15) SSA 10.7s ± 2% 10.9s ± 4% +2.01% (p=0.002 n=14+14) Flate 163ms ± 4% 166ms ± 9% ~ (p=0.134 n=14+15) GoParser 203ms ± 4% 205ms ± 6% ~ (p=0.461 n=15+15) Reflect 544ms ± 5% 549ms ± 4% ~ (p=0.174 n=15+15) Tar 249ms ± 9% 245ms ± 6% ~ (p=0.285 n=15+15) XML 286ms ± 4% 291ms ± 5% ~ (p=0.081 n=15+15) [Geo mean] 528ms 529ms +0.14% name old user-time/op new user-time/op delta Template 358ms ± 7% 354ms ± 5% ~ (p=0.242 n=14+15) Unicode 189ms ±11% 191ms ±10% ~ (p=0.438 n=15+15) GoTypes 1.15s ± 4% 1.14s ± 3% ~ (p=0.405 n=15+15) Compiler 5.36s ± 6% 5.35s ± 5% ~ (p=0.588 n=15+15) SSA 14.6s ± 3% 15.0s ± 4% +2.58% (p=0.000 n=15+15) Flate 214ms ±12% 216ms ± 8% ~ (p=0.539 n=15+15) GoParser 267ms ± 6% 270ms ± 5% ~ (p=0.569 n=15+15) Reflect 712ms ± 5% 709ms ± 4% ~ (p=0.894 n=15+15) Tar 329ms ± 8% 330ms ± 5% ~ (p=0.974 n=14+15) XML 371ms ± 3% 381ms ± 5% +2.85% (p=0.002 n=13+15) [Geo mean] 705ms 709ms +0.62% name old alloc/op new alloc/op delta Template 38.0MB ± 0% 38.4MB ± 0% +1.27% (p=0.000 n=15+14) Unicode 28.8MB ± 0% 28.8MB ± 0% +0.16% (p=0.000 n=15+14) GoTypes 112MB ± 0% 114MB ± 0% +1.64% (p=0.000 n=15+15) Compiler 465MB ± 0% 474MB ± 0% +1.91% (p=0.000 n=15+15) SSA 1.48GB ± 0% 1.53GB ± 0% +3.32% (p=0.000 n=15+15) Flate 24.3MB ± 0% 24.8MB ± 0% +1.77% (p=0.000 n=14+15) GoParser 30.7MB ± 0% 31.1MB ± 0% +1.27% (p=0.000 n=15+15) Reflect 76.3MB ± 0% 77.1MB ± 0% +1.03% (p=0.000 n=15+15) Tar 39.2MB ± 0% 39.6MB ± 0% +1.02% (p=0.000 n=13+15) XML 41.5MB ± 0% 42.1MB ± 0% +1.45% (p=0.000 n=15+15) [Geo mean] 77.5MB 78.7MB +1.48% name old allocs/op new allocs/op delta Template 385k ± 0% 387k ± 0% +0.54% (p=0.000 n=15+15) Unicode 342k ± 0% 343k ± 0% +0.10% (p=0.000 n=15+15) GoTypes 1.19M ± 0% 1.19M ± 0% +0.64% (p=0.000 n=14+15) Compiler 4.51M ± 0% 4.54M ± 0% +0.53% (p=0.000 n=15+15) SSA 12.2M ± 0% 12.4M ± 0% +1.16% (p=0.000 n=15+15) Flate 234k ± 0% 236k ± 0% +0.63% (p=0.000 n=14+15) GoParser 318k ± 0% 320k ± 0% +0.63% (p=0.000 n=15+15) Reflect 974k ± 0% 977k ± 0% +0.28% (p=0.000 n=15+15) Tar 395k ± 0% 397k ± 0% +0.38% (p=0.000 n=15+13) XML 404k ± 0% 407k ± 0% +0.55% (p=0.000 n=15+15) [Geo mean] 794k 799k +0.55% name old text-bytes new text-bytes delta HelloSize 680kB ± 0% 680kB ± 0% ~ (all equal) name old data-bytes new data-bytes delta HelloSize 9.62kB ± 0% 9.62kB ± 0% ~ (all equal) name old bss-bytes new bss-bytes delta HelloSize 125kB ± 0% 125kB ± 0% ~ (all equal) name old exe-bytes new exe-bytes delta HelloSize 1.11MB ± 0% 1.12MB ± 0% +1.11% (p=0.000 n=15+15) Change-Id: I95a0173ee28c52be1a4851d2a6e389529e74bf28 Reviewed-on: https://go-review.googlesource.com/95396 Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Heschi Kreinick <heschi@google.com>
2018-02-15 07:40:44 +01:00
if len(scopes[curscope].Vars) == 0 {
curscope = putscope(ctxt, s, scopes, curscope, fnabbrev, encbuf)
continue
}
if len(scope.Ranges) == 1 {
Uleb128put(ctxt, s.Info, DW_ABRV_LEXICAL_BLOCK_SIMPLE)
putattr(ctxt, s.Info, DW_ABRV_LEXICAL_BLOCK_SIMPLE, DW_FORM_addr, DW_CLS_ADDRESS, scope.Ranges[0].Start, s.StartPC)
putattr(ctxt, s.Info, DW_ABRV_LEXICAL_BLOCK_SIMPLE, DW_FORM_addr, DW_CLS_ADDRESS, scope.Ranges[0].End, s.StartPC)
} else {
Uleb128put(ctxt, s.Info, DW_ABRV_LEXICAL_BLOCK_RANGES)
putattr(ctxt, s.Info, DW_ABRV_LEXICAL_BLOCK_RANGES, DW_FORM_sec_offset, DW_CLS_PTR, s.Ranges.Len(), s.Ranges)
PutRanges(ctxt, s.Ranges, s.StartPC, scope.Ranges)
}
curscope = putscope(ctxt, s, scopes, curscope, fnabbrev, encbuf)
cmd/compile: optimize scope tracking 1. Detect and remove the markers of lexical scopes that don't contain any variables early in noder, instead of waiting until the end of DWARF generation. This saves memory by never allocating some of the markers and optimizes some of the algorithms that depend on the number of scopes. 2. Assign scopes to Progs by doing, for each Prog, a binary search over the markers array. This is faster, compared to sorting the Prog list because there are fewer markers than there are Progs. completed 15 of 15, estimated time remaining 0s (eta 2:30PM) name old time/op new time/op delta Template 274ms ± 5% 260ms ± 6% -4.91% (p=0.000 n=15+15) Unicode 126ms ± 5% 127ms ± 9% ~ (p=0.856 n=13+15) GoTypes 861ms ± 5% 857ms ± 4% ~ (p=0.595 n=15+15) Compiler 4.11s ± 4% 4.12s ± 5% ~ (p=1.000 n=15+15) SSA 10.7s ± 2% 10.9s ± 4% +2.01% (p=0.002 n=14+14) Flate 163ms ± 4% 166ms ± 9% ~ (p=0.134 n=14+15) GoParser 203ms ± 4% 205ms ± 6% ~ (p=0.461 n=15+15) Reflect 544ms ± 5% 549ms ± 4% ~ (p=0.174 n=15+15) Tar 249ms ± 9% 245ms ± 6% ~ (p=0.285 n=15+15) XML 286ms ± 4% 291ms ± 5% ~ (p=0.081 n=15+15) [Geo mean] 528ms 529ms +0.14% name old user-time/op new user-time/op delta Template 358ms ± 7% 354ms ± 5% ~ (p=0.242 n=14+15) Unicode 189ms ±11% 191ms ±10% ~ (p=0.438 n=15+15) GoTypes 1.15s ± 4% 1.14s ± 3% ~ (p=0.405 n=15+15) Compiler 5.36s ± 6% 5.35s ± 5% ~ (p=0.588 n=15+15) SSA 14.6s ± 3% 15.0s ± 4% +2.58% (p=0.000 n=15+15) Flate 214ms ±12% 216ms ± 8% ~ (p=0.539 n=15+15) GoParser 267ms ± 6% 270ms ± 5% ~ (p=0.569 n=15+15) Reflect 712ms ± 5% 709ms ± 4% ~ (p=0.894 n=15+15) Tar 329ms ± 8% 330ms ± 5% ~ (p=0.974 n=14+15) XML 371ms ± 3% 381ms ± 5% +2.85% (p=0.002 n=13+15) [Geo mean] 705ms 709ms +0.62% name old alloc/op new alloc/op delta Template 38.0MB ± 0% 38.4MB ± 0% +1.27% (p=0.000 n=15+14) Unicode 28.8MB ± 0% 28.8MB ± 0% +0.16% (p=0.000 n=15+14) GoTypes 112MB ± 0% 114MB ± 0% +1.64% (p=0.000 n=15+15) Compiler 465MB ± 0% 474MB ± 0% +1.91% (p=0.000 n=15+15) SSA 1.48GB ± 0% 1.53GB ± 0% +3.32% (p=0.000 n=15+15) Flate 24.3MB ± 0% 24.8MB ± 0% +1.77% (p=0.000 n=14+15) GoParser 30.7MB ± 0% 31.1MB ± 0% +1.27% (p=0.000 n=15+15) Reflect 76.3MB ± 0% 77.1MB ± 0% +1.03% (p=0.000 n=15+15) Tar 39.2MB ± 0% 39.6MB ± 0% +1.02% (p=0.000 n=13+15) XML 41.5MB ± 0% 42.1MB ± 0% +1.45% (p=0.000 n=15+15) [Geo mean] 77.5MB 78.7MB +1.48% name old allocs/op new allocs/op delta Template 385k ± 0% 387k ± 0% +0.54% (p=0.000 n=15+15) Unicode 342k ± 0% 343k ± 0% +0.10% (p=0.000 n=15+15) GoTypes 1.19M ± 0% 1.19M ± 0% +0.64% (p=0.000 n=14+15) Compiler 4.51M ± 0% 4.54M ± 0% +0.53% (p=0.000 n=15+15) SSA 12.2M ± 0% 12.4M ± 0% +1.16% (p=0.000 n=15+15) Flate 234k ± 0% 236k ± 0% +0.63% (p=0.000 n=14+15) GoParser 318k ± 0% 320k ± 0% +0.63% (p=0.000 n=15+15) Reflect 974k ± 0% 977k ± 0% +0.28% (p=0.000 n=15+15) Tar 395k ± 0% 397k ± 0% +0.38% (p=0.000 n=15+13) XML 404k ± 0% 407k ± 0% +0.55% (p=0.000 n=15+15) [Geo mean] 794k 799k +0.55% name old text-bytes new text-bytes delta HelloSize 680kB ± 0% 680kB ± 0% ~ (all equal) name old data-bytes new data-bytes delta HelloSize 9.62kB ± 0% 9.62kB ± 0% ~ (all equal) name old bss-bytes new bss-bytes delta HelloSize 125kB ± 0% 125kB ± 0% ~ (all equal) name old exe-bytes new exe-bytes delta HelloSize 1.11MB ± 0% 1.12MB ± 0% +1.11% (p=0.000 n=15+15) Change-Id: I95a0173ee28c52be1a4851d2a6e389529e74bf28 Reviewed-on: https://go-review.googlesource.com/95396 Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Heschi Kreinick <heschi@google.com>
2018-02-15 07:40:44 +01:00
Uleb128put(ctxt, s.Info, 0)
}
return curscope
}
// Given a default var abbrev code, select corresponding concrete code.
func concreteVarAbbrev(varAbbrev int) int {
switch varAbbrev {
case DW_ABRV_AUTO:
return DW_ABRV_AUTO_CONCRETE
case DW_ABRV_PARAM:
return DW_ABRV_PARAM_CONCRETE
case DW_ABRV_AUTO_LOCLIST:
return DW_ABRV_AUTO_CONCRETE_LOCLIST
case DW_ABRV_PARAM_LOCLIST:
return DW_ABRV_PARAM_CONCRETE_LOCLIST
default:
panic("should never happen")
}
}
// Pick the correct abbrev code for variable or parameter DIE.
func determineVarAbbrev(v *Var, fnabbrev int) (int, bool, bool) {
abbrev := v.Abbrev
cmd/compile: cover control flow insns in location lists The information that's used to generate DWARF location lists is very ssa.Value centric; it uses Values as start and end coordinates to define ranges. That mostly works fine, but control flow instructions don't come from Values, so the ranges couldn't cover them. Control flow instructions are generated when the SSA representation is converted to assembly, so that's the best place to extend the ranges to cover them. (Before that, there's nothing to refer to, and afterward the boundaries between blocks have been lost.) That requires block information in the debugInfo type, which then flows down to make everything else awkward. On the plus side, there's a little less copying slices around than there used to be, so it should be a little faster. Previously, the ranges for empty blocks were not very meaningful. That was fine, because they had no Values to cover, so no debug information was generated for them. But they do have control flow instructions (that's why they exist) and so now it's important that the information be correct. Introduce two sentinel values, BlockStart and BlockEnd, that denote the boundary of a block, even if the block is empty. BlockEnd replaces the previous SurvivedBlock flag. There's one more problem: the last instruction in the function will be a control flow instruction, so any live ranges need to be extended past it. But there's no instruction after it to use as the end of the range. Instead, leave the EndProg field of those ranges as nil and fix it up to point to past the end of the assembled text at the very last moment. Change-Id: I81f884020ff36fd6fe8d7888fc57c99412c4245b Reviewed-on: https://go-review.googlesource.com/63010 Reviewed-by: Alessandro Arzilli <alessandro.arzilli@gmail.com> Reviewed-by: David Chase <drchase@google.com> Run-TryBot: Heschi Kreinick <heschi@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-09-11 14:28:34 -04:00
// If the variable was entirely optimized out, don't emit a location list;
// convert to an inline abbreviation and emit an empty location.
missing := false
switch {
cmd/compile: reimplement location list generation Completely redesign and reimplement location list generation to be more efficient, and hopefully not too hard to understand. RegKills are gone. Instead of using the regalloc's liveness calculations, redo them using the Ops' clobber information. Besides saving a lot of Values, this avoids adding RegKills to blocks that would be empty otherwise, which was messing up optimizations. This does mean that it's much harder to tell whether the generation process is buggy (there's nothing to cross-check it with), and there may be disagreements with GC liveness. But the performance gain is significant, and it's nice not to be messing with earlier compiler phases. The intermediate representations are gone. Instead of producing ssa.BlockDebugs, then dwarf.LocationLists, and then finally real location lists, go directly from the SSA to a (mostly) real location list. Because the SSA analysis happens before assembly, it stores encoded block/value IDs where PCs would normally go. It would be easier to do the SSA analysis after assembly, but I didn't want to retain the SSA just for that. Generation proceeds in two phases: first, it traverses the function in CFG order, storing the state of the block at the beginning and end. End states are used to produce the start states of the successor blocks. In the second phase, it traverses in program text order and produces the location lists. The processing in the second phase is redundant, but much cheaper than storing the intermediate representation. It might be possible to combine the two phases somewhat to take advantage of cases where the CFG matches the block layout, but I haven't tried. Location lists are finalized by adding a base address selection entry, translating each encoded block/value ID to a real PC, and adding the terminating zero entry. This probably won't work on OSX, where dsymutil will choke on the base address selection. I tried emitting CU-relative relocations for each address, and it was *very* bad for performance -- it uses more memory storing all the relocations than it does for the actual location list bytes. I think I'm going to end up synthesizing the relocations in the linker only on OSX, but TBD. TestNexting needs updating: with more optimizations working, the debugger doesn't stop on the continue (line 88) any more, and the test's duplicate suppression kicks in. Also, dx and dy live a little longer now, but they have the correct values. Change-Id: Ie772dfe23a4e389ca573624fac4d05401ae32307 Reviewed-on: https://go-review.googlesource.com/89356 Run-TryBot: Heschi Kreinick <heschi@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
2017-10-26 15:40:17 -04:00
case abbrev == DW_ABRV_AUTO_LOCLIST && v.PutLocationList == nil:
cmd/compile: cover control flow insns in location lists The information that's used to generate DWARF location lists is very ssa.Value centric; it uses Values as start and end coordinates to define ranges. That mostly works fine, but control flow instructions don't come from Values, so the ranges couldn't cover them. Control flow instructions are generated when the SSA representation is converted to assembly, so that's the best place to extend the ranges to cover them. (Before that, there's nothing to refer to, and afterward the boundaries between blocks have been lost.) That requires block information in the debugInfo type, which then flows down to make everything else awkward. On the plus side, there's a little less copying slices around than there used to be, so it should be a little faster. Previously, the ranges for empty blocks were not very meaningful. That was fine, because they had no Values to cover, so no debug information was generated for them. But they do have control flow instructions (that's why they exist) and so now it's important that the information be correct. Introduce two sentinel values, BlockStart and BlockEnd, that denote the boundary of a block, even if the block is empty. BlockEnd replaces the previous SurvivedBlock flag. There's one more problem: the last instruction in the function will be a control flow instruction, so any live ranges need to be extended past it. But there's no instruction after it to use as the end of the range. Instead, leave the EndProg field of those ranges as nil and fix it up to point to past the end of the assembled text at the very last moment. Change-Id: I81f884020ff36fd6fe8d7888fc57c99412c4245b Reviewed-on: https://go-review.googlesource.com/63010 Reviewed-by: Alessandro Arzilli <alessandro.arzilli@gmail.com> Reviewed-by: David Chase <drchase@google.com> Run-TryBot: Heschi Kreinick <heschi@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-09-11 14:28:34 -04:00
missing = true
abbrev = DW_ABRV_AUTO
cmd/compile: reimplement location list generation Completely redesign and reimplement location list generation to be more efficient, and hopefully not too hard to understand. RegKills are gone. Instead of using the regalloc's liveness calculations, redo them using the Ops' clobber information. Besides saving a lot of Values, this avoids adding RegKills to blocks that would be empty otherwise, which was messing up optimizations. This does mean that it's much harder to tell whether the generation process is buggy (there's nothing to cross-check it with), and there may be disagreements with GC liveness. But the performance gain is significant, and it's nice not to be messing with earlier compiler phases. The intermediate representations are gone. Instead of producing ssa.BlockDebugs, then dwarf.LocationLists, and then finally real location lists, go directly from the SSA to a (mostly) real location list. Because the SSA analysis happens before assembly, it stores encoded block/value IDs where PCs would normally go. It would be easier to do the SSA analysis after assembly, but I didn't want to retain the SSA just for that. Generation proceeds in two phases: first, it traverses the function in CFG order, storing the state of the block at the beginning and end. End states are used to produce the start states of the successor blocks. In the second phase, it traverses in program text order and produces the location lists. The processing in the second phase is redundant, but much cheaper than storing the intermediate representation. It might be possible to combine the two phases somewhat to take advantage of cases where the CFG matches the block layout, but I haven't tried. Location lists are finalized by adding a base address selection entry, translating each encoded block/value ID to a real PC, and adding the terminating zero entry. This probably won't work on OSX, where dsymutil will choke on the base address selection. I tried emitting CU-relative relocations for each address, and it was *very* bad for performance -- it uses more memory storing all the relocations than it does for the actual location list bytes. I think I'm going to end up synthesizing the relocations in the linker only on OSX, but TBD. TestNexting needs updating: with more optimizations working, the debugger doesn't stop on the continue (line 88) any more, and the test's duplicate suppression kicks in. Also, dx and dy live a little longer now, but they have the correct values. Change-Id: Ie772dfe23a4e389ca573624fac4d05401ae32307 Reviewed-on: https://go-review.googlesource.com/89356 Run-TryBot: Heschi Kreinick <heschi@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
2017-10-26 15:40:17 -04:00
case abbrev == DW_ABRV_PARAM_LOCLIST && v.PutLocationList == nil:
cmd/compile: cover control flow insns in location lists The information that's used to generate DWARF location lists is very ssa.Value centric; it uses Values as start and end coordinates to define ranges. That mostly works fine, but control flow instructions don't come from Values, so the ranges couldn't cover them. Control flow instructions are generated when the SSA representation is converted to assembly, so that's the best place to extend the ranges to cover them. (Before that, there's nothing to refer to, and afterward the boundaries between blocks have been lost.) That requires block information in the debugInfo type, which then flows down to make everything else awkward. On the plus side, there's a little less copying slices around than there used to be, so it should be a little faster. Previously, the ranges for empty blocks were not very meaningful. That was fine, because they had no Values to cover, so no debug information was generated for them. But they do have control flow instructions (that's why they exist) and so now it's important that the information be correct. Introduce two sentinel values, BlockStart and BlockEnd, that denote the boundary of a block, even if the block is empty. BlockEnd replaces the previous SurvivedBlock flag. There's one more problem: the last instruction in the function will be a control flow instruction, so any live ranges need to be extended past it. But there's no instruction after it to use as the end of the range. Instead, leave the EndProg field of those ranges as nil and fix it up to point to past the end of the assembled text at the very last moment. Change-Id: I81f884020ff36fd6fe8d7888fc57c99412c4245b Reviewed-on: https://go-review.googlesource.com/63010 Reviewed-by: Alessandro Arzilli <alessandro.arzilli@gmail.com> Reviewed-by: David Chase <drchase@google.com> Run-TryBot: Heschi Kreinick <heschi@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-09-11 14:28:34 -04:00
missing = true
abbrev = DW_ABRV_PARAM
cmd/compile: cover control flow insns in location lists The information that's used to generate DWARF location lists is very ssa.Value centric; it uses Values as start and end coordinates to define ranges. That mostly works fine, but control flow instructions don't come from Values, so the ranges couldn't cover them. Control flow instructions are generated when the SSA representation is converted to assembly, so that's the best place to extend the ranges to cover them. (Before that, there's nothing to refer to, and afterward the boundaries between blocks have been lost.) That requires block information in the debugInfo type, which then flows down to make everything else awkward. On the plus side, there's a little less copying slices around than there used to be, so it should be a little faster. Previously, the ranges for empty blocks were not very meaningful. That was fine, because they had no Values to cover, so no debug information was generated for them. But they do have control flow instructions (that's why they exist) and so now it's important that the information be correct. Introduce two sentinel values, BlockStart and BlockEnd, that denote the boundary of a block, even if the block is empty. BlockEnd replaces the previous SurvivedBlock flag. There's one more problem: the last instruction in the function will be a control flow instruction, so any live ranges need to be extended past it. But there's no instruction after it to use as the end of the range. Instead, leave the EndProg field of those ranges as nil and fix it up to point to past the end of the assembled text at the very last moment. Change-Id: I81f884020ff36fd6fe8d7888fc57c99412c4245b Reviewed-on: https://go-review.googlesource.com/63010 Reviewed-by: Alessandro Arzilli <alessandro.arzilli@gmail.com> Reviewed-by: David Chase <drchase@google.com> Run-TryBot: Heschi Kreinick <heschi@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-09-11 14:28:34 -04:00
}
// Determine whether to use a concrete variable or regular variable DIE.
concrete := true
switch fnabbrev {
case DW_ABRV_FUNCTION:
concrete = false
break
case DW_ABRV_FUNCTION_CONCRETE:
// If we're emitting a concrete subprogram DIE and the variable
// in question is not part of the corresponding abstract function DIE,
// then use the default (non-concrete) abbrev for this param.
if !v.IsInAbstract {
concrete = false
}
case DW_ABRV_INLINED_SUBROUTINE, DW_ABRV_INLINED_SUBROUTINE_RANGES:
default:
panic("should never happen")
}
// Select proper abbrev based on concrete/non-concrete
if concrete {
abbrev = concreteVarAbbrev(abbrev)
}
return abbrev, missing, concrete
}
func abbrevUsesLoclist(abbrev int) bool {
switch abbrev {
case DW_ABRV_AUTO_LOCLIST, DW_ABRV_AUTO_CONCRETE_LOCLIST,
DW_ABRV_PARAM_LOCLIST, DW_ABRV_PARAM_CONCRETE_LOCLIST:
return true
default:
return false
}
}
// Emit DWARF attributes for a variable belonging to an 'abstract' subprogram.
func putAbstractVar(ctxt Context, info Sym, v *Var) {
// Remap abbrev
abbrev := v.Abbrev
switch abbrev {
case DW_ABRV_AUTO, DW_ABRV_AUTO_LOCLIST:
abbrev = DW_ABRV_AUTO_ABSTRACT
case DW_ABRV_PARAM, DW_ABRV_PARAM_LOCLIST:
abbrev = DW_ABRV_PARAM_ABSTRACT
}
Uleb128put(ctxt, info, int64(abbrev))
putattr(ctxt, info, abbrev, DW_FORM_string, DW_CLS_STRING, int64(len(v.Name)), v.Name)
// Isreturn attribute if this is a param
if abbrev == DW_ABRV_PARAM_ABSTRACT {
var isReturn int64
if v.IsReturnValue {
isReturn = 1
}
putattr(ctxt, info, abbrev, DW_FORM_flag, DW_CLS_FLAG, isReturn, nil)
}
// Line
if abbrev != DW_ABRV_PARAM_ABSTRACT {
// See issue 23374 for more on why decl line is skipped for abs params.
putattr(ctxt, info, abbrev, DW_FORM_udata, DW_CLS_CONSTANT, int64(v.DeclLine), nil)
}
// Type
putattr(ctxt, info, abbrev, DW_FORM_ref_addr, DW_CLS_REFERENCE, 0, v.Type)
// Var has no children => no terminator
}
func putvar(ctxt Context, s *FnState, v *Var, absfn Sym, fnabbrev, inlIndex int, encbuf []byte) {
// Remap abbrev according to parent DIE abbrev
abbrev, missing, concrete := determineVarAbbrev(v, fnabbrev)
Uleb128put(ctxt, s.Info, int64(abbrev))
// Abstract origin for concrete / inlined case
if concrete {
// Here we are making a reference to a child DIE of an abstract
// function subprogram DIE. The child DIE has no LSym, so instead
// after the call to 'putattr' below we make a call to register
// the child DIE reference.
putattr(ctxt, s.Info, abbrev, DW_FORM_ref_addr, DW_CLS_REFERENCE, 0, absfn)
ctxt.RecordDclReference(s.Info, absfn, int(v.ChildIndex), inlIndex)
} else {
// Var name, line for abstract and default cases
n := v.Name
putattr(ctxt, s.Info, abbrev, DW_FORM_string, DW_CLS_STRING, int64(len(n)), n)
if abbrev == DW_ABRV_PARAM || abbrev == DW_ABRV_PARAM_LOCLIST || abbrev == DW_ABRV_PARAM_ABSTRACT {
var isReturn int64
if v.IsReturnValue {
isReturn = 1
}
putattr(ctxt, s.Info, abbrev, DW_FORM_flag, DW_CLS_FLAG, isReturn, nil)
}
putattr(ctxt, s.Info, abbrev, DW_FORM_udata, DW_CLS_CONSTANT, int64(v.DeclLine), nil)
putattr(ctxt, s.Info, abbrev, DW_FORM_ref_addr, DW_CLS_REFERENCE, 0, v.Type)
}
if abbrevUsesLoclist(abbrev) {
putattr(ctxt, s.Info, abbrev, DW_FORM_sec_offset, DW_CLS_PTR, s.Loc.Len(), s.Loc)
cmd/compile: reimplement location list generation Completely redesign and reimplement location list generation to be more efficient, and hopefully not too hard to understand. RegKills are gone. Instead of using the regalloc's liveness calculations, redo them using the Ops' clobber information. Besides saving a lot of Values, this avoids adding RegKills to blocks that would be empty otherwise, which was messing up optimizations. This does mean that it's much harder to tell whether the generation process is buggy (there's nothing to cross-check it with), and there may be disagreements with GC liveness. But the performance gain is significant, and it's nice not to be messing with earlier compiler phases. The intermediate representations are gone. Instead of producing ssa.BlockDebugs, then dwarf.LocationLists, and then finally real location lists, go directly from the SSA to a (mostly) real location list. Because the SSA analysis happens before assembly, it stores encoded block/value IDs where PCs would normally go. It would be easier to do the SSA analysis after assembly, but I didn't want to retain the SSA just for that. Generation proceeds in two phases: first, it traverses the function in CFG order, storing the state of the block at the beginning and end. End states are used to produce the start states of the successor blocks. In the second phase, it traverses in program text order and produces the location lists. The processing in the second phase is redundant, but much cheaper than storing the intermediate representation. It might be possible to combine the two phases somewhat to take advantage of cases where the CFG matches the block layout, but I haven't tried. Location lists are finalized by adding a base address selection entry, translating each encoded block/value ID to a real PC, and adding the terminating zero entry. This probably won't work on OSX, where dsymutil will choke on the base address selection. I tried emitting CU-relative relocations for each address, and it was *very* bad for performance -- it uses more memory storing all the relocations than it does for the actual location list bytes. I think I'm going to end up synthesizing the relocations in the linker only on OSX, but TBD. TestNexting needs updating: with more optimizations working, the debugger doesn't stop on the continue (line 88) any more, and the test's duplicate suppression kicks in. Also, dx and dy live a little longer now, but they have the correct values. Change-Id: Ie772dfe23a4e389ca573624fac4d05401ae32307 Reviewed-on: https://go-review.googlesource.com/89356 Run-TryBot: Heschi Kreinick <heschi@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>
2017-10-26 15:40:17 -04:00
v.PutLocationList(s.Loc, s.StartPC)
[dev.debug] cmd/compile: better DWARF with optimizations on Debuggers use DWARF information to find local variables on the stack and in registers. Prior to this CL, the DWARF information for functions claimed that all variables were on the stack at all times. That's incorrect when optimizations are enabled, and results in debuggers showing data that is out of date or complete gibberish. After this CL, the compiler is capable of representing variable locations more accurately, and attempts to do so. Due to limitations of the SSA backend, it's not possible to be completely correct. There are a number of problems in the current design. One of the easier to understand is that variable names currently must be attached to an SSA value, but not all assignments in the source code actually result in machine code. For example: type myint int var a int b := myint(int) and b := (*uint64)(unsafe.Pointer(a)) don't generate machine code because the underlying representation is the same, so the correct value of b will not be set when the user would expect. Generating the more precise debug information is behind a flag, dwarflocationlists. Because of the issues described above, setting the flag may not make the debugging experience much better, and may actually make it worse in cases where the variable actually is on the stack and the more complicated analysis doesn't realize it. A number of changes are included: - Add a new pseudo-instruction, RegKill, which indicates that the value in the register has been clobbered. - Adjust regalloc to emit RegKills in the right places. Significantly, this means that phis are mixed with StoreReg and RegKills after regalloc. - Track variable decomposition in ssa.LocalSlots. - After the SSA backend is done, analyze the result and build location lists for each LocalSlot. - After assembly is done, update the location lists with the assembled PC offsets, recompose variables, and build DWARF location lists. Emit the list as a new linker symbol, one per function. - In the linker, aggregate the location lists into a .debug_loc section. TODO: - currently disabled for non-X86/AMD64 because there are no data tables. go build -toolexec 'toolstash -cmp' -a std succeeds. With -dwarflocationlists false: before: f02812195637909ff675782c0b46836a8ff01976 after: 06f61e8112a42ac34fb80e0c818b3cdb84a5e7ec benchstat -geomean /tmp/220352263 /tmp/621364410 completed 15 of 15, estimated time remaining 0s (eta 3:52PM) name old time/op new time/op delta Template 199ms ± 3% 198ms ± 2% ~ (p=0.400 n=15+14) Unicode 96.6ms ± 5% 96.4ms ± 5% ~ (p=0.838 n=15+15) GoTypes 653ms ± 2% 647ms ± 2% ~ (p=0.102 n=15+14) Flate 133ms ± 6% 129ms ± 3% -2.62% (p=0.041 n=15+15) GoParser 164ms ± 5% 159ms ± 3% -3.05% (p=0.000 n=15+15) Reflect 428ms ± 4% 422ms ± 3% ~ (p=0.156 n=15+13) Tar 123ms ±10% 124ms ± 8% ~ (p=0.461 n=15+15) XML 228ms ± 3% 224ms ± 3% -1.57% (p=0.045 n=15+15) [Geo mean] 206ms 377ms +82.86% name old user-time/op new user-time/op delta Template 292ms ±10% 301ms ±12% ~ (p=0.189 n=15+15) Unicode 166ms ±37% 158ms ±14% ~ (p=0.418 n=15+14) GoTypes 962ms ± 6% 963ms ± 7% ~ (p=0.976 n=15+15) Flate 207ms ±19% 200ms ±14% ~ (p=0.345 n=14+15) GoParser 246ms ±22% 240ms ±15% ~ (p=0.587 n=15+15) Reflect 611ms ±13% 587ms ±14% ~ (p=0.085 n=15+13) Tar 211ms ±12% 217ms ±14% ~ (p=0.355 n=14+15) XML 335ms ±15% 320ms ±18% ~ (p=0.169 n=15+15) [Geo mean] 317ms 583ms +83.72% name old alloc/op new alloc/op delta Template 40.2MB ± 0% 40.2MB ± 0% -0.15% (p=0.000 n=14+15) Unicode 29.2MB ± 0% 29.3MB ± 0% ~ (p=0.624 n=15+15) GoTypes 114MB ± 0% 114MB ± 0% -0.15% (p=0.000 n=15+14) Flate 25.7MB ± 0% 25.6MB ± 0% -0.18% (p=0.000 n=13+15) GoParser 32.2MB ± 0% 32.2MB ± 0% -0.14% (p=0.003 n=15+15) Reflect 77.8MB ± 0% 77.9MB ± 0% ~ (p=0.061 n=15+15) Tar 27.1MB ± 0% 27.0MB ± 0% -0.11% (p=0.029 n=15+15) XML 42.7MB ± 0% 42.5MB ± 0% -0.29% (p=0.000 n=15+15) [Geo mean] 42.1MB 75.0MB +78.05% name old allocs/op new allocs/op delta Template 402k ± 1% 398k ± 0% -0.91% (p=0.000 n=15+15) Unicode 344k ± 1% 344k ± 0% ~ (p=0.715 n=15+14) GoTypes 1.18M ± 0% 1.17M ± 0% -0.91% (p=0.000 n=15+14) Flate 243k ± 0% 240k ± 1% -1.05% (p=0.000 n=13+15) GoParser 327k ± 1% 324k ± 1% -0.96% (p=0.000 n=15+15) Reflect 984k ± 1% 982k ± 0% ~ (p=0.050 n=15+15) Tar 261k ± 1% 259k ± 1% -0.77% (p=0.000 n=15+15) XML 411k ± 0% 404k ± 1% -1.55% (p=0.000 n=15+15) [Geo mean] 439k 755k +72.01% name old text-bytes new text-bytes delta HelloSize 694kB ± 0% 694kB ± 0% -0.00% (p=0.000 n=15+15) name old data-bytes new data-bytes delta HelloSize 5.55kB ± 0% 5.55kB ± 0% ~ (all equal) name old bss-bytes new bss-bytes delta HelloSize 133kB ± 0% 133kB ± 0% ~ (all equal) name old exe-bytes new exe-bytes delta HelloSize 1.04MB ± 0% 1.04MB ± 0% ~ (all equal) Change-Id: I991fc553ef175db46bb23b2128317bbd48de70d8 Reviewed-on: https://go-review.googlesource.com/41770 Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-07-21 18:30:19 -04:00
} else {
loc := encbuf[:0]
cmd/compile: cover control flow insns in location lists The information that's used to generate DWARF location lists is very ssa.Value centric; it uses Values as start and end coordinates to define ranges. That mostly works fine, but control flow instructions don't come from Values, so the ranges couldn't cover them. Control flow instructions are generated when the SSA representation is converted to assembly, so that's the best place to extend the ranges to cover them. (Before that, there's nothing to refer to, and afterward the boundaries between blocks have been lost.) That requires block information in the debugInfo type, which then flows down to make everything else awkward. On the plus side, there's a little less copying slices around than there used to be, so it should be a little faster. Previously, the ranges for empty blocks were not very meaningful. That was fine, because they had no Values to cover, so no debug information was generated for them. But they do have control flow instructions (that's why they exist) and so now it's important that the information be correct. Introduce two sentinel values, BlockStart and BlockEnd, that denote the boundary of a block, even if the block is empty. BlockEnd replaces the previous SurvivedBlock flag. There's one more problem: the last instruction in the function will be a control flow instruction, so any live ranges need to be extended past it. But there's no instruction after it to use as the end of the range. Instead, leave the EndProg field of those ranges as nil and fix it up to point to past the end of the assembled text at the very last moment. Change-Id: I81f884020ff36fd6fe8d7888fc57c99412c4245b Reviewed-on: https://go-review.googlesource.com/63010 Reviewed-by: Alessandro Arzilli <alessandro.arzilli@gmail.com> Reviewed-by: David Chase <drchase@google.com> Run-TryBot: Heschi Kreinick <heschi@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-09-11 14:28:34 -04:00
switch {
case missing:
break // no location
case v.StackOffset == 0:
loc = append(loc, DW_OP_call_frame_cfa)
cmd/compile: cover control flow insns in location lists The information that's used to generate DWARF location lists is very ssa.Value centric; it uses Values as start and end coordinates to define ranges. That mostly works fine, but control flow instructions don't come from Values, so the ranges couldn't cover them. Control flow instructions are generated when the SSA representation is converted to assembly, so that's the best place to extend the ranges to cover them. (Before that, there's nothing to refer to, and afterward the boundaries between blocks have been lost.) That requires block information in the debugInfo type, which then flows down to make everything else awkward. On the plus side, there's a little less copying slices around than there used to be, so it should be a little faster. Previously, the ranges for empty blocks were not very meaningful. That was fine, because they had no Values to cover, so no debug information was generated for them. But they do have control flow instructions (that's why they exist) and so now it's important that the information be correct. Introduce two sentinel values, BlockStart and BlockEnd, that denote the boundary of a block, even if the block is empty. BlockEnd replaces the previous SurvivedBlock flag. There's one more problem: the last instruction in the function will be a control flow instruction, so any live ranges need to be extended past it. But there's no instruction after it to use as the end of the range. Instead, leave the EndProg field of those ranges as nil and fix it up to point to past the end of the assembled text at the very last moment. Change-Id: I81f884020ff36fd6fe8d7888fc57c99412c4245b Reviewed-on: https://go-review.googlesource.com/63010 Reviewed-by: Alessandro Arzilli <alessandro.arzilli@gmail.com> Reviewed-by: David Chase <drchase@google.com> Run-TryBot: Heschi Kreinick <heschi@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-09-11 14:28:34 -04:00
default:
loc = append(loc, DW_OP_fbreg)
[dev.debug] cmd/compile: better DWARF with optimizations on Debuggers use DWARF information to find local variables on the stack and in registers. Prior to this CL, the DWARF information for functions claimed that all variables were on the stack at all times. That's incorrect when optimizations are enabled, and results in debuggers showing data that is out of date or complete gibberish. After this CL, the compiler is capable of representing variable locations more accurately, and attempts to do so. Due to limitations of the SSA backend, it's not possible to be completely correct. There are a number of problems in the current design. One of the easier to understand is that variable names currently must be attached to an SSA value, but not all assignments in the source code actually result in machine code. For example: type myint int var a int b := myint(int) and b := (*uint64)(unsafe.Pointer(a)) don't generate machine code because the underlying representation is the same, so the correct value of b will not be set when the user would expect. Generating the more precise debug information is behind a flag, dwarflocationlists. Because of the issues described above, setting the flag may not make the debugging experience much better, and may actually make it worse in cases where the variable actually is on the stack and the more complicated analysis doesn't realize it. A number of changes are included: - Add a new pseudo-instruction, RegKill, which indicates that the value in the register has been clobbered. - Adjust regalloc to emit RegKills in the right places. Significantly, this means that phis are mixed with StoreReg and RegKills after regalloc. - Track variable decomposition in ssa.LocalSlots. - After the SSA backend is done, analyze the result and build location lists for each LocalSlot. - After assembly is done, update the location lists with the assembled PC offsets, recompose variables, and build DWARF location lists. Emit the list as a new linker symbol, one per function. - In the linker, aggregate the location lists into a .debug_loc section. TODO: - currently disabled for non-X86/AMD64 because there are no data tables. go build -toolexec 'toolstash -cmp' -a std succeeds. With -dwarflocationlists false: before: f02812195637909ff675782c0b46836a8ff01976 after: 06f61e8112a42ac34fb80e0c818b3cdb84a5e7ec benchstat -geomean /tmp/220352263 /tmp/621364410 completed 15 of 15, estimated time remaining 0s (eta 3:52PM) name old time/op new time/op delta Template 199ms ± 3% 198ms ± 2% ~ (p=0.400 n=15+14) Unicode 96.6ms ± 5% 96.4ms ± 5% ~ (p=0.838 n=15+15) GoTypes 653ms ± 2% 647ms ± 2% ~ (p=0.102 n=15+14) Flate 133ms ± 6% 129ms ± 3% -2.62% (p=0.041 n=15+15) GoParser 164ms ± 5% 159ms ± 3% -3.05% (p=0.000 n=15+15) Reflect 428ms ± 4% 422ms ± 3% ~ (p=0.156 n=15+13) Tar 123ms ±10% 124ms ± 8% ~ (p=0.461 n=15+15) XML 228ms ± 3% 224ms ± 3% -1.57% (p=0.045 n=15+15) [Geo mean] 206ms 377ms +82.86% name old user-time/op new user-time/op delta Template 292ms ±10% 301ms ±12% ~ (p=0.189 n=15+15) Unicode 166ms ±37% 158ms ±14% ~ (p=0.418 n=15+14) GoTypes 962ms ± 6% 963ms ± 7% ~ (p=0.976 n=15+15) Flate 207ms ±19% 200ms ±14% ~ (p=0.345 n=14+15) GoParser 246ms ±22% 240ms ±15% ~ (p=0.587 n=15+15) Reflect 611ms ±13% 587ms ±14% ~ (p=0.085 n=15+13) Tar 211ms ±12% 217ms ±14% ~ (p=0.355 n=14+15) XML 335ms ±15% 320ms ±18% ~ (p=0.169 n=15+15) [Geo mean] 317ms 583ms +83.72% name old alloc/op new alloc/op delta Template 40.2MB ± 0% 40.2MB ± 0% -0.15% (p=0.000 n=14+15) Unicode 29.2MB ± 0% 29.3MB ± 0% ~ (p=0.624 n=15+15) GoTypes 114MB ± 0% 114MB ± 0% -0.15% (p=0.000 n=15+14) Flate 25.7MB ± 0% 25.6MB ± 0% -0.18% (p=0.000 n=13+15) GoParser 32.2MB ± 0% 32.2MB ± 0% -0.14% (p=0.003 n=15+15) Reflect 77.8MB ± 0% 77.9MB ± 0% ~ (p=0.061 n=15+15) Tar 27.1MB ± 0% 27.0MB ± 0% -0.11% (p=0.029 n=15+15) XML 42.7MB ± 0% 42.5MB ± 0% -0.29% (p=0.000 n=15+15) [Geo mean] 42.1MB 75.0MB +78.05% name old allocs/op new allocs/op delta Template 402k ± 1% 398k ± 0% -0.91% (p=0.000 n=15+15) Unicode 344k ± 1% 344k ± 0% ~ (p=0.715 n=15+14) GoTypes 1.18M ± 0% 1.17M ± 0% -0.91% (p=0.000 n=15+14) Flate 243k ± 0% 240k ± 1% -1.05% (p=0.000 n=13+15) GoParser 327k ± 1% 324k ± 1% -0.96% (p=0.000 n=15+15) Reflect 984k ± 1% 982k ± 0% ~ (p=0.050 n=15+15) Tar 261k ± 1% 259k ± 1% -0.77% (p=0.000 n=15+15) XML 411k ± 0% 404k ± 1% -1.55% (p=0.000 n=15+15) [Geo mean] 439k 755k +72.01% name old text-bytes new text-bytes delta HelloSize 694kB ± 0% 694kB ± 0% -0.00% (p=0.000 n=15+15) name old data-bytes new data-bytes delta HelloSize 5.55kB ± 0% 5.55kB ± 0% ~ (all equal) name old bss-bytes new bss-bytes delta HelloSize 133kB ± 0% 133kB ± 0% ~ (all equal) name old exe-bytes new exe-bytes delta HelloSize 1.04MB ± 0% 1.04MB ± 0% ~ (all equal) Change-Id: I991fc553ef175db46bb23b2128317bbd48de70d8 Reviewed-on: https://go-review.googlesource.com/41770 Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-07-21 18:30:19 -04:00
loc = AppendSleb128(loc, int64(v.StackOffset))
}
putattr(ctxt, s.Info, abbrev, DW_FORM_block1, DW_CLS_BLOCK, int64(len(loc)), loc)
[dev.debug] cmd/compile: better DWARF with optimizations on Debuggers use DWARF information to find local variables on the stack and in registers. Prior to this CL, the DWARF information for functions claimed that all variables were on the stack at all times. That's incorrect when optimizations are enabled, and results in debuggers showing data that is out of date or complete gibberish. After this CL, the compiler is capable of representing variable locations more accurately, and attempts to do so. Due to limitations of the SSA backend, it's not possible to be completely correct. There are a number of problems in the current design. One of the easier to understand is that variable names currently must be attached to an SSA value, but not all assignments in the source code actually result in machine code. For example: type myint int var a int b := myint(int) and b := (*uint64)(unsafe.Pointer(a)) don't generate machine code because the underlying representation is the same, so the correct value of b will not be set when the user would expect. Generating the more precise debug information is behind a flag, dwarflocationlists. Because of the issues described above, setting the flag may not make the debugging experience much better, and may actually make it worse in cases where the variable actually is on the stack and the more complicated analysis doesn't realize it. A number of changes are included: - Add a new pseudo-instruction, RegKill, which indicates that the value in the register has been clobbered. - Adjust regalloc to emit RegKills in the right places. Significantly, this means that phis are mixed with StoreReg and RegKills after regalloc. - Track variable decomposition in ssa.LocalSlots. - After the SSA backend is done, analyze the result and build location lists for each LocalSlot. - After assembly is done, update the location lists with the assembled PC offsets, recompose variables, and build DWARF location lists. Emit the list as a new linker symbol, one per function. - In the linker, aggregate the location lists into a .debug_loc section. TODO: - currently disabled for non-X86/AMD64 because there are no data tables. go build -toolexec 'toolstash -cmp' -a std succeeds. With -dwarflocationlists false: before: f02812195637909ff675782c0b46836a8ff01976 after: 06f61e8112a42ac34fb80e0c818b3cdb84a5e7ec benchstat -geomean /tmp/220352263 /tmp/621364410 completed 15 of 15, estimated time remaining 0s (eta 3:52PM) name old time/op new time/op delta Template 199ms ± 3% 198ms ± 2% ~ (p=0.400 n=15+14) Unicode 96.6ms ± 5% 96.4ms ± 5% ~ (p=0.838 n=15+15) GoTypes 653ms ± 2% 647ms ± 2% ~ (p=0.102 n=15+14) Flate 133ms ± 6% 129ms ± 3% -2.62% (p=0.041 n=15+15) GoParser 164ms ± 5% 159ms ± 3% -3.05% (p=0.000 n=15+15) Reflect 428ms ± 4% 422ms ± 3% ~ (p=0.156 n=15+13) Tar 123ms ±10% 124ms ± 8% ~ (p=0.461 n=15+15) XML 228ms ± 3% 224ms ± 3% -1.57% (p=0.045 n=15+15) [Geo mean] 206ms 377ms +82.86% name old user-time/op new user-time/op delta Template 292ms ±10% 301ms ±12% ~ (p=0.189 n=15+15) Unicode 166ms ±37% 158ms ±14% ~ (p=0.418 n=15+14) GoTypes 962ms ± 6% 963ms ± 7% ~ (p=0.976 n=15+15) Flate 207ms ±19% 200ms ±14% ~ (p=0.345 n=14+15) GoParser 246ms ±22% 240ms ±15% ~ (p=0.587 n=15+15) Reflect 611ms ±13% 587ms ±14% ~ (p=0.085 n=15+13) Tar 211ms ±12% 217ms ±14% ~ (p=0.355 n=14+15) XML 335ms ±15% 320ms ±18% ~ (p=0.169 n=15+15) [Geo mean] 317ms 583ms +83.72% name old alloc/op new alloc/op delta Template 40.2MB ± 0% 40.2MB ± 0% -0.15% (p=0.000 n=14+15) Unicode 29.2MB ± 0% 29.3MB ± 0% ~ (p=0.624 n=15+15) GoTypes 114MB ± 0% 114MB ± 0% -0.15% (p=0.000 n=15+14) Flate 25.7MB ± 0% 25.6MB ± 0% -0.18% (p=0.000 n=13+15) GoParser 32.2MB ± 0% 32.2MB ± 0% -0.14% (p=0.003 n=15+15) Reflect 77.8MB ± 0% 77.9MB ± 0% ~ (p=0.061 n=15+15) Tar 27.1MB ± 0% 27.0MB ± 0% -0.11% (p=0.029 n=15+15) XML 42.7MB ± 0% 42.5MB ± 0% -0.29% (p=0.000 n=15+15) [Geo mean] 42.1MB 75.0MB +78.05% name old allocs/op new allocs/op delta Template 402k ± 1% 398k ± 0% -0.91% (p=0.000 n=15+15) Unicode 344k ± 1% 344k ± 0% ~ (p=0.715 n=15+14) GoTypes 1.18M ± 0% 1.17M ± 0% -0.91% (p=0.000 n=15+14) Flate 243k ± 0% 240k ± 1% -1.05% (p=0.000 n=13+15) GoParser 327k ± 1% 324k ± 1% -0.96% (p=0.000 n=15+15) Reflect 984k ± 1% 982k ± 0% ~ (p=0.050 n=15+15) Tar 261k ± 1% 259k ± 1% -0.77% (p=0.000 n=15+15) XML 411k ± 0% 404k ± 1% -1.55% (p=0.000 n=15+15) [Geo mean] 439k 755k +72.01% name old text-bytes new text-bytes delta HelloSize 694kB ± 0% 694kB ± 0% -0.00% (p=0.000 n=15+15) name old data-bytes new data-bytes delta HelloSize 5.55kB ± 0% 5.55kB ± 0% ~ (all equal) name old bss-bytes new bss-bytes delta HelloSize 133kB ± 0% 133kB ± 0% ~ (all equal) name old exe-bytes new exe-bytes delta HelloSize 1.04MB ± 0% 1.04MB ± 0% ~ (all equal) Change-Id: I991fc553ef175db46bb23b2128317bbd48de70d8 Reviewed-on: https://go-review.googlesource.com/41770 Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-07-21 18:30:19 -04:00
}
// Var has no children => no terminator
[dev.debug] cmd/compile: better DWARF with optimizations on Debuggers use DWARF information to find local variables on the stack and in registers. Prior to this CL, the DWARF information for functions claimed that all variables were on the stack at all times. That's incorrect when optimizations are enabled, and results in debuggers showing data that is out of date or complete gibberish. After this CL, the compiler is capable of representing variable locations more accurately, and attempts to do so. Due to limitations of the SSA backend, it's not possible to be completely correct. There are a number of problems in the current design. One of the easier to understand is that variable names currently must be attached to an SSA value, but not all assignments in the source code actually result in machine code. For example: type myint int var a int b := myint(int) and b := (*uint64)(unsafe.Pointer(a)) don't generate machine code because the underlying representation is the same, so the correct value of b will not be set when the user would expect. Generating the more precise debug information is behind a flag, dwarflocationlists. Because of the issues described above, setting the flag may not make the debugging experience much better, and may actually make it worse in cases where the variable actually is on the stack and the more complicated analysis doesn't realize it. A number of changes are included: - Add a new pseudo-instruction, RegKill, which indicates that the value in the register has been clobbered. - Adjust regalloc to emit RegKills in the right places. Significantly, this means that phis are mixed with StoreReg and RegKills after regalloc. - Track variable decomposition in ssa.LocalSlots. - After the SSA backend is done, analyze the result and build location lists for each LocalSlot. - After assembly is done, update the location lists with the assembled PC offsets, recompose variables, and build DWARF location lists. Emit the list as a new linker symbol, one per function. - In the linker, aggregate the location lists into a .debug_loc section. TODO: - currently disabled for non-X86/AMD64 because there are no data tables. go build -toolexec 'toolstash -cmp' -a std succeeds. With -dwarflocationlists false: before: f02812195637909ff675782c0b46836a8ff01976 after: 06f61e8112a42ac34fb80e0c818b3cdb84a5e7ec benchstat -geomean /tmp/220352263 /tmp/621364410 completed 15 of 15, estimated time remaining 0s (eta 3:52PM) name old time/op new time/op delta Template 199ms ± 3% 198ms ± 2% ~ (p=0.400 n=15+14) Unicode 96.6ms ± 5% 96.4ms ± 5% ~ (p=0.838 n=15+15) GoTypes 653ms ± 2% 647ms ± 2% ~ (p=0.102 n=15+14) Flate 133ms ± 6% 129ms ± 3% -2.62% (p=0.041 n=15+15) GoParser 164ms ± 5% 159ms ± 3% -3.05% (p=0.000 n=15+15) Reflect 428ms ± 4% 422ms ± 3% ~ (p=0.156 n=15+13) Tar 123ms ±10% 124ms ± 8% ~ (p=0.461 n=15+15) XML 228ms ± 3% 224ms ± 3% -1.57% (p=0.045 n=15+15) [Geo mean] 206ms 377ms +82.86% name old user-time/op new user-time/op delta Template 292ms ±10% 301ms ±12% ~ (p=0.189 n=15+15) Unicode 166ms ±37% 158ms ±14% ~ (p=0.418 n=15+14) GoTypes 962ms ± 6% 963ms ± 7% ~ (p=0.976 n=15+15) Flate 207ms ±19% 200ms ±14% ~ (p=0.345 n=14+15) GoParser 246ms ±22% 240ms ±15% ~ (p=0.587 n=15+15) Reflect 611ms ±13% 587ms ±14% ~ (p=0.085 n=15+13) Tar 211ms ±12% 217ms ±14% ~ (p=0.355 n=14+15) XML 335ms ±15% 320ms ±18% ~ (p=0.169 n=15+15) [Geo mean] 317ms 583ms +83.72% name old alloc/op new alloc/op delta Template 40.2MB ± 0% 40.2MB ± 0% -0.15% (p=0.000 n=14+15) Unicode 29.2MB ± 0% 29.3MB ± 0% ~ (p=0.624 n=15+15) GoTypes 114MB ± 0% 114MB ± 0% -0.15% (p=0.000 n=15+14) Flate 25.7MB ± 0% 25.6MB ± 0% -0.18% (p=0.000 n=13+15) GoParser 32.2MB ± 0% 32.2MB ± 0% -0.14% (p=0.003 n=15+15) Reflect 77.8MB ± 0% 77.9MB ± 0% ~ (p=0.061 n=15+15) Tar 27.1MB ± 0% 27.0MB ± 0% -0.11% (p=0.029 n=15+15) XML 42.7MB ± 0% 42.5MB ± 0% -0.29% (p=0.000 n=15+15) [Geo mean] 42.1MB 75.0MB +78.05% name old allocs/op new allocs/op delta Template 402k ± 1% 398k ± 0% -0.91% (p=0.000 n=15+15) Unicode 344k ± 1% 344k ± 0% ~ (p=0.715 n=15+14) GoTypes 1.18M ± 0% 1.17M ± 0% -0.91% (p=0.000 n=15+14) Flate 243k ± 0% 240k ± 1% -1.05% (p=0.000 n=13+15) GoParser 327k ± 1% 324k ± 1% -0.96% (p=0.000 n=15+15) Reflect 984k ± 1% 982k ± 0% ~ (p=0.050 n=15+15) Tar 261k ± 1% 259k ± 1% -0.77% (p=0.000 n=15+15) XML 411k ± 0% 404k ± 1% -1.55% (p=0.000 n=15+15) [Geo mean] 439k 755k +72.01% name old text-bytes new text-bytes delta HelloSize 694kB ± 0% 694kB ± 0% -0.00% (p=0.000 n=15+15) name old data-bytes new data-bytes delta HelloSize 5.55kB ± 0% 5.55kB ± 0% ~ (all equal) name old bss-bytes new bss-bytes delta HelloSize 133kB ± 0% 133kB ± 0% ~ (all equal) name old exe-bytes new exe-bytes delta HelloSize 1.04MB ± 0% 1.04MB ± 0% ~ (all equal) Change-Id: I991fc553ef175db46bb23b2128317bbd48de70d8 Reviewed-on: https://go-review.googlesource.com/41770 Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2017-07-21 18:30:19 -04:00
}
// VarsByOffset attaches the methods of sort.Interface to []*Var,
// sorting in increasing StackOffset.
type VarsByOffset []*Var
func (s VarsByOffset) Len() int { return len(s) }
func (s VarsByOffset) Less(i, j int) bool { return s[i].StackOffset < s[j].StackOffset }
func (s VarsByOffset) Swap(i, j int) { s[i], s[j] = s[j], s[i] }
// byChildIndex implements sort.Interface for []*dwarf.Var by child index.
type byChildIndex []*Var
func (s byChildIndex) Len() int { return len(s) }
func (s byChildIndex) Less(i, j int) bool { return s[i].ChildIndex < s[j].ChildIndex }
func (s byChildIndex) Swap(i, j int) { s[i], s[j] = s[j], s[i] }