// Derived from Inferno utils/6l/obj.c and utils/6l/span.c
// https://bitbucket.org/inferno-os/inferno-os/src/default/utils/6l/obj.c
// https://bitbucket.org/inferno-os/inferno-os/src/default/utils/6l/span.c
//
// Copyright © 1994-1999 Lucent Technologies Inc. All rights reserved.
// Portions Copyright © 1995-1997 C H Forsyth (forsyth@terzarima.net)
// Portions Copyright © 1997-1999 Vita Nuova Limited
// Portions Copyright © 2000-2007 Vita Nuova Holdings Limited (www.vitanuova.com)
// Portions Copyright © 2004,2006 Bruce Ellis
// Portions Copyright © 2005-2007 C H Forsyth (forsyth@terzarima.net)
// Revisions Copyright © 2000-2007 Lucent Technologies Inc. and others
// Portions Copyright © 2009 The Go Authors. All rights reserved.
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
// THE SOFTWARE.
package ld

import (
	"bytes"
	"cmd/internal/gcprog"
	"cmd/internal/objabi"
	"cmd/internal/sys"
	"cmd/link/internal/loader"
	"cmd/link/internal/sym"
	"compress/zlib"
	"encoding/binary"
	"fmt"
	"log"
RegexpMatchEasy1_32 169ns × (0.99,1.02) 131ns × (0.99,1.03) -22.44% (p=0.000)
RegexpMatchEasy1_1K 1.53µs × (0.99,1.01) 0.87µs × (0.99,1.02) -43.07% (p=0.000)
RegexpMatchMedium_32 334ns × (0.99,1.01) 242ns × (0.99,1.01) -27.53% (p=0.000)
RegexpMatchMedium_1K 125µs × (1.00,1.01) 72µs × (0.99,1.03) -42.53% (p=0.000)
RegexpMatchHard_32 6.03µs × (0.99,1.01) 3.79µs × (0.99,1.01) -37.12% (p=0.000)
RegexpMatchHard_1K 189µs × (0.99,1.02) 115µs × (0.99,1.01) -39.20% (p=0.000)
Revcomp 935ms × (0.96,1.03) 926ms × (0.98,1.02) ~ (p=0.083)
Template 146ms × (0.97,1.05) 119ms × (0.99,1.01) -18.37% (p=0.000)
TimeParse 660ns × (0.99,1.01) 624ns × (0.99,1.02) -5.43% (p=0.000)
TimeFormat 670ns × (0.98,1.02) 710ns × (1.00,1.01) +5.97% (p=0.000)
This CL is a bit larger than I would like, but the compiler, linker, runtime,
and package reflect all need to be in sync about the format of these programs,
so there is no easy way to split this into independent changes (at least
while keeping the build working at each change).
Fixes #9625.
Fixes #10524.
Change-Id: I9e3e20d6097099d0f8532d1cb5b1af528804989a
Reviewed-on: https://go-review.googlesource.com/9888
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Russ Cox <rsc@golang.org>
2015-05-08 01:43:18 -04:00
	"os"
	"sort"
	"strconv"
	"strings"
	"sync"
)

// isRuntimeDepPkg reports whether pkg is the runtime package or its dependency.
func isRuntimeDepPkg(pkg string) bool {
	switch pkg {
	case "runtime",
		"sync/atomic",      // runtime may call to sync/atomic, due to go:linkname
		"internal/bytealg", // for IndexByte
		"internal/cpu":     // for cpu features
		return true
	}
	return strings.HasPrefix(pkg, "runtime/internal/") && !strings.HasSuffix(pkg, "_test")
}
// maxSizeTrampolinesPPC64 estimates the maximum size needed to hold any new
// trampolines created for this function. This is used to determine when the
// section can be split if it becomes too large, to ensure that the trampolines
// are in the same section as the function that uses them.
func maxSizeTrampolinesPPC64(ldr *loader.Loader, s loader.Sym, isTramp bool) uint64 {
	// If thearch.Trampoline is nil, then trampoline support is not available
	// on this arch. A trampoline does not need any dependent trampolines.
	if thearch.Trampoline == nil || isTramp {
		return 0
	}

	n := uint64(0)
	relocs := ldr.Relocs(s)
	for ri := 0; ri < relocs.Count(); ri++ {
		r := relocs.At2(ri)
		if r.Type().IsDirectCallOrJump() {
			n++
		}
	}
	// Trampolines on ppc64 are 4 instructions (16 bytes).
	return n * 16
}
// trampoline detects too-far jumps in function s and adds trampolines
// if necessary.
// ARM, PPC64 and PPC64LE support trampoline insertion for internal and
// external linking. On PPC64 and PPC64LE the text sections might be split,
// but trampolines are still inserted where necessary.
func trampoline(ctxt *Link, s loader.Sym) {
	if thearch.Trampoline == nil {
		return // no need for, or no support of, trampolines on this arch
	}

	ldr := ctxt.loader
	relocs := ldr.Relocs(s)
	for ri := 0; ri < relocs.Count(); ri++ {
		r := relocs.At2(ri)
		if !r.Type().IsDirectCallOrJump() {
			continue
		}
		rs := r.Sym()
		if !ldr.AttrReachable(rs) || ldr.SymType(rs) == sym.Sxxx {
			continue // something is wrong; skip it here and we'll emit a better error later
		}
		rs = ldr.ResolveABIAlias(rs)
		if ldr.SymValue(rs) == 0 && (ldr.SymType(rs) != sym.SDYNIMPORT && ldr.SymType(rs) != sym.SUNDEFEXT) {
			if ldr.SymPkg(rs) != ldr.SymPkg(s) {
				if !isRuntimeDepPkg(ldr.SymPkg(s)) || !isRuntimeDepPkg(ldr.SymPkg(rs)) {
					ctxt.Errorf(s, "unresolved inter-package jump to %s(%s) from %s", ldr.SymName(rs), ldr.SymPkg(rs), ldr.SymPkg(s))
				}
				// The runtime and its dependent packages may call each other.
				// They are fine, as they will be laid down together.
			}
			continue
		}

		thearch.Trampoline(ctxt, ldr, ri, rs, s)
	}
}
// foldSubSymbolOffset computes the offset of symbol s to its top-level outer
// symbol. Returns the top-level symbol and the offset.
// This is used in generating external relocations.
func foldSubSymbolOffset(ldr *loader.Loader, s loader.Sym) (loader.Sym, int64) {
	outer := ldr.OuterSym(s)
	off := int64(0)
	for outer != 0 {
		off += ldr.SymValue(s) - ldr.SymValue(outer)
		s = outer
		outer = ldr.OuterSym(s)
	}
	return s, off
}
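The offset-folding loop above can be illustrated with a toy model. This is a hedged sketch, not linker code: `tsym` and `foldOffset` are hypothetical stand-ins for the loader's symbol table and `OuterSym` chain, but the accumulation of per-level offsets is the same.

```go
package main

import "fmt"

// tsym is a toy nested sub-symbol: an absolute value plus an optional
// outer (container) symbol. Hypothetical type for illustration only.
type tsym struct {
	value int64
	outer *tsym
}

// foldOffset mirrors foldSubSymbolOffset: walk to the top-level outer
// symbol, accumulating this symbol's offset within each container.
func foldOffset(s *tsym) (*tsym, int64) {
	off := int64(0)
	for s.outer != nil {
		off += s.value - s.outer.value
		s = s.outer
	}
	return s, off
}

func main() {
	top := &tsym{value: 0x1000}
	mid := &tsym{value: 0x1040, outer: top}
	leaf := &tsym{value: 0x1048, outer: mid}
	outer, off := foldOffset(leaf)
	fmt.Println(outer == top, off) // true 72 (0x8 within mid + 0x40 within top)
}
```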
// relocsym resolves relocations in "s", updating the symbol's content
// in "P".
// The main loop walks through the list of relocations attached to "s"
// and resolves them where applicable. Relocations are often
// architecture-specific, requiring calls into the 'archreloc' and/or
// 'archrelocvariant' functions for the architecture. When external
// linking is in effect, it may not be possible to completely resolve
// the address/offset for a symbol, in which case the goal is to lay
// the groundwork for turning a given relocation into an external reloc
// (to be applied by the external linker). For more on how relocations
// work in general, see
//
//	"Linkers and Loaders", by John R. Levine (Morgan Kaufmann, 1999), ch. 7
//
// This is a performance-critical function for the linker; be careful
// to avoid introducing unnecessary allocations in the main loop.
func (st *relocSymState) relocsym(s loader.Sym, P []byte) {
	ldr := st.ldr
	relocs := ldr.Relocs(s)
	if relocs.Count() == 0 {
		return
	}
	target := st.target
	syms := st.syms
	var extRelocs []loader.ExtReloc
	if target.IsExternal() {
		// Preallocate a slice, conservatively assuming that all
		// relocs will require an external reloc.
		extRelocs = st.preallocExtRelocSlice(relocs.Count())
	}
	for ri := 0; ri < relocs.Count(); ri++ {
		r := relocs.At2(ri)
		off := r.Off()
		siz := int32(r.Siz())
		rs := r.Sym()
		rs = ldr.ResolveABIAlias(rs)
		rt := r.Type()
		if off < 0 || off+siz > int32(len(P)) {
			rname := ""
			if rs != 0 {
				rname = ldr.SymName(rs)
			}
			st.err.Errorf(s, "invalid relocation %s: %d+%d not in [%d,%d)", rname, off, siz, 0, len(P))
			continue
		}

		var rst sym.SymKind
		if rs != 0 {
			rst = ldr.SymType(rs)
		}

		if rs != 0 && ((rst == sym.Sxxx && !ldr.AttrVisibilityHidden(rs)) || rst == sym.SXREF) {
			// When putting the runtime but not main into a shared library,
			// these symbols are undefined and that's OK.
			if target.IsShared() || target.IsPlugin() {
				if ldr.SymName(rs) == "main.main" || (!target.IsPlugin() && ldr.SymName(rs) == "main..inittask") {
					sb := ldr.MakeSymbolUpdater(rs)
					sb.SetType(sym.SDYNIMPORT)
				} else if strings.HasPrefix(ldr.SymName(rs), "go.info.") {
					// Skip go.info symbols. They are only needed to communicate
					// DWARF info between the compiler and linker.
					continue
				}
			} else {
				st.err.errorUnresolved(ldr, s, rs)
				continue
			}
		}

		if rt >= objabi.ElfRelocOffset {
			continue
		}
		if siz == 0 { // informational relocation - no work to do
			continue
		}

		// We need to be able to reference dynimport symbols when linking
		// against shared libraries; Solaris, Darwin, and AIX always need this.
		if !target.IsSolaris() && !target.IsDarwin() && !target.IsAIX() && rs != 0 && rst == sym.SDYNIMPORT && !target.IsDynlinkingGo() && !ldr.AttrSubSymbol(rs) {
			if !(target.IsPPC64() && target.IsExternal() && ldr.SymName(rs) == ".TOC.") {
				st.err.Errorf(s, "unhandled relocation for %s (type %d (%s) rtype %d (%s))", ldr.SymName(rs), rst, rst, rt, sym.RelocName(target.Arch, rt))
			}
		}
		if rs != 0 && rst != sym.STLSBSS && rt != objabi.R_WEAKADDROFF && rt != objabi.R_METHODOFF && !ldr.AttrReachable(rs) {
			st.err.Errorf(s, "unreachable sym in relocation: %s", ldr.SymName(rs))
		}
		var rr loader.ExtReloc
		needExtReloc := false // set to true below when needed
		if target.IsExternal() {
			rr.Idx = ri
		}

		// TODO(mundaym): remove this special case - see issue 14218.
		//if target.IsS390X() {
		//	switch r.Type {
		//	case objabi.R_PCRELDBL:
		//		r.InitExt()
		//		r.Type = objabi.R_PCREL
		//		r.Variant = sym.RV_390_DBL
		//	case objabi.R_CALL:
		//		r.InitExt()
		//		r.Variant = sym.RV_390_DBL
		//	}
		//}

		var o int64
		switch rt {
		default:
			switch siz {
			default:
				st.err.Errorf(s, "bad reloc size %#x for %s", uint32(siz), ldr.SymName(rs))
			case 1:
				o = int64(P[off])
			case 2:
				o = int64(target.Arch.ByteOrder.Uint16(P[off:]))
			case 4:
				o = int64(target.Arch.ByteOrder.Uint32(P[off:]))
			case 8:
				o = int64(target.Arch.ByteOrder.Uint64(P[off:]))
			}
			var rp *loader.ExtReloc
			if target.IsExternal() {
				// Don't pass &rr directly to Archreloc2, which will escape rr
				// even if this case is not taken. Instead, as Archreloc2 will
				// likely return true, we speculatively add rr to extRelocs
				// and use that space to pass to Archreloc2.
				extRelocs = append(extRelocs, rr)
				rp = &extRelocs[len(extRelocs)-1]
			}
			out, needExtReloc1, ok := thearch.Archreloc2(target, ldr, syms, r, rp, s, o)
			if target.IsExternal() && !needExtReloc1 {
				// Speculation failed. Undo the append.
				extRelocs = extRelocs[:len(extRelocs)-1]
			}
			needExtReloc = false // already appended
			if ok {
				o = out
			} else {
				st.err.Errorf(s, "unknown reloc to %v: %d (%s)", ldr.SymName(rs), rt, sym.RelocName(target.Arch, rt))
			}
		case objabi.R_TLS_LE:
			if target.IsExternal() && target.IsElf() {
				needExtReloc = true
				rr.Xsym = rs
				if rr.Xsym == 0 {
					rr.Xsym = syms.Tlsg2
				}
				rr.Xadd = r.Add()
				o = 0
				if !target.IsAMD64() {
					o = r.Add()
				}
				break
			}

			if target.IsElf() && target.IsARM() {
				// On ELF ARM, the thread pointer is 8 bytes before
				// the start of the thread-local data block, so add 8
				// to the actual TLS offset (r->sym->value).
				// This 8 seems to be a fundamental constant of
				// ELF on ARM (or maybe Glibc on ARM); it is not
				// related to the fact that our own TLS storage happens
				// to take up 8 bytes.
				o = 8 + ldr.SymValue(rs)
			} else if target.IsElf() || target.IsPlan9() || target.IsDarwin() {
				o = int64(syms.Tlsoffset) + r.Add()
			} else if target.IsWindows() {
				o = r.Add()
			} else {
				log.Fatalf("unexpected R_TLS_LE relocation for %v", target.HeadType)
			}

		case objabi.R_TLS_IE:
			if target.IsExternal() && target.IsElf() {
				needExtReloc = true
				rr.Xsym = rs
				if rr.Xsym == 0 {
					rr.Xsym = syms.Tlsg2
				}
				rr.Xadd = r.Add()
				o = 0
				if !target.IsAMD64() {
					o = r.Add()
				}
				break
			}
			if target.IsPIE() && target.IsElf() {
				// We are linking the final executable, so we
				// can optimize any TLS IE relocation to LE.
				if thearch.TLSIEtoLE == nil {
					log.Fatalf("internal linking of TLS IE not supported on %v", target.Arch.Family)
				}
				thearch.TLSIEtoLE(P, int(off), int(siz))
				o = int64(syms.Tlsoffset)
			} else {
				log.Fatalf("cannot handle R_TLS_IE (sym %s) when linking internally", ldr.SymName(s))
			}
		case objabi.R_ADDR:
			if target.IsExternal() && rst != sym.SCONST {
				needExtReloc = true

				// Set up addend for eventual relocation via outer symbol.
				rs := rs
				rs, off := foldSubSymbolOffset(ldr, rs)
				rr.Xadd = r.Add() + off
				rst := ldr.SymType(rs)
				if rst != sym.SHOSTOBJ && rst != sym.SDYNIMPORT && rst != sym.SUNDEFEXT && ldr.SymSect(rs) == nil {
					st.err.Errorf(s, "missing section for relocation target %s", ldr.SymName(rs))
				}
				rr.Xsym = rs

				o = rr.Xadd
				if target.IsElf() {
					if target.IsAMD64() {
						o = 0
					}
				} else if target.IsDarwin() {
					if ldr.SymType(rs) != sym.SHOSTOBJ {
						o += ldr.SymValue(rs)
					}
				} else if target.IsWindows() {
					// nothing to do
				} else if target.IsAIX() {
					o = ldr.SymValue(rs) + r.Add()
				} else {
					st.err.Errorf(s, "unhandled pcrel relocation to %s on %v", ldr.SymName(rs), target.HeadType)
				}

				break
			}

			// On AIX, a second relocation must be done by the loader,
			// as section addresses can change once loaded.
			// The "default" symbol address is still needed by the loader, so
			// the current relocation can't be skipped.
			if target.IsAIX() && rst != sym.SDYNIMPORT {
				// It's not possible to make a loader relocation in a
				// symbol which is not inside the .data section.
				// FIXME: It should be forbidden to have R_ADDR from a
				// symbol which isn't in .data. However, as .text has the
				// same address once loaded, this is possible.
				if ldr.SymSect(s).Seg == &Segdata {
					panic("not implemented")
					//Xcoffadddynrel(target, ldr, err, s, &r) // XXX
				}
			}

			o = ldr.SymValue(rs) + r.Add()

			// On amd64, 4-byte offsets will be sign-extended, so it is impossible to
			// access more than 2GB of static data; failing at link time is better than
			// failing at runtime. See https://golang.org/issue/7980.
			// Instead of special-casing only amd64, we treat this as an error on all
			// 64-bit architectures so as to be future-proof.
			if int32(o) < 0 && target.Arch.PtrSize > 4 && siz == 4 {
				st.err.Errorf(s, "non-pc-relative relocation address for %s is too big: %#x (%#x + %#x)", ldr.SymName(rs), uint64(o), ldr.SymValue(rs), r.Add())
				errorexit()
			}
		case objabi.R_DWARFSECREF:
			if ldr.SymSect(rs) == nil {
				st.err.Errorf(s, "missing DWARF section for relocation target %s", ldr.SymName(rs))
			}

			if target.IsExternal() {
				needExtReloc = true

				// On most platforms, the external linker needs to adjust DWARF references
				// as it combines DWARF sections. However, on Darwin, dsymutil does the
				// DWARF linking, and it understands how to follow section offsets.
				// Leaving in the relocation records confuses it (see
				// https://golang.org/issue/22068) so drop them for Darwin.
				if target.IsDarwin() {
					needExtReloc = false
				}

				rr.Xsym = loader.Sym(ldr.SymSect(rs).Sym2)
				rr.Xadd = r.Add() + ldr.SymValue(rs) - int64(ldr.SymSect(rs).Vaddr)
				o = rr.Xadd
				if target.IsElf() && target.IsAMD64() {
					o = 0
				}
				break
			}
			o = ldr.SymValue(rs) + r.Add() - int64(ldr.SymSect(rs).Vaddr)

		case objabi.R_WEAKADDROFF, objabi.R_METHODOFF:
			if !ldr.AttrReachable(rs) {
				continue
			}
			fallthrough
		case objabi.R_ADDROFF:
			// The method offset tables using this relocation expect the offset to be relative
			// to the start of the first text section, even if there are multiple.
			if ldr.SymSect(rs).Name == ".text" {
				o = ldr.SymValue(rs) - int64(Segtext.Sections[0].Vaddr) + r.Add()
			} else {
				o = ldr.SymValue(rs) - int64(ldr.SymSect(rs).Vaddr) + r.Add()
			}

		case objabi.R_ADDRCUOFF:
			// debug_range and debug_loc elements use this relocation type to get an
			// offset from the start of the compile unit.
			o = ldr.SymValue(rs) + r.Add() - ldr.SymValue(loader.Sym(ldr.SymUnit(rs).Textp2[0]))
		// r.Sym() can be 0 when CALL $(constant) is transformed from absolute PC to relative PC call.
		case objabi.R_GOTPCREL:
			if target.IsDynlinkingGo() && target.IsDarwin() && rs != 0 && rst != sym.SCONST {
				needExtReloc = true
				rr.Xadd = r.Add()
				rr.Xadd -= int64(siz) // relative to address after the relocated chunk
				rr.Xsym = rs

				o = rr.Xadd
				o += int64(siz)
				break
			}
			fallthrough
		case objabi.R_CALL, objabi.R_PCREL:
			if target.IsExternal() && rs != 0 && rst == sym.SUNDEFEXT {
				// pass through to the external linker.
				needExtReloc = true
				rr.Xadd = 0
				if target.IsElf() {
					rr.Xadd -= int64(siz)
				}
				rr.Xsym = rs
				o = 0
				break
			}
			if target.IsExternal() && rs != 0 && rst != sym.SCONST && (ldr.SymSect(rs) != ldr.SymSect(s) || rt == objabi.R_GOTPCREL) {
				needExtReloc = true

				// Set up addend for eventual relocation via outer symbol.
				rs := rs
				rs, off := foldSubSymbolOffset(ldr, rs)
				rr.Xadd = r.Add() + off
				rr.Xadd -= int64(siz) // relative to address after the relocated chunk
				rst := ldr.SymType(rs)
				if rst != sym.SHOSTOBJ && rst != sym.SDYNIMPORT && ldr.SymSect(rs) == nil {
					st.err.Errorf(s, "missing section for relocation target %s", ldr.SymName(rs))
				}
				rr.Xsym = rs

				o = rr.Xadd
				if target.IsElf() {
					if target.IsAMD64() {
						o = 0
					}
				} else if target.IsDarwin() {
					if rt == objabi.R_CALL {
						if target.IsExternal() && rst == sym.SDYNIMPORT {
							if target.IsAMD64() {
								// AMD64 dynamic relocations are relative to the end of the relocation.
								o += int64(siz)
							}
						} else {
							if rst != sym.SHOSTOBJ {
								o += int64(uint64(ldr.SymValue(rs)) - ldr.SymSect(rs).Vaddr)
							}
							o -= int64(off) // relative to section offset, not symbol
						}
					} else {
						o += int64(siz)
					}
				} else if target.IsWindows() && target.IsAMD64() { // only amd64 needs PCREL
					// PE/COFF's PC32 relocation uses the address after the relocated
					// bytes as the base. Compensate by skewing the addend.
					o += int64(siz)
				} else {
					st.err.Errorf(s, "unhandled pcrel relocation to %s on %v", ldr.SymName(rs), target.HeadType)
				}

				break
			}

			o = 0
			if rs != 0 {
				o = ldr.SymValue(rs)
			}

			o += r.Add() - (ldr.SymValue(s) + int64(off) + int64(siz))
		case objabi.R_SIZE:
			o = ldr.SymSize(rs) + r.Add()

		case objabi.R_XCOFFREF:
			if !target.IsAIX() {
				st.err.Errorf(s, "unexpected XCOFF R_REF relocation on a non-AIX target")
			}
			if !target.IsExternal() {
				st.err.Errorf(s, "unexpected XCOFF R_REF relocation with internal linking")
			}
			needExtReloc = true
			rr.Xsym = rs
			rr.Xadd = r.Add()

			// This isn't a real relocation, so it must not update
			// its offset value.
			continue

		case objabi.R_DWARFFILEREF:
			// We don't renumber files in dwarf.go:writelines anymore.
			continue

		case objabi.R_CONST:
			o = r.Add()

		case objabi.R_GOTOFF:
			o = ldr.SymValue(rs) + r.Add() - ldr.SymValue(syms.GOT2)
		}

		//if target.IsPPC64() || target.IsS390X() {
		//	r.InitExt()
		//	if r.Variant != sym.RV_NONE {
		//		o = thearch.Archrelocvariant(ldr, target, syms, &r, s, o)
		//	}
		//}
		switch siz {
		default:
			st.err.Errorf(s, "bad reloc size %#x for %s", uint32(siz), ldr.SymName(rs))
		case 1:
			P[off] = byte(int8(o))
		case 2:
			if o != int64(int16(o)) {
				st.err.Errorf(s, "relocation address for %s is too big: %#x", ldr.SymName(rs), o)
			}
			target.Arch.ByteOrder.PutUint16(P[off:], uint16(o))
		case 4:
			if rt == objabi.R_PCREL || rt == objabi.R_CALL {
				if o != int64(int32(o)) {
					st.err.Errorf(s, "pc-relative relocation address for %s is too big: %#x", ldr.SymName(rs), o)
				}
			} else {
				if o != int64(int32(o)) && o != int64(uint32(o)) {
					st.err.Errorf(s, "non-pc-relative relocation address for %s is too big: %#x", ldr.SymName(rs), uint64(o))
				}
			}
			target.Arch.ByteOrder.PutUint32(P[off:], uint32(o))
		case 8:
			target.Arch.ByteOrder.PutUint64(P[off:], uint64(o))
		}

		if needExtReloc {
			extRelocs = append(extRelocs, rr)
		}
	}
	if len(extRelocs) != 0 {
		st.finalizeExtRelocSlice(extRelocs)
		ldr.SetExtRelocs(s, extRelocs)
	}
}
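The final write-out step above (range-check the computed value, then store it back at the relocation's offset in the target's byte order) can be sketched in isolation. This is an illustrative standalone function, not linker API: `apply4` is a hypothetical name, and little-endian order stands in for `target.Arch.ByteOrder`.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// apply4 stores a computed relocation value o into the 4 bytes of P
// starting at off, mirroring relocsym's siz==4 non-pc-relative case:
// the value must fit as either a signed or an unsigned 32-bit integer.
func apply4(P []byte, off int32, o int64) error {
	if o != int64(int32(o)) && o != int64(uint32(o)) {
		return fmt.Errorf("relocation value %#x does not fit in 4 bytes", o)
	}
	binary.LittleEndian.PutUint32(P[off:], uint32(o))
	return nil
}

func main() {
	P := make([]byte, 8)
	if err := apply4(P, 4, 0x12345678); err != nil {
		panic(err)
	}
	fmt.Printf("% x\n", P) // 00 00 00 00 78 56 34 12
}
```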
const extRelocSlabSize = 2048

// relocSymState holds state information needed when making a series of
// successive calls to relocsym(). The items here are invariant
// (meaning that they are set up once initially and then don't change
// during the execution of relocsym), with the exception of a slice
// used to facilitate batch allocation of external relocations. Calls
// to relocsym happen in parallel; the assumption is that each
// parallel thread will have its own state object.
type relocSymState struct {
	target *Target
	ldr    *loader.Loader
	err    *ErrorReporter
	syms   *ArchSyms
	batch  []loader.ExtReloc
}

// preallocExtRelocSlice returns a subslice from an internally allocated
// slab owned by the state object. The client requests a slice of size
// 'sz'; however, it may be that fewer relocs are needed. The
// assumption is that the final size is set in a (required) subsequent
// call to finalizeExtRelocSlice.
func (st *relocSymState) preallocExtRelocSlice(sz int) []loader.ExtReloc {
	if len(st.batch) < sz {
		slabSize := extRelocSlabSize
		if sz > extRelocSlabSize {
			slabSize = sz
		}
		st.batch = make([]loader.ExtReloc, slabSize)
	}
	rval := st.batch[:sz:sz]
	return rval[:0]
}

// finalizeExtRelocSlice takes a slice returned from preallocExtRelocSlice,
// from which it determines how many of the preallocated relocs were
// actually needed; it then carves that number off the batch slice.
func (st *relocSymState) finalizeExtRelocSlice(finalsl []loader.ExtReloc) {
	if &st.batch[0] != &finalsl[0] {
		panic("preallocExtRelocSlice size invariant violation")
	}
	st.batch = st.batch[len(finalsl):]
}

// makeRelocSymState creates a relocSymState container object to
// pass to relocsym(). If relocsym() calls happen in parallel,
// each parallel thread should have its own state object.
func (ctxt *Link) makeRelocSymState() *relocSymState {
	return &relocSymState{
		target: &ctxt.Target,
		ldr:    ctxt.loader,
		err:    &ctxt.ErrorReporter,
		syms:   &ctxt.ArchSyms,
	}
}
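The prealloc/finalize pair above implements a slab-allocation pattern: hand out a conservatively sized, capacity-limited sub-slice of a shared batch, then reclaim the unused tail once the caller knows how many entries it actually filled. A minimal standalone sketch of the same pattern, with a hypothetical `slab` type over `int` instead of `loader.ExtReloc`:

```go
package main

import "fmt"

const slabSize = 8 // the real linker uses extRelocSlabSize = 2048

// slab hands out sub-slices of one shared backing array to avoid a
// separate allocation per caller.
type slab struct {
	batch []int
}

// prealloc reserves room for up to sz entries and returns a length-0,
// capacity-sz slice; appends by the caller fill the reserved space.
func (sl *slab) prealloc(sz int) []int {
	if len(sl.batch) < sz {
		n := slabSize
		if sz > n {
			n = sz // oversized request gets its own slab
		}
		sl.batch = make([]int, n)
	}
	return sl.batch[:sz:sz][:0]
}

// finalize carves off only the entries the caller actually appended.
func (sl *slab) finalize(used []int) {
	sl.batch = sl.batch[len(used):]
}

func main() {
	var sl slab
	s := sl.prealloc(4)
	s = append(s, 1, 2) // only 2 of the 4 reserved entries used
	sl.finalize(s)
	fmt.Println(len(sl.batch)) // 6 entries left in the 8-entry slab
}
```

The full-slice expression `batch[:sz:sz]` caps the handed-out slice's capacity, so an over-appending caller grows a fresh array rather than clobbering the shared slab.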
func (ctxt *Link) reloc() {
	var wg sync.WaitGroup
	ldr := ctxt.loader
	if ctxt.IsExternal() {
		ldr.InitExtRelocs()
	}
	wg.Add(3)
	go func() {
		if !ctxt.IsWasm() { // On Wasm, text relocations are applied in Asmb2.
			st := ctxt.makeRelocSymState()
			for _, s := range ctxt.Textp2 {
				st.relocsym(s, ldr.OutData(s))
			}
		}
		wg.Done()
	}()
	go func() {
		st := ctxt.makeRelocSymState()
		for _, s := range ctxt.datap2 {
			st.relocsym(s, ldr.OutData(s))
		}
		wg.Done()
	}()
	go func() {
		st := ctxt.makeRelocSymState()
		for _, si := range dwarfp2 {
			for _, s := range si.syms {
				st.relocsym(s, ldr.OutData(s))
			}
		}
		wg.Done()
	}()
	wg.Wait()
}

func windynrelocsym(ctxt *Link, rel *loader.SymbolBuilder, s loader.Sym) {
	var su *loader.SymbolBuilder
	relocs := ctxt.loader.Relocs(s)
	for ri := 0; ri < relocs.Count(); ri++ {
		r := relocs.At2(ri)
		targ := r.Sym()
		if targ == 0 {
			continue
		}
		rt := r.Type()
		if !ctxt.loader.AttrReachable(targ) {
			if rt == objabi.R_WEAKADDROFF {
				continue
			}
			ctxt.Errorf(s, "dynamic relocation to unreachable symbol %s",
				ctxt.loader.SymName(targ))
		}

		tplt := ctxt.loader.SymPlt(targ)
		tgot := ctxt.loader.SymGot(targ)
		if tplt == -2 && tgot != -2 { // make dynimport JMP table for PE object files.
			tplt := int32(rel.Size())
			ctxt.loader.SetPlt(targ, tplt)
			if su == nil {
				su = ctxt.loader.MakeSymbolUpdater(s)
			}
			r.SetSym(rel.Sym())
			r.SetAdd(int64(tplt))

			// jmp *addr
			switch ctxt.Arch.Family {
			default:
				ctxt.Errorf(s, "unsupported arch %v", ctxt.Arch.Family)
				return
			case sys.I386:
				// ff 25 disp32: indirect jmp through an absolute
				// address, padded to 8 bytes with two NOPs (0x90).
				rel.AddUint8(0xff)
				rel.AddUint8(0x25)
				rel.AddAddrPlus(ctxt.Arch, targ, 0)
				rel.AddUint8(0x90)
				rel.AddUint8(0x90)
			case sys.AMD64:
				// ff 24 25 disp32: indirect jmp through a 32-bit
				// absolute address (SIB form), padded with one NOP.
				rel.AddUint8(0xff)
				rel.AddUint8(0x24)
				rel.AddUint8(0x25)
				rel.AddAddrPlus4(ctxt.Arch, targ, 0)
				rel.AddUint8(0x90)
			}
		} else if tplt >= 0 {
			if su == nil {
				su = ctxt.loader.MakeSymbolUpdater(s)
			}
			r.SetSym(rel.Sym())
			r.SetAdd(int64(tplt))
		}
	}
}

// windynrelocsyms generates a jump table to C library functions that will be
// added later. windynrelocsyms writes the table into the .rel symbol.
func (ctxt *Link) windynrelocsyms() {
	if !(ctxt.IsWindows() && iscgo && ctxt.IsInternal()) {
		return
	}

	rel := ctxt.loader.LookupOrCreateSym(".rel", 0)
	relu := ctxt.loader.MakeSymbolUpdater(rel)
	relu.SetType(sym.STEXT)

	for _, s := range ctxt.Textp2 {
		windynrelocsym(ctxt, relu, s)
	}
	ctxt.Textp2 = append(ctxt.Textp2, rel)
}

func dynrelocsym2(ctxt *Link, s loader.Sym) {
	target := &ctxt.Target
	ldr := ctxt.loader
	syms := &ctxt.ArchSyms
	relocs := ldr.Relocs(s)
	for ri := 0; ri < relocs.Count(); ri++ {
		r := relocs.At2(ri)
		if ctxt.BuildMode == BuildModePIE && ctxt.LinkMode == LinkInternal {
			// It's expected that some relocations will be done
			// later by relocsym (R_TLS_LE, R_ADDROFF), so
			// don't worry if Adddynrel returns false.
			thearch.Adddynrel2(target, ldr, syms, s, r, ri)
			continue
		}

		rSym := r.Sym()
		if rSym != 0 && ldr.SymType(rSym) == sym.SDYNIMPORT || r.Type() >= objabi.ElfRelocOffset {
			if rSym != 0 && !ldr.AttrReachable(rSym) {
				ctxt.Errorf(s, "dynamic relocation to unreachable symbol %s", ldr.SymName(rSym))
			}
			if !thearch.Adddynrel2(target, ldr, syms, s, r, ri) {
				ctxt.Errorf(s, "unsupported dynamic relocation for symbol %s (type=%d (%s) stype=%d (%s))", ldr.SymName(rSym), r.Type(), sym.RelocName(ctxt.Arch, r.Type()), ldr.SymType(rSym), ldr.SymType(rSym))
			}
		}
	}
}

func (state *dodataState) dynreloc2(ctxt *Link) {
	if ctxt.HeadType == objabi.Hwindows {
		return
	}
	// -d suppresses dynamic loader format, so we may as well not
	// compute these sections or mark their symbols as reachable.
	if *FlagD {
		return
	}

	for _, s := range ctxt.Textp2 {
		dynrelocsym2(ctxt, s)
	}
	for _, syms := range state.data2 {
		for _, s := range syms {
			dynrelocsym2(ctxt, s)
		}
	}
	if ctxt.IsELF {
		elfdynhash2(ctxt)
	}
}

func Codeblk(ctxt *Link, out *OutBuf, addr int64, size int64) {
	CodeblkPad(ctxt, out, addr, size, zeros[:])
}

func CodeblkPad(ctxt *Link, out *OutBuf, addr int64, size int64, pad []byte) {
	writeBlocks(out, ctxt.outSem, ctxt.loader, ctxt.Textp2, addr, size, pad)
}

const blockSize = 1 << 20 // 1MB chunks written at a time.

// writeBlocks writes a specified chunk of symbols to the output buffer. It
// breaks the write up into ≥blockSize chunks to write them out, and schedules
// as many goroutines as necessary to accomplish this task. This call then
// blocks, waiting on the writes to complete. Note that we use the sem parameter
// to limit the number of concurrent writes taking place.
func writeBlocks(out *OutBuf, sem chan int, ldr *loader.Loader, syms []loader.Sym, addr, size int64, pad []byte) {
	for i, s := range syms {
		if ldr.SymValue(s) >= addr && !ldr.AttrSubSymbol(s) {
			syms = syms[i:]
			break
		}
	}

	var wg sync.WaitGroup
	max, lastAddr, written := int64(blockSize), addr+size, int64(0)
	for addr < lastAddr {
		// Find the last symbol we'd write.
		idx := -1
		for i, s := range syms {
			if ldr.AttrSubSymbol(s) {
				continue
			}

			// If the next symbol's size would put us out of bounds on the total length,
			// stop looking.
			end := ldr.SymValue(s) + ldr.SymSize(s)
			if end > lastAddr {
				break
			}

			// We're going to write this symbol.
			idx = i

			// If we cross over the max size, we've got enough symbols.
			if end > addr+max {
				break
			}
		}

		// If we didn't find any symbols to write, we're done here.
		if idx < 0 {
			break
		}

		// Compute the length to write, including padding.
		// We need to write to the end address (lastAddr), or the next symbol's
		// start address, whichever comes first. If there are no more symbols,
		// just write to lastAddr. This ensures we don't leave holes between the
		// blocks or at the end.
		length := int64(0)
		if idx+1 < len(syms) {
			// Find the next top-level symbol.
			// Skip over sub symbols so we won't split a container symbol
			// into two blocks.
			next := syms[idx+1]
			for ldr.AttrSubSymbol(next) {
				idx++
				next = syms[idx+1]
			}
			length = ldr.SymValue(next) - addr
		}
		if length == 0 || length > lastAddr-addr {
			length = lastAddr - addr
		}

		// Start the block output operator.
		if o, err := out.View(uint64(out.Offset() + written)); err == nil {
			sem <- 1
			wg.Add(1)
			go func(o *OutBuf, ldr *loader.Loader, syms []loader.Sym, addr, size int64, pad []byte) {
				writeBlock(o, ldr, syms, addr, size, pad)
				wg.Done()
				<-sem
			}(o, ldr, syms, addr, length, pad)
		} else { // output not mmaped, don't parallelize.
			writeBlock(out, ldr, syms, addr, length, pad)
		}

		// Prepare for the next loop.
		if idx != -1 {
			syms = syms[idx+1:]
		}
		written += length
		addr += length
	}
	wg.Wait()
}
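
// The sem channel above acts as a counting semaphore: a slot is acquired
// before each writer goroutine starts and released when it finishes, which
// bounds the number of concurrent block writes. In isolation the pattern
// looks like this (the channel capacity shown is an assumption for
// illustration, not necessarily how ctxt.outSem is sized):
//
//	sem := make(chan int, runtime.GOMAXPROCS(0))
//	sem <- 1 // acquire a slot
//	go func() {
//		// ... write one block ...
//		<-sem // release the slot
//	}()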

func writeBlock(out *OutBuf, ldr *loader.Loader, syms []loader.Sym, addr, size int64, pad []byte) {
	for i, s := range syms {
		if ldr.SymValue(s) >= addr && !ldr.AttrSubSymbol(s) {
			syms = syms[i:]
			break
		}
	}
// This doesn't distinguish the memory size from the file
// size, and it lays out the file based on Symbol.Value, which
// is the virtual address. DWARF compression changes file sizes,
// so dwarfcompress will fix this up later if necessary.
	eaddr := addr + size
	for _, s := range syms {
		if ldr.AttrSubSymbol(s) {
			continue
		}
		val := ldr.SymValue(s)
		if val >= eaddr {
			break
		}
		if val < addr {
			ldr.Errorf(s, "phase error: addr=%#x but sym=%#x type=%d", addr, val, ldr.SymType(s))
			errorexit()
		}
		if addr < val {
			out.WriteStringPad("", int(val-addr), pad)
			addr = val
		}
		out.WriteSym(ldr, s)
		addr += int64(len(ldr.Data(s)))
		siz := ldr.SymSize(s)
		if addr < val+siz {
			out.WriteStringPad("", int(val+siz-addr), pad)
			addr = val + siz
		}
		if addr != val+siz {
			ldr.Errorf(s, "phase error: addr=%#x value+size=%#x", addr, val+siz)
			errorexit()
		}
		if val+siz >= eaddr {
			break
		}
	}
	if addr < eaddr {
		out.WriteStringPad("", int(eaddr-addr), pad)
	}
}

type writeFn func(*Link, *OutBuf, int64, int64)

// WriteParallel handles scheduling parallel execution of data write functions.
func WriteParallel(wg *sync.WaitGroup, fn writeFn, ctxt *Link, seek, vaddr, length uint64) {
	if out, err := ctxt.Out.View(seek); err != nil {
		ctxt.Out.SeekSet(int64(seek))
		fn(ctxt, ctxt.Out, int64(vaddr), int64(length))
	} else {
		wg.Add(1)
		go func() {
			defer wg.Done()
			fn(ctxt, out, int64(vaddr), int64(length))
		}()
	}
}
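
// Illustrative use (hypothetical variable names; the real callers live
// in the per-format asmb phases, scheduling Codeblk/Datblk/Dwarfblk):
//
//	var wg sync.WaitGroup
//	WriteParallel(&wg, Datblk, ctxt, fileOffset, vaddr, length)
//	wg.Wait()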

func Datblk(ctxt *Link, out *OutBuf, addr, size int64) {
	writeDatblkToOutBuf(ctxt, out, addr, size)
}

// Used only on Wasm for now.
func DatblkBytes(ctxt *Link, addr int64, size int64) []byte {
	buf := make([]byte, size)
	out := &OutBuf{heap: buf}
	writeDatblkToOutBuf(ctxt, out, addr, size)
	return buf
}

func writeDatblkToOutBuf(ctxt *Link, out *OutBuf, addr int64, size int64) {
	writeBlocks(out, ctxt.outSem, ctxt.loader, ctxt.datap2, addr, size, zeros[:])
}

func Dwarfblk(ctxt *Link, out *OutBuf, addr int64, size int64) {
	// Concatenate the section symbol lists into a single list to pass
	// to writeBlocks.
	//
	// NB: ideally we would do a separate writeBlocks call for each
	// section, but this would run the risk of undoing any file offset
	// adjustments made during layout.
	n := 0
	for i := range dwarfp2 {
		n += len(dwarfp2[i].syms)
	}
	syms := make([]loader.Sym, 0, n)
	for i := range dwarfp2 {
		syms = append(syms, dwarfp2[i].syms...)
	}
	writeBlocks(out, ctxt.outSem, ctxt.loader, syms, addr, size, zeros[:])
}

var zeros [512]byte

var (
	strdata  = make(map[string]string)
	strnames []string
)

func addstrdata1(ctxt *Link, arg string) {
	eq := strings.Index(arg, "=")
	dot := strings.LastIndex(arg[:eq+1], ".")
	if eq < 0 || dot < 0 {
		Exitf("-X flag requires argument of the form importpath.name=value")
	}
	pkg := arg[:dot]
	if ctxt.BuildMode == BuildModePlugin && pkg == "main" {
		pkg = *flagPluginPath
	}
	pkg = objabi.PathToPrefix(pkg)
	name := pkg + arg[dot:eq]
	value := arg[eq+1:]
	if _, ok := strdata[name]; !ok {
		strnames = append(strnames, name)
	}
	strdata[name] = value
}

// addstrdata sets the initial value of the string variable name to value.
func addstrdata(arch *sys.Arch, l *loader.Loader, name, value string) {
	s := l.Lookup(name, 0)
	if s == 0 {
		return
	}
	if goType := l.SymGoType(s); goType == 0 {
		return
	} else if typeName := l.SymName(goType); typeName != "type.string" {
		Errorf(nil, "%s: cannot set with -X: not a var of type string (%s)", name, typeName)
		return
	}
	if !l.AttrReachable(s) {
		return // don't bother setting unreachable variable
	}
	bld := l.MakeSymbolUpdater(s)
	if bld.Type() == sym.SBSS {
		bld.SetType(sym.SDATA)
	}

	p := fmt.Sprintf("%s.str", name)
	sp := l.LookupOrCreateSym(p, 0)
	sbld := l.MakeSymbolUpdater(sp)

	sbld.Addstring(value)
	sbld.SetType(sym.SRODATA)

	bld.SetSize(0)
	bld.SetData(make([]byte, 0, arch.PtrSize*2))
	bld.SetReadOnly(false)
	bld.SetRelocs(nil)
	bld.AddAddrPlus(arch, sp, 0)
	bld.AddUint(arch, uint64(len(value)))
}

func (ctxt *Link) dostrdata() {
	for _, name := range strnames {
		addstrdata(ctxt.Arch, ctxt.loader, name, strdata[name])
	}
}

// addgostring adds str, as a Go string value, to s. symname is the name of the
// symbol used to define the string data and must be unique per linked object.
func addgostring(ctxt *Link, ldr *loader.Loader, s *loader.SymbolBuilder, symname, str string) {
	sdata := ldr.CreateSymForUpdate(symname, 0)
	if sdata.Type() != sym.Sxxx {
		ctxt.Errorf(s.Sym(), "duplicate symname in addgostring: %s", symname)
	}
	sdata.SetReachable(true)
	sdata.SetLocal(true)
	sdata.SetType(sym.SRODATA)
	sdata.SetSize(int64(len(str)))
	sdata.SetData([]byte(str))
	s.AddAddr(ctxt.Arch, sdata.Sym())
	s.AddUint(ctxt.Arch, uint64(len(str)))
}

func addinitarrdata(ctxt *Link, ldr *loader.Loader, s loader.Sym) {
	p := ldr.SymName(s) + ".ptr"
	sp := ldr.CreateSymForUpdate(p, 0)
	sp.SetType(sym.SINITARR)
	sp.SetSize(0)
	sp.SetDuplicateOK(true)
	sp.AddAddr(ctxt.Arch, s)
}

// symalign2 returns the required alignment for the given symbol s.
func (state *dodataState) symalign2(s loader.Sym) int32 {
	min := int32(thearch.Minalign)
	ldr := state.ctxt.loader
	align := ldr.SymAlign(s)
	if align >= min {
		return align
	} else if align != 0 {
		return min
	}
	// FIXME: figure out a way to avoid checking by name here.
	sname := ldr.SymName(s)
	if strings.HasPrefix(sname, "go.string.") || strings.HasPrefix(sname, "type..namedata.") {
		// String data is just bytes.
		// If we align it, we waste a lot of space to padding.
		return min
	}
	align = int32(thearch.Maxalign)
	ssz := ldr.SymSize(s)
	for int64(align) > ssz && align > min {
		align >>= 1
	}
	ldr.SetSymAlign(s, align)
	return align
}

func aligndatsize2(state *dodataState, datsize int64, s loader.Sym) int64 {
	return Rnd(datsize, int64(state.symalign2(s)))
}
RegexpMatchEasy0_32 195ns × (1.00,1.01) 155ns × (0.99,1.01) -20.43% (p=0.000)
RegexpMatchEasy0_1K 479ns × (0.98,1.03) 535ns × (0.99,1.02) +11.59% (p=0.000)
RegexpMatchEasy1_32 169ns × (0.99,1.02) 131ns × (0.99,1.03) -22.44% (p=0.000)
RegexpMatchEasy1_1K 1.53µs × (0.99,1.01) 0.87µs × (0.99,1.02) -43.07% (p=0.000)
RegexpMatchMedium_32 334ns × (0.99,1.01) 242ns × (0.99,1.01) -27.53% (p=0.000)
RegexpMatchMedium_1K 125µs × (1.00,1.01) 72µs × (0.99,1.03) -42.53% (p=0.000)
RegexpMatchHard_32 6.03µs × (0.99,1.01) 3.79µs × (0.99,1.01) -37.12% (p=0.000)
RegexpMatchHard_1K 189µs × (0.99,1.02) 115µs × (0.99,1.01) -39.20% (p=0.000)
Revcomp 935ms × (0.96,1.03) 926ms × (0.98,1.02) ~ (p=0.083)
Template 146ms × (0.97,1.05) 119ms × (0.99,1.01) -18.37% (p=0.000)
TimeParse 660ns × (0.99,1.01) 624ns × (0.99,1.02) -5.43% (p=0.000)
TimeFormat 670ns × (0.98,1.02) 710ns × (1.00,1.01) +5.97% (p=0.000)
This CL is a bit larger than I would like, but the compiler, linker, runtime,
and package reflect all need to be in sync about the format of these programs,
so there is no easy way to split this into independent changes (at least
while keeping the build working at each change).
Fixes #9625.
Fixes #10524.
Change-Id: I9e3e20d6097099d0f8532d1cb5b1af528804989a
Reviewed-on: https://go-review.googlesource.com/9888
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Russ Cox <rsc@golang.org>
const debugGCProg = false
type GCProg2 struct {
	ctxt *Link
	sym  *loader.SymbolBuilder
	w    gcprog.Writer
}

func (p *GCProg2) Init(ctxt *Link, name string) {
	p.ctxt = ctxt
	symIdx := ctxt.loader.LookupOrCreateSym(name, 0)
	p.sym = ctxt.loader.MakeSymbolUpdater(symIdx)
	p.w.Init(p.writeByte())
	if debugGCProg {
		fmt.Fprintf(os.Stderr, "ld: start GCProg %s\n", name)
		p.w.Debug(os.Stderr)
	}
}

func (p *GCProg2) writeByte() func(x byte) {
	return func(x byte) {
		p.sym.AddUint8(x)
	}
}

func (p *GCProg2) End(size int64) {
	p.w.ZeroUntil(size / int64(p.ctxt.Arch.PtrSize))
	p.w.End()
	if debugGCProg {
		fmt.Fprintf(os.Stderr, "ld: end GCProg\n")
	}
}
func (p *GCProg2) AddSym(s loader.Sym) {
	ldr := p.ctxt.loader
	typ := ldr.SymGoType(s)

	// Things without pointers should be in sym.SNOPTRDATA or sym.SNOPTRBSS;
	// everything we see should have pointers and should therefore have a type.
	if typ == 0 {
		switch ldr.SymName(s) {
		case "runtime.data", "runtime.edata", "runtime.bss", "runtime.ebss":
			// Ignore special symbols that are sometimes laid out
			// as real symbols. See comment about dyld on darwin in
			// the address function.
			return
		}
		p.ctxt.Errorf(p.sym.Sym(), "missing Go type information for global symbol %s: size %d", ldr.SymName(s), ldr.SymSize(s))
		return
	}

	ptrsize := int64(p.ctxt.Arch.PtrSize)
	typData := ldr.Data(typ)
	nptr := decodetypePtrdata(p.ctxt.Arch, typData) / ptrsize

	if debugGCProg {
		fmt.Fprintf(os.Stderr, "gcprog sym: %s at %d (ptr=%d+%d)\n", ldr.SymName(s), ldr.SymValue(s), ldr.SymValue(s)/ptrsize, nptr)
	}

	sval := ldr.SymValue(s)
	if decodetypeUsegcprog(p.ctxt.Arch, typData) == 0 {
		// Copy pointers from mask into program.
		mask := decodetypeGcmask(p.ctxt, typ)
		for i := int64(0); i < nptr; i++ {
			if (mask[i/8]>>uint(i%8))&1 != 0 {
				p.w.Ptr(sval/ptrsize + i)
			}
		}
		return
	}

	// Copy program.
	prog := decodetypeGcprog(p.ctxt, typ)
	p.w.ZeroUntil(sval / ptrsize)
	p.w.Append(prog[4:], nptr)
}
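The mask-copy path in AddSym above walks a byte-packed pointer bitmap bit by bit, emitting a pointer offset for each set bit. A minimal standalone sketch of that iteration (the `ptrOffsets` helper name is hypothetical, not part of this package):

```go
package main

import "fmt"

// ptrOffsets returns base+i for every bit i that is set in a
// little-endian, byte-packed bitmap of nptr bits — the same walk
// AddSym performs when copying a gcmask into the GC program.
func ptrOffsets(mask []byte, nptr, base int64) []int64 {
	var out []int64
	for i := int64(0); i < nptr; i++ {
		if (mask[i/8]>>uint(i%8))&1 != 0 {
			out = append(out, base+i)
		}
	}
	return out
}

func main() {
	// 0x05 = 0b00000101: words 0 and 2 hold pointers.
	fmt.Println(ptrOffsets([]byte{0x05}, 8, 10)) // [10 12]
}
```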
// cutoff is the maximum data section size permitted by the linker
// (see issue #9862).
const cutoff = 2e9 // 2 GB (or so; looks better in errors than 2^31)
func (state *dodataState) checkdatsize(symn sym.SymKind) {
	if state.datsize > cutoff {
		Errorf(nil, "too much data in section %v (over %v bytes)", symn, cutoff)
	}
}
// fixZeroSizedSymbols gives a few special symbols with zero size some space.
func fixZeroSizedSymbols2(ctxt *Link) {
	// The values in moduledata are filled out by relocations
	// pointing to the addresses of these special symbols.
	// Typically these symbols have no size and are not laid
	// out with their matching section.
	//
	// However on darwin, dyld will find the special symbol
	// in the first loaded module, even though it is local.
	//
	// (A hypothesis, formed without looking in the dyld sources:
	// these special symbols have no size, so their address
	// matches a real symbol. The dynamic linker assumes we
	// want the normal symbol with the same address and finds
	// it in the other module.)
	//
	// To work around this we lay out the symbols whose
	// addresses are vital for multi-module programs to work
	// as normal symbols, and give them a little size.
	//
	// On AIX, as all DATA sections are merged together, ld might not put
	// these symbols at the beginning of their respective section if there
	// aren't real symbols: their alignment might not match the first
	// symbol's alignment. Therefore, they are explicitly put at the
	// beginning of their section with the same alignment.
	if !(ctxt.DynlinkingGo() && ctxt.HeadType == objabi.Hdarwin) && !(ctxt.HeadType == objabi.Haix && ctxt.LinkMode == LinkExternal) {
		return
	}

	ldr := ctxt.loader
	bss := ldr.CreateSymForUpdate("runtime.bss", 0)
	bss.SetSize(8)
	ldr.SetAttrSpecial(bss.Sym(), false)

	ebss := ldr.CreateSymForUpdate("runtime.ebss", 0)
	ldr.SetAttrSpecial(ebss.Sym(), false)

	data := ldr.CreateSymForUpdate("runtime.data", 0)
	data.SetSize(8)
	ldr.SetAttrSpecial(data.Sym(), false)

	edata := ldr.CreateSymForUpdate("runtime.edata", 0)
	ldr.SetAttrSpecial(edata.Sym(), false)

	if ctxt.HeadType == objabi.Haix {
		// XCOFF TOC symbols are part of the .data section.
		edata.SetType(sym.SXCOFFTOC)
	}

	types := ldr.CreateSymForUpdate("runtime.types", 0)
	types.SetType(sym.STYPE)
	types.SetSize(8)
	ldr.SetAttrSpecial(types.Sym(), false)

	etypes := ldr.CreateSymForUpdate("runtime.etypes", 0)
	etypes.SetType(sym.SFUNCTAB)
	ldr.SetAttrSpecial(etypes.Sym(), false)

	if ctxt.HeadType == objabi.Haix {
		rodata := ldr.CreateSymForUpdate("runtime.rodata", 0)
		rodata.SetType(sym.SSTRING)
		rodata.SetSize(8)
		ldr.SetAttrSpecial(rodata.Sym(), false)

		erodata := ldr.CreateSymForUpdate("runtime.erodata", 0)
		ldr.SetAttrSpecial(erodata.Sym(), false)
	}
}
// makeRelroForSharedLib creates a section of read-only data if necessary.
func (state *dodataState) makeRelroForSharedLib2(target *Link) {
	if !target.UseRelro() {
		return
	}

	// "read only" data with relocations needs to go in its own section
	// when building a shared library. We do this by boosting objects of
	// type SXXX with relocations to type SXXXRELRO.
	ldr := target.loader
	for _, symnro := range sym.ReadOnly {
		symnrelro := sym.RelROMap[symnro]

		ro := []loader.Sym{}
		relro := state.data2[symnrelro]

		for _, s := range state.data2[symnro] {
			relocs := ldr.Relocs(s)
			isRelro := relocs.Count() > 0
			switch state.symType(s) {
			case sym.STYPE, sym.STYPERELRO, sym.SGOFUNCRELRO:
				// Symbols are not sorted yet, so it is possible
				// that an Outer symbol has been changed to a
				// relro type before it reaches here.
				isRelro = true
			case sym.SFUNCTAB:
				if target.IsAIX() && ldr.SymName(s) == "runtime.etypes" {
					// runtime.etypes must be at the end of
					// the relro data.
					isRelro = true
				}
			}
			if isRelro {
				state.setSymType(s, symnrelro)
				if outer := ldr.OuterSym(s); outer != 0 {
					state.setSymType(outer, symnrelro)
				}
				relro = append(relro, s)
			} else {
				ro = append(ro, s)
			}
		}

		// Check that we haven't made two symbols with the same .Outer into
		// different types (because references to symbols with non-nil Outer
		// become references to the outer symbol + offset, it's vital that the
		// symbol and the outer end up in the same section).
		for _, s := range relro {
			if outer := ldr.OuterSym(s); outer != 0 {
				st := state.symType(s)
				ost := state.symType(outer)
				if st != ost {
					state.ctxt.Errorf(s, "inconsistent types for symbol and its Outer %s (%v != %v)",
						ldr.SymName(outer), st, ost)
				}
			}
		}

		state.data2[symnro] = ro
		state.data2[symnrelro] = relro
	}
}
// dodataState holds bits of state information needed by dodata() and the
// various helpers it calls. The lifetime of these items should not extend
// past the end of dodata().
type dodataState struct {
	// Link context
	ctxt *Link

	// Data symbols bucketed by type.
	data [sym.SXREF][]*sym.Symbol

	// Data symbols bucketed by type (loader symbol indices).
	data2 [sym.SXREF][]loader.Sym

	// Max alignment for each flavor of data symbol.
	dataMaxAlign [sym.SXREF]int32

	// Overridden sym type
	symGroupType []sym.SymKind

	// Current data size so far.
	datsize int64
}
// A note on symType/setSymType below:
//
// In the legacy linker, the types of symbols (notably data symbols) are
// changed during the symtab() phase so as to ensure that similar symbols
// are bucketed together, then their types are changed back again during
// dodata. Symbol-to-section assignment also plays tricks along these lines
// in the case where a relro segment is needed.
//
// The value returned from symType() below reflects the effects of
// any overrides made by symtab and/or dodata.

// symType returns the (possibly overridden) type of 's'.
func (state *dodataState) symType(s loader.Sym) sym.SymKind {
	if int(s) < len(state.symGroupType) {
		if override := state.symGroupType[s]; override != 0 {
			return override
		}
	}
	return state.ctxt.loader.SymType(s)
}

// setSymType sets a new override type for 's'.
func (state *dodataState) setSymType(s loader.Sym, kind sym.SymKind) {
	if s == 0 {
		panic("bad")
	}
	if int(s) < len(state.symGroupType) {
		state.symGroupType[s] = kind
	} else {
		su := state.ctxt.loader.MakeSymbolUpdater(s)
		su.SetType(kind)
	}
}
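The symType/setSymType pair above implements a common pattern: a dense slice indexed by symbol ID holds an override value, with the zero value meaning "fall through to the underlying source of truth". A standalone sketch of that pattern (the `kindOverride` type and `int` kinds are hypothetical stand-ins for `dodataState` and `sym.SymKind`):

```go
package main

import "fmt"

// kindOverride mirrors the symGroupType idea: a slice indexed by
// symbol ID holds an override, with zero meaning "no override",
// and a fallback function plays the role of the loader's SymType.
type kindOverride struct {
	override []int
	base     func(id int) int
}

func (k *kindOverride) get(id int) int {
	if id < len(k.override) {
		if ov := k.override[id]; ov != 0 {
			return ov // overridden kind wins
		}
	}
	return k.base(id) // fall through to the real kind
}

func main() {
	k := &kindOverride{
		override: []int{0, 7, 0}, // only symbol 1 is overridden
		base:     func(id int) int { return 100 + id },
	}
	fmt.Println(k.get(1), k.get(2), k.get(9)) // 7 102 109
}
```

The zero-means-unset convention is what lets the override table be a flat slice rather than a map, which matters when it is indexed by every loader symbol.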
func (ctxt *Link) dodata2(symGroupType []sym.SymKind) {
	// Give zero-sized symbols space if necessary.
	fixZeroSizedSymbols2(ctxt)

	// Collect data symbols by type into data.
	state := dodataState{ctxt: ctxt, symGroupType: symGroupType}
	ldr := ctxt.loader
	for s := loader.Sym(1); s < loader.Sym(ldr.NSym()); s++ {
		if !ldr.AttrReachable(s) || ldr.AttrSpecial(s) || ldr.AttrSubSymbol(s) ||
			!ldr.TopLevelSym(s) {
			continue
		}

		st := state.symType(s)
		if st <= sym.STEXT || st >= sym.SXREF {
			continue
		}
		state.data2[st] = append(state.data2[st], s)

		// Similarly with checking the onlist attr.
		if ldr.AttrOnList(s) {
			log.Fatalf("symbol %s listed multiple times", ldr.SymName(s))
		}
		ldr.SetAttrOnList(s, true)
	}

	// Now that we have the data symbols, but before we start
	// to assign addresses, record all the necessary
	// dynamic relocations. These will grow the relocation
	// symbol, which is itself data.
	//
	// On darwin, we need the symbol table numbers for dynreloc.
	if ctxt.HeadType == objabi.Hdarwin {
		machosymorder(ctxt)
	}
	state.dynreloc2(ctxt)

	// Move any RO data with relocations to a separate section.
	state.makeRelroForSharedLib2(ctxt)

	// Set explicit alignment here, so as to avoid having to update
	// symbol alignment in dodataSect2, which would cause a concurrent
	// map read/write violation.
	// NOTE: this needs to be done after dynreloc2, where symbol size
	// may change.
	for _, list := range state.data2 {
		for _, s := range list {
			state.symalign2(s)
		}
	}

	// Sort symbols.
	var wg sync.WaitGroup
	for symn := range state.data2 {
		symn := sym.SymKind(symn)
		wg.Add(1)
		go func() {
			state.data2[symn], state.dataMaxAlign[symn] = state.dodataSect2(ctxt, symn, state.data2[symn])
			wg.Done()
		}()
	}
	wg.Wait()

	if ctxt.IsELF {
		// Make .rela and .rela.plt contiguous: the ELF ABI requires this,
		// and Solaris actually cares.
		syms := state.data2[sym.SELFROSECT]
		reli, plti := -1, -1
		for i, s := range syms {
			switch ldr.SymName(s) {
			case ".rel.plt", ".rela.plt":
				plti = i
			case ".rel", ".rela":
				reli = i
			}
		}
		if reli >= 0 && plti >= 0 && plti != reli+1 {
			var first, second int
			if plti > reli {
				first, second = reli, plti
			} else {
				first, second = plti, reli
			}
			rel, plt := syms[reli], syms[plti]
			copy(syms[first+2:], syms[first+1:second])
			syms[first+0] = rel
			syms[first+1] = plt

			// Make sure alignment doesn't introduce a gap.
			// Setting the alignment explicitly prevents
			// symalign from basing it on the size and
			// getting it wrong.
			ldr.SetSymAlign(rel, int32(ctxt.Arch.RegSize))
			ldr.SetSymAlign(plt, int32(ctxt.Arch.RegSize))
		}
		state.data2[sym.SELFROSECT] = syms
	}

	if ctxt.HeadType == objabi.Haix && ctxt.LinkMode == LinkExternal {
		// These symbols must have the same alignment as their section.
		// Otherwise, ld might change the layout of Go sections.
		ldr.SetSymAlign(ldr.Lookup("runtime.data", 0), state.dataMaxAlign[sym.SDATA])
		ldr.SetSymAlign(ldr.Lookup("runtime.bss", 0), state.dataMaxAlign[sym.SBSS])
	}

	// Create *sym.Section objects and assign symbols to sections for
	// data/rodata (and related) symbols.
	state.allocateDataSections2(ctxt)

	// Create *sym.Section objects and assign symbols to sections for
	// DWARF symbols.
	state.allocateDwarfSections2(ctxt)

	/* number the sections */
	n := int16(1)
	for _, sect := range Segtext.Sections {
		sect.Extnum = n
		n++
	}
	for _, sect := range Segrodata.Sections {
		sect.Extnum = n
		n++
	}
	for _, sect := range Segrelrodata.Sections {
		sect.Extnum = n
		n++
	}
	for _, sect := range Segdata.Sections {
		sect.Extnum = n
		n++
	}
	for _, sect := range Segdwarf.Sections {
		sect.Extnum = n
		n++
	}
}
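The .rel/.rela adjacency fix inside dodata2 is a small stable rotation: the two relocation symbols are pulled to the front of the affected range while everything between them shifts up by one, using `copy` (which, like memmove, handles overlapping slices). A standalone sketch of the same trick (the `makeAdjacent` helper name is hypothetical):

```go
package main

import "fmt"

// makeAdjacent moves s[i] and s[j] (in that order) to positions
// min(i,j) and min(i,j)+1, shifting the intervening elements up —
// the same copy trick dodata2 uses to keep ".rel"/".rel.plt"
// (or ".rela"/".rela.plt") contiguous in SELFROSECT.
func makeAdjacent(s []string, i, j int) {
	first, second := i, j
	if first > second {
		first, second = second, first
	}
	a, b := s[i], s[j] // capture before shifting
	copy(s[first+2:], s[first+1:second])
	s[first] = a
	s[first+1] = b
}

func main() {
	syms := []string{".rela", "x", "y", ".rela.plt", "z"}
	makeAdjacent(syms, 0, 3)
	fmt.Println(syms) // [.rela .rela.plt x y z]
}
```

Note the relative order of the untouched symbols ("x", "y", "z") is preserved, which matters because the list has already been sorted at this point.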
// allocateDataSectionForSym creates a new sym.Section into which a
// single symbol will be placed. Here "seg" is the segment into which
// the section will go, "s" is the symbol to be placed into the new
// section, and "rwx" contains permissions for the section.
func (state *dodataState) allocateDataSectionForSym2(seg *sym.Segment, s loader.Sym, rwx int) *sym.Section {
	ldr := state.ctxt.loader
	sname := ldr.SymName(s)
	sect := addsection(ldr, state.ctxt.Arch, seg, sname, rwx)
	sect.Align = state.symalign2(s)
	state.datsize = Rnd(state.datsize, int64(sect.Align))
	sect.Vaddr = uint64(state.datsize)
	return sect
}
// allocateNamedDataSection creates a new sym.Section for a category
// of data symbols. Here "seg" is the segment into which the section
// will go, "sName" is the name to give to the section, "types" is a
// range of symbol types to be put into the section, and "rwx"
// contains permissions for the section.
func (state *dodataState) allocateNamedDataSection(seg *sym.Segment, sName string, types []sym.SymKind, rwx int) *sym.Section {
	sect := addsection(state.ctxt.loader, state.ctxt.Arch, seg, sName, rwx)
	if len(types) == 0 {
		sect.Align = 1
	} else if len(types) == 1 {
		sect.Align = state.dataMaxAlign[types[0]]
	} else {
		for _, symn := range types {
			align := state.dataMaxAlign[symn]
			if sect.Align < align {
				sect.Align = align
			}
		}
	}
	state.datsize = Rnd(state.datsize, int64(sect.Align))
	sect.Vaddr = uint64(state.datsize)
	return sect
}
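Both allocators above round the running `datsize` up to the section alignment before recording the section's `Vaddr`. `Rnd` is the usual round-up-to-a-multiple; a minimal sketch of the arithmetic (the lowercase `rnd` helper is a stand-in, not the linker's own implementation, and assumes a non-negative alignment):

```go
package main

import "fmt"

// rnd rounds v up to the next multiple of align, matching how
// datsize is aligned before a section's Vaddr is assigned.
// align <= 0 leaves v unchanged.
func rnd(v, align int64) int64 {
	if align <= 0 {
		return v
	}
	if rem := v % align; rem != 0 {
		v += align - rem
	}
	return v
}

func main() {
	fmt.Println(rnd(100, 8), rnd(104, 8), rnd(7, 1)) // 104 104 7
}
```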
// assignDsymsToSection assigns a collection of data symbols to a
// newly created section. "sect" is the section into which to place
// the symbols, "syms" holds the list of symbols to assign,
// "forceType" (if non-zero) contains a new sym type to apply to each
// sym during the assignment, and "aligner" is a hook to call to
// handle alignment during the assignment process.
func (state *dodataState) assignDsymsToSection2(sect *sym.Section, syms []loader.Sym, forceType sym.SymKind, aligner func(state *dodataState, datsize int64, s loader.Sym) int64) {
	ldr := state.ctxt.loader
	for _, s := range syms {
		state.datsize = aligner(state, state.datsize, s)
		ldr.SetSymSect(s, sect)
		if forceType != sym.Sxxx {
			state.setSymType(s, forceType)
		}
		ldr.SetSymValue(s, int64(uint64(state.datsize)-sect.Vaddr))
		state.datsize += ldr.SymSize(s)
	}
	sect.Length = uint64(state.datsize) - sect.Vaddr
}
func (state *dodataState) assignToSection2(sect *sym.Section, symn sym.SymKind, forceType sym.SymKind) {
	state.assignDsymsToSection2(sect, state.data2[symn], forceType, aligndatsize2)
	state.checkdatsize(symn)
}
// allocateSingleSymSections walks through the bucketed data symbols
// with type 'symn', creates a new section for each sym, and assigns
// the sym to a newly created section. Section name is set from the
// symbol name. "seg" is the segment into which to place the new
// section, "forceType" is the new sym.SymKind to assign to the symbol
// within the section, and "rwx" holds section permissions.
func (state *dodataState) allocateSingleSymSections2(seg *sym.Segment, symn sym.SymKind, forceType sym.SymKind, rwx int) {
	ldr := state.ctxt.loader
	for _, s := range state.data2[symn] {
		sect := state.allocateDataSectionForSym2(seg, s, rwx)
		ldr.SetSymSect(s, sect)
		state.setSymType(s, forceType)
		ldr.SetSymValue(s, int64(uint64(state.datsize)-sect.Vaddr))
		state.datsize += ldr.SymSize(s)
		sect.Length = uint64(state.datsize) - sect.Vaddr
	}
	state.checkdatsize(symn)
}
// allocateNamedSectionAndAssignSyms creates a new section with the
// specified name, then walks through the bucketed data symbols with
// type 'symn' and assigns each of them to this new section. "seg" is
// the segment into which to place the new section, "secName" is the
// name to give to the new section, "forceType" (if non-zero) contains
// a new sym type to apply to each sym during the assignment, and
// "rwx" holds section permissions.
func (state *dodataState) allocateNamedSectionAndAssignSyms2(seg *sym.Segment, secName string, symn sym.SymKind, forceType sym.SymKind, rwx int) *sym.Section {
	sect := state.allocateNamedDataSection(seg, secName, []sym.SymKind{symn}, rwx)
	state.assignDsymsToSection2(sect, state.data2[symn], forceType, aligndatsize2)
	return sect
}
// allocateDataSections allocates sym.Section objects for data/rodata
// (and related) symbols, and then assigns symbols to those sections.
func (state *dodataState) allocateDataSections2(ctxt *Link) {
	// Allocate sections.
	// Data is processed before segtext, because we need
	// to see all symbols in the .data and .bss sections in order
	// to generate garbage collection information.

	// Writable data sections that do not need any specialized handling.
	writable := []sym.SymKind{
		sym.SBUILDINFO,
		sym.SELFSECT,
		sym.SMACHO,
		sym.SMACHOGOT,
		sym.SWINDOWS,
	}
	for _, symn := range writable {
		state.allocateSingleSymSections2(&Segdata, symn, sym.SDATA, 06)
	}
	ldr := ctxt.loader

	// .got (and .toc on ppc64)
	if len(state.data2[sym.SELFGOT]) > 0 {
		sect := state.allocateNamedSectionAndAssignSyms2(&Segdata, ".got", sym.SELFGOT, sym.SDATA, 06)
		if ctxt.IsPPC64() {
			for _, s := range state.data2[sym.SELFGOT] {
				// Resolve .TOC. symbol for this object file (ppc64).
				toc := ldr.Lookup(".TOC.", int(ldr.SymVersion(s)))
				if toc != 0 {
					ldr.SetSymSect(toc, sect)
					ldr.PrependSub(s, toc)
					ldr.SetSymValue(toc, 0x8000)
				}
			}
		}
	}
	/* pointer-free data */
	sect := state.allocateNamedSectionAndAssignSyms2(&Segdata, ".noptrdata", sym.SNOPTRDATA, sym.SDATA, 06)
	ldr.SetSymSect(ldr.LookupOrCreateSym("runtime.noptrdata", 0), sect)
	ldr.SetSymSect(ldr.LookupOrCreateSym("runtime.enoptrdata", 0), sect)

	hasinitarr := ctxt.linkShared

	/* shared library initializer */
	switch ctxt.BuildMode {
	case BuildModeCArchive, BuildModeCShared, BuildModeShared, BuildModePlugin:
		hasinitarr = true
	}

	if ctxt.HeadType == objabi.Haix {
		if len(state.data2[sym.SINITARR]) > 0 {
			Errorf(nil, "XCOFF format doesn't allow .init_array section")
		}
	}

	if hasinitarr && len(state.data2[sym.SINITARR]) > 0 {
		state.allocateNamedSectionAndAssignSyms2(&Segdata, ".init_array", sym.SINITARR, sym.Sxxx, 06)
	}

	/* data */
	sect = state.allocateNamedSectionAndAssignSyms2(&Segdata, ".data", sym.SDATA, sym.SDATA, 06)
	ldr.SetSymSect(ldr.LookupOrCreateSym("runtime.data", 0), sect)
	ldr.SetSymSect(ldr.LookupOrCreateSym("runtime.edata", 0), sect)
	dataGcEnd := state.datsize - int64(sect.Vaddr)

	// On AIX, TOC entries must be the last of .data.
	// These aren't part of gc as they won't change during the runtime.
	state.assignToSection2(sect, sym.SXCOFFTOC, sym.SDATA)
	state.checkdatsize(sym.SDATA)
	sect.Length = uint64(state.datsize) - sect.Vaddr

	/* bss */
	sect = state.allocateNamedSectionAndAssignSyms2(&Segdata, ".bss", sym.SBSS, sym.Sxxx, 06)
	ldr.SetSymSect(ldr.LookupOrCreateSym("runtime.bss", 0), sect)
	ldr.SetSymSect(ldr.LookupOrCreateSym("runtime.ebss", 0), sect)
	bssGcEnd := state.datsize - int64(sect.Vaddr)

	// Emit gcdata for bss symbols now that symbol values have been assigned.
	gcsToEmit := []struct {
		symName string
		symKind sym.SymKind
		gcEnd   int64
	}{
		{"runtime.gcdata", sym.SDATA, dataGcEnd},
		{"runtime.gcbss", sym.SBSS, bssGcEnd},
	}
	for _, g := range gcsToEmit {
		var gc GCProg2
		gc.Init(ctxt, g.symName)
		for _, s := range state.data2[g.symKind] {
			gc.AddSym(s)
		}
		gc.End(g.gcEnd)
	}

	/* pointer-free bss */
	sect = state.allocateNamedSectionAndAssignSyms2(&Segdata, ".noptrbss", sym.SNOPTRBSS, sym.Sxxx, 06)
	ldr.SetSymSect(ldr.LookupOrCreateSym("runtime.noptrbss", 0), sect)
	ldr.SetSymSect(ldr.LookupOrCreateSym("runtime.ebss", 0), sect)
	ldr.SetSymSect(ldr.LookupOrCreateSym("runtime.enoptrbss", 0), sect)
	ldr.SetSymSect(ldr.LookupOrCreateSym("runtime.end", 0), sect)
	// Coverage instrumentation counters for libfuzzer.
	if len(state.data2[sym.SLIBFUZZER_EXTRA_COUNTER]) > 0 {
		state.allocateNamedSectionAndAssignSyms2(&Segdata, "__libfuzzer_extra_counters", sym.SLIBFUZZER_EXTRA_COUNTER, sym.Sxxx, 06)
	}

	if len(state.data2[sym.STLSBSS]) > 0 {
		var sect *sym.Section
		// FIXME: not clear why it is sometimes necessary to suppress .tbss section creation.
		if (ctxt.IsELF || ctxt.HeadType == objabi.Haix) && (ctxt.LinkMode == LinkExternal || !*FlagD) {
			sect = addsection(ldr, ctxt.Arch, &Segdata, ".tbss", 06)
			sect.Align = int32(ctxt.Arch.PtrSize)
			// FIXME: why does this need to be set to zero?
			sect.Vaddr = 0
		}
		state.datsize = 0

		for _, s := range state.data2[sym.STLSBSS] {
			state.datsize = aligndatsize2(state, state.datsize, s)
			if sect != nil {
				ldr.SetSymSect(s, sect)
			}
			ldr.SetSymValue(s, state.datsize)
			state.datsize += ldr.SymSize(s)
		}
		state.checkdatsize(sym.STLSBSS)

		if sect != nil {
			sect.Length = uint64(state.datsize)
		}
	}
	/*
	 * We finished data, begin read-only data.
	 * Not all systems support a separate read-only non-executable data section.
	 * ELF and Windows PE systems do.
	 * OS X and Plan 9 do not.
	 * And if we're using external linking mode, the point is moot,
	 * since it's not our decision; that code expects the sections in
	 * segtext.
	 */
	var segro *sym.Segment
	if ctxt.IsELF && ctxt.LinkMode == LinkInternal {
		segro = &Segrodata
	} else if ctxt.HeadType == objabi.Hwindows {
		segro = &Segrodata
	} else {
		segro = &Segtext
	}
2020-04-15 09:42:13 -04:00
state . datsize = 0

	/* read-only executable ELF, Mach-O sections */
	if len(state.data2[sym.STEXT]) != 0 {
		culprit := ldr.SymName(state.data2[sym.STEXT][0])
		Errorf(nil, "dodata found a sym.STEXT symbol: %s", culprit)
	}
	state.allocateSingleSymSections2(&Segtext, sym.SELFRXSECT, sym.SRODATA, 04)

	/* read-only data */
	sect = state.allocateNamedDataSection(segro, ".rodata", sym.ReadOnly, 04)
	ldr.SetSymSect(ldr.LookupOrCreateSym("runtime.rodata", 0), sect)
	ldr.SetSymSect(ldr.LookupOrCreateSym("runtime.erodata", 0), sect)
	if !ctxt.UseRelro() {
		ldr.SetSymSect(ldr.LookupOrCreateSym("runtime.types", 0), sect)
		ldr.SetSymSect(ldr.LookupOrCreateSym("runtime.etypes", 0), sect)
	}

	for _, symn := range sym.ReadOnly {
		symnStartValue := state.datsize
		state.assignToSection2(sect, symn, sym.SRODATA)
		if ctxt.HeadType == objabi.Haix {
			// Read-only symbols might be wrapped inside their outer
			// symbol.
			// XCOFF symbol table needs to know the size of
			// these outer symbols.
			xcoffUpdateOuterSize2(ctxt, state.datsize-symnStartValue, symn)
		}
	}

	/* read-only ELF, Mach-O sections */
	state.allocateSingleSymSections2(segro, sym.SELFROSECT, sym.SRODATA, 04)
	state.allocateSingleSymSections2(segro, sym.SMACHOPLT, sym.SRODATA, 04)

	// There is some data that is conceptually read-only but is written to by
	// relocations. On GNU systems, we can arrange for the dynamic linker to
	// mprotect sections after relocations are applied by giving them write
	// permissions in the object file and calling them ".data.rel.ro.FOO". We
	// divide the .rodata section between actual .rodata and .data.rel.ro.rodata,
	// but for the other sections that this applies to, we just write a read-only
	// .FOO section or a read-write .data.rel.ro.FOO section depending on the
	// situation.
	// TODO(mwhudson): It would make sense to do this more widely, but it makes
	// the system linker segfault on darwin.
	const relroPerm = 06
	const fallbackPerm = 04
	relroSecPerm := fallbackPerm
	genrelrosecname := func(suffix string) string {
		return suffix
	}
	seg := segro

	if ctxt.UseRelro() {
		segrelro := &Segrelrodata
		if ctxt.LinkMode == LinkExternal && ctxt.HeadType != objabi.Haix {
			// Using a separate segment with an external
			// linker results in some programs moving
			// their data sections unexpectedly, which
			// corrupts the moduledata. So we use the
			// rodata segment and let the external linker
			// sort out a rel.ro segment.
			segrelro = segro
		} else {
			// Reset datsize for new segment.
			state.datsize = 0
		}

		genrelrosecname = func(suffix string) string {
			return ".data.rel.ro" + suffix
		}

		relroReadOnly := []sym.SymKind{}
		for _, symnro := range sym.ReadOnly {
			symn := sym.RelROMap[symnro]
			relroReadOnly = append(relroReadOnly, symn)
		}
		seg = segrelro
		relroSecPerm = relroPerm

		/* data only written by relocations */
		sect = state.allocateNamedDataSection(segrelro, genrelrosecname(""), relroReadOnly, relroSecPerm)

		ldr.SetSymSect(ldr.LookupOrCreateSym("runtime.types", 0), sect)
		ldr.SetSymSect(ldr.LookupOrCreateSym("runtime.etypes", 0), sect)

		for i, symnro := range sym.ReadOnly {
			if i == 0 && symnro == sym.STYPE && ctxt.HeadType != objabi.Haix {
				// Skip forward so that no type
				// reference uses a zero offset.
				// This is unlikely but possible in small
				// programs with no other read-only data.
				state.datsize++
			}

			symn := sym.RelROMap[symnro]
			symnStartValue := state.datsize

			for _, s := range state.data2[symn] {
				outer := ldr.OuterSym(s)
				if outer != 0 && ldr.SymSect(outer) != nil && ldr.SymSect(outer) != sect {
					ctxt.Errorf(s, "s.Outer (%s) in different section from s, %s != %s", ldr.SymName(outer), ldr.SymSect(outer).Name, sect.Name)
				}
			}
			state.assignToSection2(sect, symn, sym.SRODATA)
			if ctxt.HeadType == objabi.Haix {
				// Read-only symbols might be wrapped inside their outer
				// symbol.
				// XCOFF symbol table needs to know the size of
				// these outer symbols.
				xcoffUpdateOuterSize2(ctxt, state.datsize-symnStartValue, symn)
			}
		}

		sect.Length = uint64(state.datsize) - sect.Vaddr
	}

	/* typelink */
	sect = state.allocateNamedDataSection(seg, genrelrosecname(".typelink"), []sym.SymKind{sym.STYPELINK}, relroSecPerm)

	typelink := ldr.CreateSymForUpdate("runtime.typelink", 0)
	ldr.SetSymSect(typelink.Sym(), sect)
	typelink.SetType(sym.SRODATA)
	state.datsize += typelink.Size()
	state.checkdatsize(sym.STYPELINK)
	sect.Length = uint64(state.datsize) - sect.Vaddr

	/* itablink */
	sect = state.allocateNamedSectionAndAssignSyms2(seg, genrelrosecname(".itablink"), sym.SITABLINK, sym.Sxxx, relroSecPerm)
	ldr.SetSymSect(ldr.LookupOrCreateSym("runtime.itablink", 0), sect)
	ldr.SetSymSect(ldr.LookupOrCreateSym("runtime.eitablink", 0), sect)
	if ctxt.HeadType == objabi.Haix {
		// Store .itablink size because its symbols are wrapped
		// under an outer symbol: runtime.itablink.
		xcoffUpdateOuterSize2(ctxt, int64(sect.Length), sym.SITABLINK)
	}
	/* gosymtab */
	sect = state.allocateNamedSectionAndAssignSyms2(seg, genrelrosecname(".gosymtab"), sym.SSYMTAB, sym.SRODATA, relroSecPerm)
	ldr.SetSymSect(ldr.LookupOrCreateSym("runtime.symtab", 0), sect)
	ldr.SetSymSect(ldr.LookupOrCreateSym("runtime.esymtab", 0), sect)

	/* gopclntab */
	sect = state.allocateNamedSectionAndAssignSyms2(seg, genrelrosecname(".gopclntab"), sym.SPCLNTAB, sym.SRODATA, relroSecPerm)
	ldr.SetSymSect(ldr.LookupOrCreateSym("runtime.pclntab", 0), sect)
	ldr.SetSymSect(ldr.LookupOrCreateSym("runtime.epclntab", 0), sect)

	// 6g uses 4-byte relocation offsets, so the entire segment must fit in 32 bits.
	if state.datsize != int64(uint32(state.datsize)) {
		Errorf(nil, "read-only data segment too large: %d", state.datsize)
	}

	siz := 0
	for symn := sym.SELFRXSECT; symn < sym.SXREF; symn++ {
		siz += len(state.data2[symn])
	}
	ctxt.datap2 = make([]loader.Sym, 0, siz)
	for symn := sym.SELFRXSECT; symn < sym.SXREF; symn++ {
		ctxt.datap2 = append(ctxt.datap2, state.data2[symn]...)
	}
}

// allocateDwarfSections2 allocates sym.Section objects for DWARF
// symbols, and assigns symbols to sections.
func (state *dodataState) allocateDwarfSections2(ctxt *Link) {

	alignOne := func(state *dodataState, datsize int64, s loader.Sym) int64 { return datsize }

	ldr := ctxt.loader
	for i := 0; i < len(dwarfp2); i++ {
		// First the section symbol.
		s := dwarfp2[i].secSym()
		sect := state.allocateNamedDataSection(&Segdwarf, ldr.SymName(s), []sym.SymKind{}, 04)
		ldr.SetSymSect(s, sect)
		sect.Sym2 = sym.LoaderSym(s)
		curType := ldr.SymType(s)
		state.setSymType(s, sym.SRODATA)
		ldr.SetSymValue(s, int64(uint64(state.datsize)-sect.Vaddr))
		state.datsize += ldr.SymSize(s)

		// Then any sub-symbols for the section symbol.
		subSyms := dwarfp2[i].subSyms()
		state.assignDsymsToSection2(sect, subSyms, sym.SRODATA, alignOne)

		for j := 0; j < len(subSyms); j++ {
			s := subSyms[j]
			if ctxt.HeadType == objabi.Haix && curType == sym.SDWARFLOC {
				// Update the size of .debug_loc for this symbol's
				// package.
				addDwsectCUSize(".debug_loc", ldr.SymPkg(s), uint64(ldr.SymSize(s)))
			}
		}
		sect.Length = uint64(state.datsize) - sect.Vaddr
		state.checkdatsize(curType)
	}
}

type symNameSize struct {
	name string
	sz   int64
	sym  loader.Sym
}

func (state *dodataState) dodataSect2(ctxt *Link, symn sym.SymKind, syms []loader.Sym) (result []loader.Sym, maxAlign int32) {
	var head, tail loader.Sym
	ldr := ctxt.loader
	sl := make([]symNameSize, len(syms))
	for k, s := range syms {
		ss := ldr.SymSize(s)
		sl[k] = symNameSize{name: ldr.SymName(s), sz: ss, sym: s}
		ds := int64(len(ldr.Data(s)))
		switch {
		case ss < ds:
			ctxt.Errorf(s, "initialize bounds (%d < %d)", ss, ds)
		case ss < 0:
			ctxt.Errorf(s, "negative size (%d bytes)", ss)
		case ss > cutoff:
			ctxt.Errorf(s, "symbol too large (%d bytes)", ss)
		}

		// If the usually-special section-marker symbols are being laid
		// out as regular symbols, put them either at the beginning or
		// end of their section.
		if (ctxt.DynlinkingGo() && ctxt.HeadType == objabi.Hdarwin) || (ctxt.HeadType == objabi.Haix && ctxt.LinkMode == LinkExternal) {
			switch ldr.SymName(s) {
			case "runtime.text", "runtime.bss", "runtime.data", "runtime.types", "runtime.rodata":
				head = s
				continue
			case "runtime.etext", "runtime.ebss", "runtime.edata", "runtime.etypes", "runtime.erodata":
				tail = s
				continue
			}
		}
	}

	// For ppc64, we want to interleave the .got and .toc sections
	// from input files. Both are type sym.SELFGOT, so in that case
	// we skip size comparison and fall through to the name
	// comparison (conveniently, .got sorts before .toc).
	checkSize := symn != sym.SELFGOT

	// Perform the sort.
	sort.Slice(sl, func(i, j int) bool {
		si, sj := sl[i].sym, sl[j].sym
		switch {
		case si == head, sj == tail:
			return true
		case sj == head, si == tail:
			return false
		}
		if checkSize {
			isz := sl[i].sz
			jsz := sl[j].sz
			if isz != jsz {
				return isz < jsz
			}
		}
		iname := sl[i].name
		jname := sl[j].name
		if iname != jname {
			return iname < jname
		}
		return si < sj
	})

	// Reap alignment, construct result.
	syms = syms[:0]
	for k := range sl {
		s := sl[k].sym
		if s != head && s != tail {
			align := state.symalign2(s)
			if maxAlign < align {
				maxAlign = align
			}
		}
		syms = append(syms, s)
	}

	return syms, maxAlign
}

// Add buildid to beginning of text segment, on non-ELF systems.
// Non-ELF binary formats are not always flexible enough to
// give us a place to put the Go build ID. On those systems, we put it
// at the very beginning of the text segment.
// This ``header'' is read by cmd/go.
func (ctxt *Link) textbuildid() {
	if ctxt.IsELF || ctxt.BuildMode == BuildModePlugin || *flagBuildid == "" {
		return
	}

	ldr := ctxt.loader
	s := ldr.CreateSymForUpdate("go.buildid", 0)
	s.SetReachable(true)
	// The \xff is invalid UTF-8, meant to make it less likely
	// to find one of these accidentally.
	data := "\xff Go build ID: " + strconv.Quote(*flagBuildid) + "\n \xff"
	s.SetType(sym.STEXT)
	s.SetData([]byte(data))
	s.SetSize(int64(len(data)))

	ctxt.Textp2 = append(ctxt.Textp2, 0)
	copy(ctxt.Textp2[1:], ctxt.Textp2)
	ctxt.Textp2[0] = s.Sym()
}

func (ctxt *Link) buildinfo() {
	if ctxt.linkShared || ctxt.BuildMode == BuildModePlugin {
		// -linkshared and -buildmode=plugin get confused
		// about the relocations in go.buildinfo
		// pointing at the other data sections.
		// The version information is only available in executables.
		return
	}

	ldr := ctxt.loader
	s := ldr.CreateSymForUpdate(".go.buildinfo", 0)
	s.SetReachable(true)
	s.SetType(sym.SBUILDINFO)
	s.SetAlign(16)
	// The \xff is invalid UTF-8, meant to make it less likely
	// to find one of these accidentally.
	const prefix = "\xff Go buildinf:" // 14 bytes, plus 2 data bytes filled in below
	data := make([]byte, 32)
	copy(data, prefix)
	data[len(prefix)] = byte(ctxt.Arch.PtrSize)
	data[len(prefix)+1] = 0
	if ctxt.Arch.ByteOrder == binary.BigEndian {
		data[len(prefix)+1] = 1
	}
	s.SetData(data)
	s.SetSize(int64(len(data)))
	r, _ := s.AddRel(objabi.R_ADDR)
	r.SetOff(16)
	r.SetSiz(uint8(ctxt.Arch.PtrSize))
	r.SetSym(ldr.LookupOrCreateSym("runtime.buildVersion", 0))
	r, _ = s.AddRel(objabi.R_ADDR)
	r.SetOff(16 + int32(ctxt.Arch.PtrSize))
	r.SetSiz(uint8(ctxt.Arch.PtrSize))
	r.SetSym(ldr.LookupOrCreateSym("runtime.modinfo", 0))
}

// assign addresses to text
func (ctxt *Link) textaddress() {
	addsection(ctxt.loader, ctxt.Arch, &Segtext, ".text", 05)

	// Assign PCs in text segment.
	// Could parallelize, by assigning to text
	// and then letting threads copy down, but probably not worth it.
	sect := Segtext.Sections[0]

	sect.Align = int32(Funcalign)

	ldr := ctxt.loader
	text := ldr.LookupOrCreateSym("runtime.text", 0)
	ldr.SetAttrReachable(text, true)
	ldr.SetSymSect(text, sect)
	if ctxt.IsAIX() && ctxt.IsExternal() {
		// Making runtime.text a real symbol prevents ld from
		// changing its base address, which would result in wrong
		// offsets for reflect methods.
		u := ldr.MakeSymbolUpdater(text)
		u.SetAlign(sect.Align)
		u.SetSize(8)
	}

	if (ctxt.DynlinkingGo() && ctxt.IsDarwin()) || (ctxt.IsAIX() && ctxt.IsExternal()) {
		etext := ldr.LookupOrCreateSym("runtime.etext", 0)
		ldr.SetSymSect(etext, sect)

		ctxt.Textp2 = append(ctxt.Textp2, etext, 0)
		copy(ctxt.Textp2[1:], ctxt.Textp2)
		ctxt.Textp2[0] = text
	}

	va := uint64(*FlagTextAddr)
	n := 1
	sect.Vaddr = va
	ntramps := 0
	for _, s := range ctxt.Textp2 {
		sect, n, va = assignAddress(ctxt, sect, n, s, va, false)

		trampoline(ctxt, s) // resolve jumps, may add trampolines if jump too far

		// lay down trampolines after each function
		for ; ntramps < len(ctxt.tramps); ntramps++ {
			tramp := ctxt.tramps[ntramps]
			if ctxt.IsAIX() && strings.HasPrefix(ldr.SymName(tramp), "runtime.text.") {
				// Already set in assignAddress
				continue
			}
			sect, n, va = assignAddress(ctxt, sect, n, tramp, va, true)
		}
	}
	sect.Length = va - sect.Vaddr

	etext := ldr.LookupOrCreateSym("runtime.etext", 0)
	ldr.SetAttrReachable(etext, true)
	ldr.SetSymSect(etext, sect)

	// merge tramps into Textp2, keeping Textp2 in address order
	if ntramps != 0 {
		newtextp := make([]loader.Sym, 0, len(ctxt.Textp2)+ntramps)
		i := 0
		for _, s := range ctxt.Textp2 {
			for ; i < ntramps && ldr.SymValue(ctxt.tramps[i]) < ldr.SymValue(s); i++ {
				newtextp = append(newtextp, ctxt.tramps[i])
			}
			newtextp = append(newtextp, s)
		}
		newtextp = append(newtextp, ctxt.tramps[i:ntramps]...)

		ctxt.Textp2 = newtextp
	}
}

// assignAddress assigns an address to a text symbol, and returns the
// (possibly new) section, its number, and the address.
func assignAddress(ctxt *Link, sect *sym.Section, n int, s loader.Sym, va uint64, isTramp bool) (*sym.Section, int, uint64) {
	ldr := ctxt.loader
	if thearch.AssignAddress != nil {
		return thearch.AssignAddress(ldr, sect, n, s, va, isTramp)
	}

	ldr.SetSymSect(s, sect)
	if ldr.AttrSubSymbol(s) {
		return sect, n, va
	}

	align := ldr.SymAlign(s)
	if align == 0 {
		align = int32(Funcalign)
	}
	va = uint64(Rnd(int64(va), int64(align)))
	if sect.Align < align {
		sect.Align = align
	}

	funcsize := uint64(MINFUNC) // spacing required for findfunctab
	if ldr.SymSize(s) > MINFUNC {
		funcsize = uint64(ldr.SymSize(s))
	}

	// On ppc64x a text section should not be larger than 2^26 bytes due to the size of
	// the call target offset field in the bl instruction. Splitting into text
	// sections smaller than this limit allows the GNU linker to modify the long calls
	// appropriately. The limit allows for the space needed for tables inserted by the linker.

	// If this function doesn't fit in the current text section, then create a new one.

	// Only break at outermost syms.

	if ctxt.Arch.InFamily(sys.PPC64) && ldr.OuterSym(s) == 0 && ctxt.IsExternal() && va-sect.Vaddr+funcsize+maxSizeTrampolinesPPC64(ldr, s, isTramp) > 0x1c00000 {
		// Set the length for the previous text section
		sect.Length = va - sect.Vaddr

		// Create new section, set the starting Vaddr
		sect = addsection(ctxt.loader, ctxt.Arch, &Segtext, ".text", 05)
		sect.Vaddr = va
		ldr.SetSymSect(s, sect)

		// Create a symbol for the start of the secondary text sections
		ntext := ldr.CreateSymForUpdate(fmt.Sprintf("runtime.text.%d", n), 0)
		ntext.SetReachable(true)
		ntext.SetSect(sect)
		if ctxt.IsAIX() {
			// runtime.text.X must be a real symbol on AIX.
			// Assign its address directly in order to be the
			// first symbol of this new section.
			ntext.SetType(sym.STEXT)
			ntext.SetSize(int64(MINFUNC))
			ntext.SetOnList(true)
			ctxt.tramps = append(ctxt.tramps, ntext.Sym())

			ntext.SetValue(int64(va))
			va += uint64(ntext.Size())

			if align := ldr.SymAlign(s); align != 0 {
				va = uint64(Rnd(int64(va), int64(align)))
			} else {
				va = uint64(Rnd(int64(va), int64(Funcalign)))
			}
		}
		n++
	}

	ldr.SetSymValue(s, 0)
	for sub := s; sub != 0; sub = ldr.SubSym(sub) {
		ldr.SetSymValue(sub, ldr.SymValue(sub)+int64(va))
	}

	va += funcsize

	return sect, n, va
}

// address assigns virtual addresses to all segments and sections and
// returns all segments in file order.
func (ctxt *Link) address() []*sym.Segment {
	var order []*sym.Segment // Layout order

	va := uint64(*FlagTextAddr)
	order = append(order, &Segtext)
	Segtext.Rwx = 05
	Segtext.Vaddr = va
	for _, s := range Segtext.Sections {
		va = uint64(Rnd(int64(va), int64(s.Align)))
		s.Vaddr = va
		va += s.Length
	}

	Segtext.Length = va - uint64(*FlagTextAddr)

	if len(Segrodata.Sections) > 0 {
		// align to page boundary so as not to mix
		// rodata and executable text.
		//
		// Note: gold or GNU ld will reduce the size of the executable
		// file by arranging for the relro segment to end at a page
		// boundary, and overlap the end of the text segment with the
		// start of the relro segment in the file. The PT_LOAD segments
		// will be such that the last page of the text segment will be
		// mapped twice, once r-x and once starting out rw- and, after
		// relocation processing, changed to r--.
		//
		// Ideally the last page of the text segment would not be
		// writable even for this short period.
		va = uint64(Rnd(int64(va), int64(*FlagRound)))

		order = append(order, &Segrodata)
		Segrodata.Rwx = 04
		Segrodata.Vaddr = va
		for _, s := range Segrodata.Sections {
			va = uint64(Rnd(int64(va), int64(s.Align)))
			s.Vaddr = va
			va += s.Length
		}
		Segrodata.Length = va - Segrodata.Vaddr
	}

	if len(Segrelrodata.Sections) > 0 {
		// align to page boundary so as not to mix
		// rodata, rel-ro data, and executable text.
		va = uint64(Rnd(int64(va), int64(*FlagRound)))
		if ctxt.HeadType == objabi.Haix {
			// Relro data are inside data segment on AIX.
			va += uint64(XCOFFDATABASE) - uint64(XCOFFTEXTBASE)
		}

		order = append(order, &Segrelrodata)
		Segrelrodata.Rwx = 06
		Segrelrodata.Vaddr = va
		for _, s := range Segrelrodata.Sections {
			va = uint64(Rnd(int64(va), int64(s.Align)))
			s.Vaddr = va
			va += s.Length
		}
		Segrelrodata.Length = va - Segrelrodata.Vaddr
	}

	va = uint64(Rnd(int64(va), int64(*FlagRound)))
	if ctxt.HeadType == objabi.Haix && len(Segrelrodata.Sections) == 0 {
		// Data sections are moved to an unreachable segment
		// to ensure that they are position-independent.
		// Already done if relro sections exist.
		va += uint64(XCOFFDATABASE) - uint64(XCOFFTEXTBASE)
	}
	order = append(order, &Segdata)
	Segdata.Rwx = 06
	Segdata.Vaddr = va
	var data *sym.Section
	var noptr *sym.Section
	var bss *sym.Section
	var noptrbss *sym.Section
	for i, s := range Segdata.Sections {
		if (ctxt.IsELF || ctxt.HeadType == objabi.Haix) && s.Name == ".tbss" {
			continue
		}
		vlen := int64(s.Length)
		if i+1 < len(Segdata.Sections) && !((ctxt.IsELF || ctxt.HeadType == objabi.Haix) && Segdata.Sections[i+1].Name == ".tbss") {
			vlen = int64(Segdata.Sections[i+1].Vaddr - s.Vaddr)
		}
		s.Vaddr = va
		va += uint64(vlen)
		Segdata.Length = va - Segdata.Vaddr
		if s.Name == ".data" {
			data = s
		}
		if s.Name == ".noptrdata" {
			noptr = s
		}
		if s.Name == ".bss" {
			bss = s
		}
		if s.Name == ".noptrbss" {
			noptrbss = s
		}
	}

	// Assign Segdata's Filelen omitting the BSS. We do this here
	// simply because right now we know where the BSS starts.
	Segdata.Filelen = bss.Vaddr - Segdata.Vaddr

	va = uint64(Rnd(int64(va), int64(*FlagRound)))
	order = append(order, &Segdwarf)
	Segdwarf.Rwx = 06
	Segdwarf.Vaddr = va
	for i, s := range Segdwarf.Sections {
		vlen := int64(s.Length)
		if i+1 < len(Segdwarf.Sections) {
			vlen = int64(Segdwarf.Sections[i+1].Vaddr - s.Vaddr)
		}
		s.Vaddr = va
		va += uint64(vlen)
		if ctxt.HeadType == objabi.Hwindows {
			va = uint64(Rnd(int64(va), PEFILEALIGN))
		}
		Segdwarf.Length = va - Segdwarf.Vaddr
	}

	ldr := ctxt.loader
	var (
		text     = Segtext.Sections[0]
		rodata   = ldr.SymSect(ldr.LookupOrCreateSym("runtime.rodata", 0))
		itablink = ldr.SymSect(ldr.LookupOrCreateSym("runtime.itablink", 0))
		symtab   = ldr.SymSect(ldr.LookupOrCreateSym("runtime.symtab", 0))
		pclntab  = ldr.SymSect(ldr.LookupOrCreateSym("runtime.pclntab", 0))
		types    = ldr.SymSect(ldr.LookupOrCreateSym("runtime.types", 0))
	)

	lasttext := text
	// Could be multiple .text sections
	for _, sect := range Segtext.Sections {
		if sect.Name == ".text" {
			lasttext = sect
		}
	}

	for _, s := range ctxt.datap2 {
		if sect := ldr.SymSect(s); sect != nil {
			ldr.AddToSymValue(s, int64(sect.Vaddr))
		}
		v := ldr.SymValue(s)
		for sub := ldr.SubSym(s); sub != 0; sub = ldr.SubSym(sub) {
			ldr.AddToSymValue(sub, v)
		}
	}

	for _, si := range dwarfp2 {
		for _, s := range si.syms {
			if sect := ldr.SymSect(s); sect != nil {
				ldr.AddToSymValue(s, int64(sect.Vaddr))
			}
			sub := ldr.SubSym(s)
			if sub != 0 {
				panic(fmt.Sprintf("unexpected sub-sym for %s %s", ldr.SymName(s), ldr.SymType(s).String()))
			}
			v := ldr.SymValue(s)
			for ; sub != 0; sub = ldr.SubSym(sub) {
				ldr.AddToSymValue(s, v)
			}
		}
	}

	if ctxt.BuildMode == BuildModeShared {
		s := ldr.LookupOrCreateSym("go.link.abihashbytes", 0)
		sect := ldr.SymSect(ldr.LookupOrCreateSym(".note.go.abihash", 0))
		ldr.SetSymSect(s, sect)
		ldr.SetSymValue(s, int64(sect.Vaddr+16))
	}

	ctxt.xdefine2("runtime.text", sym.STEXT, int64(text.Vaddr))
	ctxt.xdefine2("runtime.etext", sym.STEXT, int64(lasttext.Vaddr+lasttext.Length))

	// If there are multiple text sections, create runtime.text.n for
	// their section Vaddr, using n for index
	n := 1
	for _, sect := range Segtext.Sections[1:] {
		if sect.Name != ".text" {
			break
		}
		symname := fmt.Sprintf("runtime.text.%d", n)
		if ctxt.HeadType != objabi.Haix || ctxt.LinkMode != LinkExternal {
			// Addresses are already set on AIX with external linker
			// because these symbols are part of their sections.
			ctxt.xdefine2(symname, sym.STEXT, int64(sect.Vaddr))
		}
		n++
	}

	ctxt.xdefine2("runtime.rodata", sym.SRODATA, int64(rodata.Vaddr))
	ctxt.xdefine2("runtime.erodata", sym.SRODATA, int64(rodata.Vaddr+rodata.Length))
	ctxt.xdefine2("runtime.types", sym.SRODATA, int64(types.Vaddr))
	ctxt.xdefine2("runtime.etypes", sym.SRODATA, int64(types.Vaddr+types.Length))
	ctxt.xdefine2("runtime.itablink", sym.SRODATA, int64(itablink.Vaddr))
	ctxt.xdefine2("runtime.eitablink", sym.SRODATA, int64(itablink.Vaddr+itablink.Length))

	s := ldr.Lookup("runtime.gcdata", 0)
	ldr.SetAttrLocal(s, true)
	ctxt.xdefine2("runtime.egcdata", sym.SRODATA, ldr.SymAddr(s)+ldr.SymSize(s))
	ldr.SetSymSect(ldr.LookupOrCreateSym("runtime.egcdata", 0), ldr.SymSect(s))

	s = ldr.LookupOrCreateSym("runtime.gcbss", 0)
	ldr.SetAttrLocal(s, true)
	ctxt.xdefine2("runtime.egcbss", sym.SRODATA, ldr.SymAddr(s)+ldr.SymSize(s))
	ldr.SetSymSect(ldr.LookupOrCreateSym("runtime.egcbss", 0), ldr.SymSect(s))

	ctxt.xdefine2("runtime.symtab", sym.SRODATA, int64(symtab.Vaddr))
	ctxt.xdefine2("runtime.esymtab", sym.SRODATA, int64(symtab.Vaddr+symtab.Length))
	ctxt.xdefine2("runtime.pclntab", sym.SRODATA, int64(pclntab.Vaddr))
	ctxt.xdefine2("runtime.epclntab", sym.SRODATA, int64(pclntab.Vaddr+pclntab.Length))
	ctxt.xdefine2("runtime.noptrdata", sym.SNOPTRDATA, int64(noptr.Vaddr))
	ctxt.xdefine2("runtime.enoptrdata", sym.SNOPTRDATA, int64(noptr.Vaddr+noptr.Length))
	ctxt.xdefine2("runtime.bss", sym.SBSS, int64(bss.Vaddr))
	ctxt.xdefine2("runtime.ebss", sym.SBSS, int64(bss.Vaddr+bss.Length))
	ctxt.xdefine2("runtime.data", sym.SDATA, int64(data.Vaddr))
	ctxt.xdefine2("runtime.edata", sym.SDATA, int64(data.Vaddr+data.Length))
	ctxt.xdefine2("runtime.noptrbss", sym.SNOPTRBSS, int64(noptrbss.Vaddr))
	ctxt.xdefine2("runtime.enoptrbss", sym.SNOPTRBSS, int64(noptrbss.Vaddr+noptrbss.Length))
	ctxt.xdefine2("runtime.end", sym.SBSS, int64(Segdata.Vaddr+Segdata.Length))

	if ctxt.IsSolaris() {
		// On Solaris, the runtime sets the external names of the
		// end symbols. Unset them and define separate symbols, so
		// we keep both.
		etext := ldr.Lookup("runtime.etext", 0)
		edata := ldr.Lookup("runtime.edata", 0)
		end := ldr.Lookup("runtime.end", 0)
		ldr.SetSymExtname(etext, "runtime.etext")
		ldr.SetSymExtname(edata, "runtime.edata")
		ldr.SetSymExtname(end, "runtime.end")
		ctxt.xdefine2("_etext", ldr.SymType(etext), ldr.SymValue(etext))
		ctxt.xdefine2("_edata", ldr.SymType(edata), ldr.SymValue(edata))
		ctxt.xdefine2("_end", ldr.SymType(end), ldr.SymValue(end))
		ldr.SetSymSect(ldr.Lookup("_etext", 0), ldr.SymSect(etext))
		ldr.SetSymSect(ldr.Lookup("_edata", 0), ldr.SymSect(edata))
		ldr.SetSymSect(ldr.Lookup("_end", 0), ldr.SymSect(end))
	}

	return order
}

// layout assigns file offsets and lengths to the segments in order.
// Returns the file size containing all the segments.
func (ctxt *Link) layout(order []*sym.Segment) uint64 {
	var prev *sym.Segment
	for _, seg := range order {
		if prev == nil {
			seg.Fileoff = uint64(HEADR)
		} else {
			switch ctxt.HeadType {
			default:
				// Assuming the previous segment was
				// aligned, the following rounding
				// should ensure that this segment's
				// VA ≡ Fileoff mod FlagRound.
				seg.Fileoff = uint64(Rnd(int64(prev.Fileoff+prev.Filelen), int64(*FlagRound)))
				if seg.Vaddr%uint64(*FlagRound) != seg.Fileoff%uint64(*FlagRound) {
					Exitf("bad segment rounding (Vaddr=%#x Fileoff=%#x FlagRound=%#x)", seg.Vaddr, seg.Fileoff, *FlagRound)
				}
			case objabi.Hwindows:
				seg.Fileoff = prev.Fileoff + uint64(Rnd(int64(prev.Filelen), PEFILEALIGN))
			case objabi.Hplan9:
				seg.Fileoff = prev.Fileoff + prev.Filelen
			}
		}
		if seg != &Segdata {
			// Link.address already set Segdata.Filelen to
			// account for BSS.
			seg.Filelen = seg.Length
		}
		prev = seg
	}
	return prev.Fileoff + prev.Filelen
}

// AddTramp adds a trampoline with symbol s (to be laid down after
// the current function).
func (ctxt *Link) AddTramp(s *loader.SymbolBuilder) {
	s.SetType(sym.STEXT)
	s.SetReachable(true)
	s.SetOnList(true)
	ctxt.tramps = append(ctxt.tramps, s.Sym())
	if *FlagDebugTramp > 0 && ctxt.Debugvlog > 0 {
		ctxt.Logf("trampoline %s inserted\n", s.Name())
	}
}

// compressSyms compresses syms and returns the contents of the
// compressed section. If the section would get larger, it returns nil.
func compressSyms(ctxt *Link, syms []loader.Sym) []byte {
	ldr := ctxt.loader
	var total int64
	for _, sym := range syms {
		total += ldr.SymSize(sym)
	}

	var buf bytes.Buffer
	buf.Write([]byte("ZLIB"))
	var sizeBytes [8]byte
	binary.BigEndian.PutUint64(sizeBytes[:], uint64(total))
	buf.Write(sizeBytes[:])

	var relocbuf []byte // temporary buffer for applying relocations

	// Using zlib.BestSpeed achieves very nearly the same
	// compression levels as zlib.DefaultCompression, but takes
	// substantially less time. This is important because DWARF
	// compression can be a significant fraction of link time.
	z, err := zlib.NewWriterLevel(&buf, zlib.BestSpeed)
	if err != nil {
		log.Fatalf("NewWriterLevel failed: %s", err)
	}

	st := ctxt.makeRelocSymState()
	for _, s := range syms {
		// Symbol data may be read-only. Apply relocations in a
		// temporary buffer, and immediately write it out.
		P := ldr.Data(s)
		relocs := ldr.Relocs(s)
		if relocs.Count() != 0 {
			relocbuf = append(relocbuf[:0], P...)
			P = relocbuf
		}
		st.relocsym(s, P)
		if _, err := z.Write(P); err != nil {
			log.Fatalf("compression failed: %s", err)
		}
		for i := ldr.SymSize(s) - int64(len(P)); i > 0; {
			b := zeros[:]
			if i < int64(len(b)) {
				b = b[:i]
			}
			n, err := z.Write(b)
			if err != nil {
				log.Fatalf("compression failed: %s", err)
			}
			i -= int64(n)
		}
	}
	if err := z.Close(); err != nil {
		log.Fatalf("compression failed: %s", err)
	}
	if int64(buf.Len()) >= total {
		// Compression didn't save any space.
		return nil
	}
	return buf.Bytes()
}