Stowage/go - Remotebranch.eu

mirror of https://github.com/golang/go.git synced 2025-12-08 06:10:04 +00:00

Author	SHA1	Message	Date
Cherry Zhang	c89251204e	[dev.link] cmd: delete old object support We are not going to merge to master until Go 1.16 cycle. The old object support can go now. Change-Id: I93e6f584974c7749d0a0c2e7a96def35134dc566 Reviewed-on: https://go-review.googlesource.com/c/go/+/231918 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Than McIntosh <thanm@google.com>	2020-05-04 17:00:27 +00:00
Cherry Zhang	d0754cfe4a	cmd: merge branch 'dev.link' into master In the dev.link branch we continued developing the new object file format support and the linker improvements described in https://golang.org/s/better-linker . Since the last merge, more progress has been made to improve the new linker. This is a clean merge. Change-Id: Ide5ad6fcec9cede99e9b21c4548929b4ba1f4185	2020-04-30 17:08:35 -04:00
Cherry Zhang	1419445926	[dev.link] all: merge branch 'master' into dev.link Clean merge. Change-Id: I9a30645ca0ceb52e45bc6b301f9f15f2f42998e8	2020-04-30 12:32:09 -04:00
Austin Clements	9d812cfa5c	cmd/compile,runtime: stack maps only at calls, remove register maps

Currently, we emit stack maps and register maps at almost every instruction. This was originally intended to support non-cooperative preemption, but was only ever used for debug call injection. Now debug call injection also uses conservative frame scanning. As a result, stack maps are only needed at call sites and register maps aren't needed at all except that we happen to also encode unsafe-point information in the register map PCDATA stream.

This CL reduces stack maps to only appear at calls, and replaces full register maps with just safe/unsafe-point information.

This is all protected by the go115ReduceLiveness feature flag, which is defined in both runtime and cmd/compile.

This CL significantly reduces binary sizes and also speeds up compiles and links:

name                      old exe-bytes  new exe-bytes  delta
BinGoSize                 15.0MB ± 0%    14.1MB ± 0%    -5.72%

name                      old pcln-bytes  new pcln-bytes  delta
BinGoSize                 3.14MB ± 0%     2.48MB ± 0%     -21.08%

name                      old time/op  new time/op  delta
Template                  178ms ± 7%   172ms ±14%   -3.59%  (p=0.005 n=19+19)
Unicode                   71.0ms ±12%  69.8ms ±10%  ~       (p=0.126 n=18+18)
GoTypes                   655ms ± 8%   615ms ± 8%   -6.11%  (p=0.000 n=19+19)
Compiler                  3.27s ± 6%   3.15s ± 7%   -3.69%  (p=0.001 n=20+20)
SSA                       7.10s ± 5%   6.85s ± 8%   -3.53%  (p=0.001 n=19+20)
Flate                     124ms ±15%   116ms ±22%   -6.57%  (p=0.024 n=18+19)
GoParser                  156ms ±26%   147ms ±34%   ~       (p=0.070 n=19+19)
Reflect                   406ms ± 9%   387ms ±21%   -4.69%  (p=0.028 n=19+20)
Tar                       163ms ±15%   162ms ±27%   ~       (p=0.370 n=19+19)
XML                       223ms ±13%   218ms ±14%   ~       (p=0.157 n=20+20)
LinkCompiler              503ms ±21%   484ms ±23%   ~       (p=0.072 n=20+20)
ExternalLinkCompiler      1.27s ± 7%   1.22s ± 8%   -3.85%  (p=0.005 n=20+19)
LinkWithoutDebugCompiler  294ms ±17%   273ms ±11%   -7.16%  (p=0.001 n=19+18)

(https://perf.golang.org/search?q=upload:20200428.8)

The binary size improvement is even slightly better when you include the CLs leading up to this. Relative to the parent of "cmd/compile: mark PanicBounds/Extend as calls":

name                      old exe-bytes  new exe-bytes  delta
BinGoSize                 15.0MB ± 0%    14.1MB ± 0%    -6.18%

name                      old pcln-bytes  new pcln-bytes  delta
BinGoSize                 3.22MB ± 0%     2.48MB ± 0%     -22.92%

(https://perf.golang.org/search?q=upload:20200428.9)

For #36365.

Change-Id: I69448e714f2a44430067ca97f6b78e08c0abed27
Reviewed-on: https://go-review.googlesource.com/c/go/+/230544
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>	2020-04-29 21:29:21 +00:00
Than McIntosh	8493b64527	[dev.link] all: merge branch 'master' into dev.link Change-Id: Ied39f4f701a2e64b87262f7cc34108a60b15e08c	2020-04-29 07:49:35 -04:00
Cherry Zhang	e08f10b8b5	[dev.link] cmd/internal/goobj2: add index fingerprint to object file The new object files use indices for symbol references, instead of names. Fundamental to the design, it requires that the importing and imported packages have a consistent view of symbol indices. The Go command should already ensure this, when using "go build". But in case it goes wrong, it could lead to obscure errors like run-time crashes. It would be better to check the index consistency at build time. To do that, we add a fingerprint to each object file, which is a hash of symbol indices. In the object file it records the fingerprints of all imported packages, as well as its own fingerprint. At link time, the linker checks that a package's fingerprint matches the fingerprint recorded in the importing packages, and issues an error if they don't match. This CL does the first part: introducing the fingerprint in the object file, and propagating fingerprints through importing/exporting by the compiler. It is not yet used by the linker. The next CL will do that. Change-Id: I0aa372da652e4afb11f2867cb71689a3e3f9966e Reviewed-on: https://go-review.googlesource.com/c/go/+/229617 Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Than McIntosh <thanm@google.com> Reviewed-by: Jeremy Faller <jeremy@golang.org>	2020-04-24 17:47:14 +00:00
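The fingerprint check described in the commit above can be illustrated with a small, self-contained sketch. This is not the goobj2 implementation: the `Fingerprint` size, the use of SHA-256, and the `object` and `checkFingerprints` names are all assumptions made for illustration. It only shows the idea of hashing a package's symbol indices, recording that hash in importers, and comparing at link time.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Fingerprint is a hypothetical fixed-size digest of a package's symbol indices.
type Fingerprint [8]byte

// indexFingerprint hashes each (index, name) pair in order, so reordering,
// adding, or removing a symbol changes the result. The hash choice is an
// assumption of this sketch.
func indexFingerprint(symbols []string) Fingerprint {
	h := sha256.New()
	for i, name := range symbols {
		fmt.Fprintf(h, "%d:%s\n", i, name)
	}
	var fp Fingerprint
	copy(fp[:], h.Sum(nil))
	return fp
}

// object is a hypothetical stand-in for one compiled package: its own
// fingerprint plus the fingerprints it recorded for each import.
type object struct {
	self    Fingerprint
	imports map[string]Fingerprint
}

// checkFingerprints plays the role of the link-time check: every recorded
// import fingerprint must match the imported package's own fingerprint.
func checkFingerprints(objs map[string]*object) error {
	for path, o := range objs {
		for dep, recorded := range o.imports {
			d, ok := objs[dep]
			if !ok {
				return fmt.Errorf("%s imports %s, which is missing", path, dep)
			}
			if d.self != recorded {
				return fmt.Errorf("%s was built against a different index layout of %s", path, dep)
			}
		}
	}
	return nil
}

func main() {
	fmtSyms := []string{"fmt.Printf", "fmt.Println"}
	objs := map[string]*object{
		"fmt":  {self: indexFingerprint(fmtSyms)},
		"main": {imports: map[string]Fingerprint{"fmt": indexFingerprint(fmtSyms)}},
	}
	fmt.Println("consistent build:", checkFingerprints(objs))

	// Simulate rebuilding fmt with a different symbol layout while main
	// still carries the old fingerprint.
	objs["fmt"].self = indexFingerprint([]string{"fmt.Fprintf", "fmt.Printf", "fmt.Println"})
	fmt.Println("stale importer:", checkFingerprints(objs))
}
```

The point of checking at link time is that a mismatch surfaces as a build error instead of an index silently resolving to the wrong symbol.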
David Chase	f5fcc9b8e0	cmd/internal/obj: add IsAsm flag This allows more exciting changes to compiler-generated assembly language that might not be correct for tricky hand-crafted assembly (e.g., nop padding breaking tables of call or branch instructions). Updates #35881 Change-Id: I842b811796076c160180a364564f2844604df3fb Reviewed-on: https://go-review.googlesource.com/c/go/+/229708 Run-TryBot: David Chase <drchase@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2020-04-24 01:48:48 +00:00
Keith Randall	ea7126fe14	cmd/compile: use a Sym type instead of interface{} for symbolic offsets Will help with strongly typed rewrite rules. Change-Id: Ifbf316a49f4081322b3b8f13bc962713437d9aba Reviewed-on: https://go-review.googlesource.com/c/go/+/227785 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Daniel Martí <mvdan@mvdan.cc>	2020-04-10 16:24:46 +00:00
Cherry Zhang	d92a5a80b5	[dev.link] cmd: support large function alignment This ports CL 226997 to the dev.link branch. - The assembler part and old object file writing are unchanged. - Changes to cmd/link are applied to cmd/oldlink. - Add alignment field to new object files for the new linker. Change-Id: Id00f323ae5bdd86b2709a702ee28bcaa9ba962f8 Reviewed-on: https://go-review.googlesource.com/c/go/+/227025 Reviewed-by: Than McIntosh <thanm@google.com>	2020-04-02 17:24:05 +00:00
Cherry Zhang	53a3b600a4	[dev.link] all: merge branch 'master' into dev.link The only merge conflict is the addition of -spectre flag on master and the addition of -go115newobj flag on dev.link. Resolved trivially. Change-Id: I5b46c2b25e140d6c3d8cb129acbd7a248ff03bb9	2020-03-27 13:39:19 -04:00
Cherry Zhang	330f53b615	[dev.link] cmd/asm, cmd/compile: add back newobj flag Add back the newobj flag, renamed to go115newobj, for feature gating. The flag defaults to true. This essentially reverts CL 206398 as well as CL 220060. The old object format isn't working yet. Will fix in followup CLs. Change-Id: I1ace2a9cbb1a322d2266972670d27bda4e24adbc Reviewed-on: https://go-review.googlesource.com/c/go/+/224623 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Than McIntosh <thanm@google.com>	2020-03-23 14:38:49 +00:00
Russ Cox	fc8a6336d1	cmd/asm, cmd/compile, runtime: add -spectre=ret mode This commit extends the -spectre flag to cmd/asm and adds a new Spectre mitigation mode "ret", which enables the use of retpolines. Retpolines prevent speculation about the target of an indirect jump or call and are described in more detail here: https://support.google.com/faqs/answer/7625886 Change-Id: I4f2cb982fa94e44d91e49bd98974fd125619c93a Reviewed-on: https://go-review.googlesource.com/c/go/+/222661 Reviewed-by: Keith Randall <khr@golang.org>	2020-03-13 19:05:54 +00:00
Jeremy Faller	6a819b0062	[dev.link] cmd/internal: remove unneeded RefIdx field Change-Id: Ic77e67b70b76dc958890e74b77c9691c30eb6ba1 Reviewed-on: https://go-review.googlesource.com/c/go/+/220060 Run-TryBot: Jeremy Faller <jeremy@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2020-02-19 16:57:57 +00:00
Cherry Zhang	181faef82c	[dev.link] cmd/compile, cmd/asm: delete old object file format support There are more cleanups to do, but I want to keep this CL mostly a pure deletion. Change-Id: Icd2ff0a4b648eb4adf3d29386542617e49620818 Reviewed-on: https://go-review.googlesource.com/c/go/+/206398 Run-TryBot: Cherry Zhang <cherryyz@google.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2019-11-11 20:49:27 +00:00
Cherry Zhang	d77b809df9	[dev.link] all: merge branch 'master' into dev.link The only conflict is in cmd/internal/obj/link.go and the resolution is trivial. Change-Id: Ic79b760865a972a0ab68291d06386531d012de86	2019-10-25 13:41:36 -04:00
Dan Scales	be64a19d99	cmd/compile, cmd/link, runtime: make defers low-cost through inline code and extra funcdata

Generate inline code at defer time to save the args of defer calls to unique (autotmp) stack slots, and generate inline code at exit time to check which defer calls were made and make the associated function/method/interface calls. We remember that a particular defer statement was reached by storing in the deferBits variable (always stored on the stack). At exit time, we check the bits of the deferBits variable to determine which defer function calls to make (in reverse order). These low-cost defers are only used for functions where no defers appear in loops. In addition, we don't do these low-cost defers if there are too many defer statements or too many exits in a function (to limit code increase).

When a function uses open-coded defers, we produce extra FUNCDATA_OpenCodedDeferInfo information that specifies the number of defers, and for each defer, the stack slots where the closure and associated args have been stored. The funcdata also includes the location of the deferBits variable. Therefore, for panics, we can use this funcdata to determine exactly which defers are active, and call the appropriate functions/methods/closures with the correct arguments for each active defer.

In order to unwind the stack correctly after a recover(), we need to add an extra code segment to functions with open-coded defers that simply calls deferreturn() and returns. This segment is not reachable by the normal function, but is returned to by the runtime during recovery. We set the liveness information of this deferreturn() to be the same as the liveness at the first function call during the last defer exit code (so all return values and all stack slots needed by the defer calls will be live).

I needed to increase the stackguard constant from 880 to 896, because of a small amount of new code in deferreturn().

The -N flag disables open-coded defers. '-d defer' prints out the kind of defer being used at each defer statement (heap-allocated, stack-allocated, or open-coded).

Cost of defer statement [ go test -run NONE -bench BenchmarkDefer$ runtime ]
  With normal (stack-allocated) defers only: 35.4 ns/op
  With open-coded defers: 5.6 ns/op
  Cost of function call alone (remove defer keyword): 4.4 ns/op

Text size increase (including funcdata) for go binary without/with open-coded defers: 0.09%

The average size increase (including funcdata) for only the functions that use open-coded defers is 1.1%.

The cost of a panic followed by a recover got noticeably slower, since panic processing now requires a scan of the stack for open-coded defer frames. This scan is required, even if no frames are using open-coded defers:

Cost of panic and recover [ go test -run NONE -bench BenchmarkPanicRecover runtime ]
  Without open-coded defers: 62.0 ns/op
  With open-coded defers: 255 ns/op

A CGO Go-to-C-to-Go benchmark got noticeably faster because of open-coded defers:

CGO Go-to-C-to-Go benchmark [cd misc/cgo/test; go test -run NONE -bench BenchmarkCGoCallback ]
  Without open-coded defers: 443 ns/op
  With open-coded defers: 347 ns/op

Updates #14939 (defer performance)
Updates #34481 (design doc)

Change-Id: I63b1a60d1ebf28126f55ee9fd7ecffe9cb23d1ff
Reviewed-on: https://go-review.googlesource.com/c/go/+/202340
Reviewed-by: Austin Clements <austin@google.com>	2019-10-24 13:54:11 +00:00
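To make the deferBits mechanism in the commit above more concrete, here is a hand-written sketch of roughly what a function with two defer statements could look like after open-coding. It is ordinary Go written by hand, not compiler output: the `process` and `record` functions and the `arg0`/`arg1` variables are hypothetical stand-ins for the autotmp stack slots, and the funcdata and panic-path handling described in the message are not modeled.

```go
package main

import "fmt"

// record is the hypothetical deferred function; it appends to a log so the
// call order is observable.
func record(log *[]string, msg string) {
	*log = append(*log, msg)
}

// process is written the way the commit message describes open-coded defers:
// arguments are saved to dedicated slots when a defer statement is reached,
// a bit is set in deferBits, and the exit code checks the bits in reverse.
func process(takeSecond bool, log *[]string) {
	var deferBits uint8   // one bit per defer statement in this function
	var arg0, arg1 string // stand-ins for the per-defer stack slots

	// Equivalent of: defer record(log, "first")
	arg0 = "first"
	deferBits |= 1 << 0

	if takeSecond {
		// Equivalent of: defer record(log, "second")
		arg1 = "second"
		deferBits |= 1 << 1
	}

	*log = append(*log, "body")

	// Exit code: run the defers that were reached, last one first.
	if deferBits&(1<<1) != 0 {
		record(log, arg1)
	}
	if deferBits&(1<<0) != 0 {
		record(log, arg0)
	}
}

func main() {
	var log []string
	process(true, &log)
	fmt.Println(log)

	log = nil
	process(false, &log)
	fmt.Println(log)
}
```

Because each defer statement needs its own bit and its own slots, this shape only works when the number of defers is fixed at compile time, which is consistent with the message's restriction to functions with no defers in loops.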
Cherry Zhang	97e497b253	[dev.link] cmd: reference symbols by name when linking against Go shared library When building a program that links against Go shared libraries, it needs to reference symbols defined in the shared library. At compile time, we don't know where the shared library boundary is. If we reference a symbol in package p by index, and package p is actually part of a shared library, we cannot resolve the index at link time, as the linker doesn't see the object file of p. So when linking against Go shared libraries, always use named reference for now. To do this, the compiler needs to know whether we will be linking against Go shared libraries. The -dynlink flag kind of indicates that (as the document says), but currently it is actually overloaded: it is also used when building a plugin or a shared library, which is self-contained (if -linkshared is not otherwise specified) and could use index for symbol reference. So we introduce another compiler flag, -linkshared, specifically for linking against Go shared libraries. The go command will pass this flag if its -linkshared flag is specified ("go build -linkshared"). There may be a better way to handle this. For example, we can put the symbol indices in a special section in the shared library that the linker can read. Or we can generate some per-package description file to include the indices. (Currently we generate a .shlibname file for each package that is included in a shared library, which contains the path of the library. We could consider extending this.) That said, this CL is a stop-gap solution. And it is no worse than the old object files. If we were to redesign the build system so that the shared library boundary is known at compile time, we could use indices for symbol references that do not cross shared library boundary, as well as doing other things better.
Change-Id: I9c02aad36518051cc4785dbe25c4b4cef8f3faeb Reviewed-on: https://go-review.googlesource.com/c/go/+/201818 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Than McIntosh <thanm@google.com>	2019-10-21 21:57:01 +00:00
Bryan C. Mills	b76e6f8825	Revert "cmd/compile, cmd/link, runtime: make defers low-cost through inline code and extra funcdata" This reverts CL 190098. Reason for revert: broke several builders. Change-Id: I69161352f9ded02537d8815f259c4d391edd9220 Reviewed-on: https://go-review.googlesource.com/c/go/+/201519 Run-TryBot: Bryan C. Mills <bcmills@google.com> Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Dan Scales <danscales@google.com>	2019-10-16 20:59:53 +00:00
Dan Scales	dad616375f	cmd/compile, cmd/link, runtime: make defers low-cost through inline code and extra funcdata

Generate inline code at defer time to save the args of defer calls to unique (autotmp) stack slots, and generate inline code at exit time to check which defer calls were made and make the associated function/method/interface calls. We remember that a particular defer statement was reached by storing in the deferBits variable (always stored on the stack). At exit time, we check the bits of the deferBits variable to determine which defer function calls to make (in reverse order). These low-cost defers are only used for functions where no defers appear in loops. In addition, we don't do these low-cost defers if there are too many defer statements or too many exits in a function (to limit code increase).

When a function uses open-coded defers, we produce extra FUNCDATA_OpenCodedDeferInfo information that specifies the number of defers, and for each defer, the stack slots where the closure and associated args have been stored. The funcdata also includes the location of the deferBits variable. Therefore, for panics, we can use this funcdata to determine exactly which defers are active, and call the appropriate functions/methods/closures with the correct arguments for each active defer.

In order to unwind the stack correctly after a recover(), we need to add an extra code segment to functions with open-coded defers that simply calls deferreturn() and returns. This segment is not reachable by the normal function, but is returned to by the runtime during recovery. We set the liveness information of this deferreturn() to be the same as the liveness at the first function call during the last defer exit code (so all return values and all stack slots needed by the defer calls will be live).

I needed to increase the stackguard constant from 880 to 896, because of a small amount of new code in deferreturn().

The -N flag disables open-coded defers. '-d defer' prints out the kind of defer being used at each defer statement (heap-allocated, stack-allocated, or open-coded).

Cost of defer statement [ go test -run NONE -bench BenchmarkDefer$ runtime ]
  With normal (stack-allocated) defers only: 35.4 ns/op
  With open-coded defers: 5.6 ns/op
  Cost of function call alone (remove defer keyword): 4.4 ns/op

Text size increase (including funcdata) for go cmd without/with open-coded defers: 0.09%

The average size increase (including funcdata) for only the functions that use open-coded defers is 1.1%.

The cost of a panic followed by a recover got noticeably slower, since panic processing now requires a scan of the stack for open-coded defer frames. This scan is required, even if no frames are using open-coded defers:

Cost of panic and recover [ go test -run NONE -bench BenchmarkPanicRecover runtime ]
  Without open-coded defers: 62.0 ns/op
  With open-coded defers: 255 ns/op

A CGO Go-to-C-to-Go benchmark got noticeably faster because of open-coded defers:

CGO Go-to-C-to-Go benchmark [cd misc/cgo/test; go test -run NONE -bench BenchmarkCGoCallback ]
  Without open-coded defers: 443 ns/op
  With open-coded defers: 347 ns/op

Updates #14939 (defer performance)
Updates #34481 (design doc)

Change-Id: I51a389860b9676cfa1b84722f5fb84d3c4ee9e28
Reviewed-on: https://go-review.googlesource.com/c/go/+/190098
Reviewed-by: Austin Clements <austin@google.com>	2019-10-16 18:27:16 +00:00
Cherry Zhang	2c484c0356	[dev.link] cmd/internal/obj: write object file in new format If -newobj is set, write object file in new format, which uses indices for symbol references instead of symbol names. The file format is described at the beginning of cmd/internal/goobj2/objfile.go. A new package, cmd/internal/goobj2, is introduced for reading and writing new object files. (The package name is temporary.) It is written in a way that tries to make the encoding as regular as possible, and the reader and writer as symmetric as possible. This is incomplete, and currently nothing will consume the new object file. Change-Id: Ifefedbf6456d760d15a9f40a28af6486c93100fe Reviewed-on: https://go-review.googlesource.com/c/go/+/196030 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Than McIntosh <thanm@google.com>	2019-10-03 18:49:41 +00:00
Cherry Zhang	53b7c18284	[dev.link] cmd/compile, cmd/asm: assign index to symbols We are planning to use indices for symbol references, instead of symbol names. Here we assign indices to symbols defined in the package being compiled, and propagate the indices to the dependent packages in the export data. A symbol is referenced by a tuple, (package index, symbol index). Normally, for a given symbol, this index is unique, and the symbol index is globally consistent (but with exceptions, see below). The package index is local to a compilation. For example, when compiling the fmt package, fmt.Println gets assigned index 25, then all packages that reference fmt.Println will refer to it as (X, 25) with some X. X is the index for the fmt package, which may differ in different compilations. There are some symbols that do not have clear package affiliation, such as dupOK symbols and linknamed symbols. We cannot give them globally consistent indices. We categorize them as non-package symbols, assign them with package index 1 and a symbol index that is only meaningful locally. Currently nothing will consume the indices. All this is behind a flag, -newobj. The flag needs to be set for all builds (-gcflags=all=-newobj -asmflags=all=-newobj), or none. Change-Id: I18e489c531e9a9fbc668519af92c6116b7308cab Reviewed-on: https://go-review.googlesource.com/c/go/+/196029 Reviewed-by: Than McIntosh <thanm@google.com>	2019-10-02 19:07:17 +00:00
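The (package index, symbol index) scheme in the commit above can be sketched as follows. Everything here is illustrative: the `SymRef` and `compilation` types, the `symIndex` table, and the index given to os.Exit are invented for the example; only the fmt.Println index of 25 and the use of package index 1 for non-package symbols come from the commit message. Leaving package index 0 unused is an assumption of this sketch.

```go
package main

import "fmt"

// SymRef is a symbol reference as described in the message:
// a (package index, symbol index) tuple.
type SymRef struct {
	PkgIdx int // local to one compilation
	SymIdx int // assigned when the defining package is compiled
}

// symIndex is hypothetical export data: for each package, the index assigned
// to each of its symbols. These indices are the same for every importer.
var symIndex = map[string]map[string]int{
	"fmt": {"Println": 25},
	"os":  {"Exit": 3}, // made-up index for illustration
}

// compilation models one compiler invocation with its own package numbering.
type compilation struct {
	pkgs []string // package index -> import path
}

func newCompilation() *compilation {
	// In this sketch index 0 is unused and index 1 stands for
	// non-package symbols, so real packages start at 2.
	return &compilation{pkgs: []string{"", "<non-package>"}}
}

// pkgIdx returns the local index for an import path, assigning the next
// free one the first time the package is referenced.
func (c *compilation) pkgIdx(path string) int {
	for i := 2; i < len(c.pkgs); i++ {
		if c.pkgs[i] == path {
			return i
		}
	}
	c.pkgs = append(c.pkgs, path)
	return len(c.pkgs) - 1
}

// ref builds the tuple used to reference pkg.name from this compilation.
func (c *compilation) ref(pkg, name string) SymRef {
	return SymRef{PkgIdx: c.pkgIdx(pkg), SymIdx: symIndex[pkg][name]}
}

// resolve maps a tuple back to a qualified name using this compilation's
// package table, or returns "" if nothing matches.
func (c *compilation) resolve(r SymRef) string {
	pkg := c.pkgs[r.PkgIdx]
	for name, idx := range symIndex[pkg] {
		if idx == r.SymIdx {
			return pkg + "." + name
		}
	}
	return ""
}

func main() {
	a := newCompilation()
	a.pkgIdx("os") // a references os first, so fmt gets a later package index
	ra := a.ref("fmt", "Println")

	b := newCompilation()
	rb := b.ref("fmt", "Println")

	fmt.Println(ra, a.resolve(ra))
	fmt.Println(rb, b.resolve(rb))
}
```

The two compilations disagree on the package index for fmt but agree on the symbol index, which is why each reference can only be resolved against the package table of the compilation that produced it.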
Than McIntosh	cdd59205c4	cmd/compile: don't emit autom's into object file Don't write Autom records when writing a function to the object file; we no longer need them in the linker for DWARF processing. So as to keep the object file format unchanged, write out a zero-length list of automs to the object, as opposed to removing all references. Updates #34554. Change-Id: I42a1d67207ea7114ae4f3a315cf37effba57f190 Reviewed-on: https://go-review.googlesource.com/c/go/+/197499 Reviewed-by: Jeremy Faller <jeremy@golang.org>	2019-09-27 13:58:59 +00:00
Than McIntosh	0b486d2a87	cmd/compile: add R_USETYPE relocs to func syms for autom types During DWARF processing, keep track of the go type symbols for types directly or indirectly referenced by auto variables in a function, and add a set of dummy R_USETYPE relocations to the function's DWARF subprogram DIE symbol. This change is not useful on its own, but is part of a series of changes intended to clean up handling of autom's in the compiler and linker. Updates #34554. Change-Id: I974afa9b7092aa5dba808f74e00aa931249d6fe9 Reviewed-on: https://go-review.googlesource.com/c/go/+/197497 Run-TryBot: Than McIntosh <thanm@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Jeremy Faller <jeremy@golang.org>	2019-09-27 13:56:32 +00:00
Jeremy Faller	7defbffcda	cmd/compile: remove isStmt symbol from FuncInfo

As promised in CL 188238, removing the obsolete symbol.

Here are the latest stats. This is baselined at "`e53edafb66`" with only these changes applied, run on magna.cam. The linker looks straight better (in memory and speed).

There is still a change I'm working on walking the progs to generate the debug_lines data in the compiler. That will likely result in a compiler speedup.

name                      old time/op  new time/op  delta
Template                  324ms ± 3%   317ms ± 3%   -2.07%  (p=0.043 n=10+10)
Unicode                   142ms ± 4%   144ms ± 3%   ~       (p=0.393 n=10+10)
GoTypes                   1.05s ± 2%   1.07s ± 2%   +1.59%  (p=0.019 n=9+9)
Compiler                  4.09s ± 2%   4.11s ± 1%   ~       (p=0.218 n=10+10)
SSA                       12.5s ± 1%   12.7s ± 1%   +1.00%  (p=0.035 n=10+10)
Flate                     199ms ± 7%   203ms ± 5%   ~       (p=0.481 n=10+10)
GoParser                  245ms ± 3%   246ms ± 5%   ~       (p=0.780 n=9+10)
Reflect                   672ms ± 4%   688ms ± 3%   +2.42%  (p=0.015 n=10+10)
Tar                       280ms ± 4%   284ms ± 4%   ~       (p=0.123 n=10+10)
XML                       379ms ± 4%   381ms ± 2%   ~       (p=0.529 n=10+10)
LinkCompiler              1.16s ± 4%   1.12s ± 2%   -3.03%  (p=0.001 n=10+9)
ExternalLinkCompiler      2.28s ± 3%   2.23s ± 3%   -2.51%  (p=0.011 n=8+9)
LinkWithoutDebugCompiler  686ms ± 9%   667ms ± 2%   ~       (p=0.277 n=9+8)
StdCmd                    14.1s ± 1%   14.0s ± 1%   ~       (p=0.739 n=10+10)

name                      old user-time/op  new user-time/op  delta
Template                  604ms ±23%        564ms ± 7%        ~       (p=0.661 n=10+9)
Unicode                   429ms ±40%        418ms ±37%        ~       (p=0.579 n=10+10)
GoTypes                   2.43s ±12%        2.51s ± 7%        ~       (p=0.393 n=10+10)
Compiler                  9.22s ± 3%        9.27s ± 3%        ~       (p=0.720 n=9+10)
SSA                       26.3s ± 3%        26.6s ± 2%        ~       (p=0.579 n=10+10)
Flate                     328ms ±19%        333ms ±12%        ~       (p=0.842 n=10+9)
GoParser                  387ms ± 5%        378ms ± 9%        ~       (p=0.356 n=9+10)
Reflect                   1.36s ±20%        1.43s ±21%        ~       (p=0.631 n=10+10)
Tar                       469ms ±12%        471ms ±21%        ~       (p=0.497 n=9+10)
XML                       685ms ±18%        698ms ±19%        ~       (p=0.739 n=10+10)
LinkCompiler              1.86s ±10%        1.87s ±11%        ~       (p=0.968 n=10+9)
ExternalLinkCompiler      3.20s ±13%        3.01s ± 8%        -5.70%  (p=0.046 n=8+9)
LinkWithoutDebugCompiler  1.08s ±15%        1.09s ±20%        ~       (p=0.579 n=10+10)

name                      old alloc/op  new alloc/op  delta
Template                  36.3MB ± 0%   36.4MB ± 0%   +0.26%  (p=0.000 n=10+10)
Unicode                   28.5MB ± 0%   28.5MB ± 0%   ~       (p=0.165 n=10+10)
GoTypes                   120MB ± 0%    121MB ± 0%    +0.29%  (p=0.000 n=9+10)
Compiler                  546MB ± 0%    548MB ± 0%    +0.32%  (p=0.000 n=10+10)
SSA                       1.84GB ± 0%   1.85GB ± 0%   +0.49%  (p=0.000 n=10+10)
Flate                     22.9MB ± 0%   23.0MB ± 0%   +0.25%  (p=0.000 n=10+10)
GoParser                  27.8MB ± 0%   27.9MB ± 0%   +0.25%  (p=0.000 n=10+8)
Reflect                   77.5MB ± 0%   77.7MB ± 0%   +0.27%  (p=0.000 n=9+9)
Tar                       34.5MB ± 0%   34.6MB ± 0%   +0.23%  (p=0.000 n=10+10)
XML                       44.2MB ± 0%   44.4MB ± 0%   +0.32%  (p=0.000 n=10+10)
LinkCompiler              239MB ± 0%    230MB ± 0%    -3.86%  (p=0.000 n=10+10)
ExternalLinkCompiler      243MB ± 0%    243MB ± 0%    +0.22%  (p=0.000 n=10+10)
LinkWithoutDebugCompiler  164MB ± 0%    155MB ± 0%    -5.45%  (p=0.000 n=10+10)

name                      old allocs/op  new allocs/op  delta
Template                  371k ± 0%      372k ± 0%      +0.44%   (p=0.000 n=10+10)
Unicode                   340k ± 0%      340k ± 0%      +0.05%   (p=0.000 n=10+10)
GoTypes                   1.32M ± 0%     1.32M ± 0%     +0.46%   (p=0.000 n=10+10)
Compiler                  5.34M ± 0%     5.37M ± 0%     +0.59%   (p=0.000 n=10+10)
SSA                       17.6M ± 0%     17.7M ± 0%     +0.63%   (p=0.000 n=10+10)
Flate                     233k ± 0%      234k ± 0%      +0.48%   (p=0.000 n=10+10)
GoParser                  309k ± 0%      310k ± 0%      +0.40%   (p=0.000 n=10+10)
Reflect                   964k ± 0%      969k ± 0%      +0.54%   (p=0.000 n=10+10)
Tar                       346k ± 0%      348k ± 0%      +0.48%   (p=0.000 n=10+9)
XML                       424k ± 0%      426k ± 0%      +0.51%   (p=0.000 n=10+10)
LinkCompiler              751k ± 0%      645k ± 0%      -14.13%  (p=0.000 n=10+10)
ExternalLinkCompiler      1.79M ± 0%     1.69M ± 0%     -5.30%   (p=0.000 n=10+10)
LinkWithoutDebugCompiler  217k ± 0%      222k ± 0%      +2.02%   (p=0.000 n=10+10)

name                      old object-bytes  new object-bytes  delta
Template                  547kB ± 0%        559kB ± 0%        +2.17%  (p=0.000 n=10+10)
Unicode                   215kB ± 0%        216kB ± 0%        +0.60%  (p=0.000 n=10+10)
GoTypes                   1.99MB ± 0%       2.03MB ± 0%       +2.02%  (p=0.000 n=10+10)
Compiler                  7.86MB ± 0%       8.07MB ± 0%       +2.73%  (p=0.000 n=10+10)
SSA                       26.4MB ± 0%       27.2MB ± 0%       +3.27%  (p=0.000 n=10+10)
Flate                     337kB ± 0%        343kB ± 0%        +2.02%  (p=0.000 n=10+10)
GoParser                  432kB ± 0%        441kB ± 0%        +2.11%  (p=0.000 n=10+10)
Reflect                   1.33MB ± 0%       1.36MB ± 0%       +1.87%  (p=0.000 n=10+10)
Tar                       477kB ± 0%        487kB ± 0%        +2.24%  (p=0.000 n=10+10)
XML                       617kB ± 0%        632kB ± 0%        +2.33%  (p=0.000 n=10+10)

name                      old export-bytes  new export-bytes  delta
Template                  18.5kB ± 0%       18.5kB ± 0%       ~       (all equal)
Unicode                   7.92kB ± 0%       7.92kB ± 0%       ~       (all equal)
GoTypes                   35.0kB ± 0%       35.0kB ± 0%       ~       (all equal)
Compiler                  109kB ± 0%        109kB ± 0%        +0.09%  (p=0.000 n=10+10)
SSA                       137kB ± 0%        137kB ± 0%        +0.03%  (p=0.000 n=10+10)
Flate                     4.89kB ± 0%       4.89kB ± 0%       ~       (all equal)
GoParser                  8.49kB ± 0%       8.49kB ± 0%       ~       (all equal)
Reflect                   11.4kB ± 0%       11.4kB ± 0%       ~       (all equal)
Tar                       10.5kB ± 0%       10.5kB ± 0%       ~       (all equal)
XML                       16.7kB ± 0%       16.7kB ± 0%       ~       (all equal)

name                      old text-bytes  new text-bytes  delta
HelloSize                 760kB ± 0%      760kB ± 0%      ~  (all equal)
CmdGoSize                 10.8MB ± 0%     10.8MB ± 0%     ~  (all equal)

name                      old data-bytes  new data-bytes  delta
HelloSize                 10.7kB ± 0%     10.7kB ± 0%     ~  (all equal)
CmdGoSize                 312kB ± 0%      312kB ± 0%      ~  (all equal)

name                      old bss-bytes  new bss-bytes  delta
HelloSize                 122kB ± 0%     122kB ± 0%     ~  (all equal)
CmdGoSize                 146kB ± 0%     146kB ± 0%     ~  (all equal)

name                      old exe-bytes  new exe-bytes  delta
HelloSize                 1.11MB ± 0%    1.13MB ± 0%    +1.10%  (p=0.000 n=10+10)
CmdGoSize                 14.9MB ± 0%    15.0MB ± 0%    +0.77%  (p=0.000 n=10+10)

Change-Id: I42e6087cd6231dbdcfff5464e46d373474e455e1
Reviewed-on: https://go-review.googlesource.com/c/go/+/192417
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>	2019-09-26 15:22:09 +00:00
Jeremy Faller	376fc48338	cmd/compile: add new symbol for debug line numbers This is broken out from: CL 187117 This new symbol will be populated by the compiler and contain debug line information that's currently generated in the linker. One might say it's sad to create a new symbol, but this symbol will replace the isStmt symbols. Testing: Ran go build -toolexec 'toolstash -cmp' Change-Id: If8f7ae4b43b7247076605b6429b7d03a1fd239c5 Reviewed-on: https://go-review.googlesource.com/c/go/+/188238 Reviewed-by: Austin Clements <austin@google.com>	2019-09-23 19:40:07 +00:00
Joel Sing	7ef890db91	cmd/internal/obj: instructions and registers for RISC-V Start implementing an assembler for RISC-V - this provides register definitions and instruction mnemonics as defined in the RISC-V Instruction Set Manual, along with instruction encoding. The instruction encoding is generated by the parse_opcodes script with the "opcodes" and "opcodes-pseudo" files from (`make inst.go`): https://github.com/riscv/riscv-opcodes This is based on the riscv-go port: https://github.com/riscv/riscv-go Contributors to the riscv-go port are: Amol Bhave <ammubhave@gmail.com> Benjamin Barenblat <bbaren@google.com> Josh Bleecher Snyder <josharian@gmail.com> Michael Pratt <michael@pratt.im> Michael Yenik <myenik@google.com> Ronald G. Minnich <rminnich@gmail.com> Stefan O'Rear <sorear2@gmail.com> This port has been updated to Go 1.13: https://github.com/4a6f656c/riscv-go Updates #27532 Change-Id: I257b6de87e9864df61a2b0ce9be15968c1227b49 Reviewed-on: https://go-review.googlesource.com/c/go/+/193677 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2019-09-07 13:24:59 +00:00
LE Manh Cuong	2d7cb295fd	cmd/compile: clarify the difference between types.Sym and obj.LSym Both types.Sym and obj.LSym have the field Name, and that field is widely used in compiler source. This can cause confusion about which one to use. Add documentation clarifying the difference between them, to eliminate the confusion or at least make the code that uses them clearer for the reader. See https://github.com/golang/go/issues/31252#issuecomment-481929174 Change-Id: I31f7fc6e4de4cf68f67ab2e3a385a7f451c796f5 Reviewed-on: https://go-review.googlesource.com/c/go/+/175019 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2019-05-21 03:03:01 +00:00
David Chase	807761f334	cmd/link: revert/revise CL 98075 because LLDB is very picky now This was originally Revert "cmd/link: fix up debug_range for dsymutil (revert CL 72371)" which has the effect of no longer using Base Address Selection Entries in DWARF. However, the build-time costs of that are about 2%, so instead the hacky fixup that generated technically incorrect DWARF was removed from the linker, and the choice is instead made in the compiler, dependent on platform, but also under control of a flag so that we can report this bug against LLDB/dsymutil/dwarfdump (really, the LLVM dwarf libraries). This however does not solve #31188; debugging still fails, but dwarfdump no longer complains. There are at least two LLDB bugs involved, and this change will at least allow us to report them without them being rejected because our now-obsolete workaround for the first bug creates not-quite-DWARF. Updates #31188. Change-Id: I5300c51ad202147bab7333329ebe961623d2b47d Reviewed-on: https://go-review.googlesource.com/c/go/+/170638 Run-TryBot: David Chase <drchase@google.com> Reviewed-by: Heschi Kreinick <heschi@google.com>	2019-04-23 20:52:23 +00:00
Michael Munday	aafe257390	cmd/link, runtime: mark goexit as the top of the call stack This CL adds a new attribute, TOPFRAME, which can be used to mark functions that should be treated as being at the top of the call stack. The function `runtime.goexit` has been marked this way on architectures that use a link register. This will stop programs that use DWARF to unwind the call stack from unwinding past `runtime.goexit` on architectures that use a link register. For example, it eliminates "corrupt stack?" warnings when generating a backtrace that hits `runtime.goexit` in GDB on s390x. Similar code should be added for non-link-register architectures (i.e. amd64, 386). They mark the top of the call stack slightly differently to link register architectures so I haven't added that code (they need to mark "rip" as undefined). Fixes #24385. Change-Id: I15b4c69ac75b491daa0acf0d981cb80eb06488de Reviewed-on: https://go-review.googlesource.com/c/go/+/169726 Run-TryBot: Michael Munday <mike.munday@ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2019-04-15 13:17:28 +00:00
Austin Clements	a2e79571a9	cmd/compile: separate data and function LSyms Currently, obj.Ctxt's symbol table does not distinguish between ABI0 and ABIInternal symbols. This is almost okay, since a given symbol name in the final object file is only going to belong to one ABI or the other, but it requires that the compiler mark a Sym as being a function symbol before it retrieves its LSym. If it retrieves the LSym first, that LSym will be created as ABI0, and later marking the Sym as a function symbol won't change the LSym's ABI. Marking a Sym as a function symbol before looking up its LSym sounds easy, except Syms have a dual purpose: they are used just as interned strings (every function, variable, parameter, etc with the same textual name shares a Sym), and also to store state for whatever package global has that name. As a result, it's easy to slip up and look up an LSym when a Sym is serving as the name of a local variable, and then later mark it as a function when it's serving as the global with the name. In general, we were careful to avoid this, but #29610 demonstrates one case where we messed up. Because of on-demand importing from indexed export data, it's possible to compile a method wrapper for a type imported from another package before importing an init function from that package. If the argument of the method is named "init", the "init" LSym will be created as a data symbol when compiling the wrapper, before it gets marked as a function symbol. To fix this, we separate obj.Ctxt's symbol tables for ABI0 and ABIInternal symbols. This way, the compiler will simply get a different LSym once the Sym takes on its package-global meaning as a function. This fixes the above ordering issue, and means we no longer need to go out of our way to create the "init" function early and mark it as a function symbol. Fixes #29610. Updates #27539. Change-Id: Id9458b40017893d46ef9e4a3f9b47fc49e1ce8df Reviewed-on: https://go-review.googlesource.com/c/157017 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Robert Griesemer <gri@golang.org>	2019-01-11 00:45:49 +00:00
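The fix described in a2e79571a9 can be pictured with a small model. This is only a sketch under stated assumptions: `LSym`, `Ctxt`, `NewCtxt`, and `LookupABI` here are simplified, hypothetical stand-ins written for illustration, not the actual cmd/internal/obj definitions. It shows why keeping one table per ABI makes lookup order irrelevant: a data lookup of "init" and a later function lookup of "init" yield two distinct symbols.

```go
package main

import "fmt"

// ABI identifies a calling convention; the two names come from the
// commit message, the numeric values are arbitrary.
type ABI int

const (
	ABI0 ABI = iota
	ABIInternal
)

// LSym is a simplified, hypothetical stand-in for obj.LSym.
type LSym struct {
	Name string
	ABI  ABI
}

// Ctxt is a simplified, hypothetical stand-in for obj.Ctxt that keeps
// one symbol table per ABI instead of a single shared table.
type Ctxt struct {
	tables map[ABI]map[string]*LSym
}

// NewCtxt returns a Ctxt with empty tables for both ABIs.
func NewCtxt() *Ctxt {
	return &Ctxt{tables: map[ABI]map[string]*LSym{
		ABI0:        {},
		ABIInternal: {},
	}}
}

// LookupABI returns the symbol called name in the table for abi,
// creating it on first use. Symbols in different tables never alias.
func (c *Ctxt) LookupABI(name string, abi ABI) *LSym {
	tab := c.tables[abi]
	if s, ok := tab[name]; ok {
		return s
	}
	s := &LSym{Name: name, ABI: abi}
	tab[name] = s
	return s
}

func main() {
	c := NewCtxt()
	// A parameter named "init" is looked up first, as data.
	data := c.LookupABI("init", ABI0)
	// Later the package-level init function is looked up.
	fn := c.LookupABI("init", ABIInternal)
	fmt.Println(data != fn, data.ABI == ABI0, fn.ABI == ABIInternal)
}
```

With a single shared table, the second lookup would have returned the symbol created by the first one, which is the ordering problem the commit message describes.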
Keith Randall	69c2c56453	cmd/compile,runtime: redo mid-stack inlining tracebacks Work involved in getting a stack trace is divided between runtime.Callers and runtime.CallersFrames. Before this CL, runtime.Callers returns a pc per runtime frame. runtime.CallersFrames is responsible for expanding a runtime frame into potentially multiple user frames. After this CL, runtime.Callers returns a pc per user frame. runtime.CallersFrames just maps those to user frame info. Entries in the result of runtime.Callers are now pcs of the calls (or of the inline marks), not of the instruction just after the call. Fixes #29007 Fixes #28640 Update #26320 Change-Id: I1c9567596ff73dc73271311005097a9188c3406f Reviewed-on: https://go-review.googlesource.com/c/152537 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2018-12-28 20:55:36 +00:00
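For readers unfamiliar with the two functions named in 69c2c56453, the following is a minimal usage sketch of the public API, not code from the CL. The helper names `callerNames` and `outer` are made up for illustration. The pcs returned by `runtime.Callers` are passed to `runtime.CallersFrames` rather than inspected directly, which is the usage the runtime documentation recommends.

```go
package main

import (
	"fmt"
	"runtime"
)

// callerNames returns the function names on the calling goroutine's
// stack, innermost first, starting with the caller of callerNames.
func callerNames() []string {
	pcs := make([]uintptr, 32)
	// Skip two frames: runtime.Callers itself and callerNames.
	n := runtime.Callers(2, pcs)
	if n == 0 {
		return nil
	}
	// CallersFrames maps the pcs to per-function frame information,
	// including frames for inlined calls.
	frames := runtime.CallersFrames(pcs[:n])

	var names []string
	for {
		frame, more := frames.Next()
		names = append(names, frame.Function)
		if !more {
			break
		}
	}
	return names
}

// outer exists only so that the trace has a known innermost frame.
func outer() []string { return callerNames() }

func main() {
	for _, name := range outer() {
		fmt.Println(name)
	}
}
```

The first name printed is that of `outer`, whether or not the compiler inlines it into `main`.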
Keith Randall	01a1eaa10c	cmd/compile: use innermost line number for -S When functions are inlined, for instructions in the inlined body, does -S print the location of the call, or the location of the body? Right now, we do the former. I'd like to do the latter by default, it makes much more sense when reading disassembly. With mid-stack inlining enabled in more cases, this quandary will come up more often. The original behavior is still available with -S=2. Some tests use this mode (so they can find assembly generated by a particular source line). This helped me with understanding what the compiler was doing while fixing #29007. Change-Id: Id14a3a41e1b18901e7c5e460aa4caf6d940ed064 Reviewed-on: https://go-review.googlesource.com/c/153241 Reviewed-by: David Chase <drchase@google.com>	2018-12-11 20:24:45 +00:00
Clément Chigot	4295ed9bef	cmd: fix symbols addressing for aix/ppc64 This commit changes the code generated for addressing symbols on the AIX operating system. On AIX, every symbol access must be done via another symbol near the TOC, named TOC anchor or TOC entry. This TOC anchor is a pointer to the symbol address. During the Progedit function, when a symbol access is detected, its instructions are modified to create a load on its TOC anchor and retrieve the symbol. Change-Id: I00cf8f49c13004bc99fa8af13d549a709320f797 Reviewed-on: https://go-review.googlesource.com/c/151039 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-11-27 21:06:16 +00:00
Austin Clements	c5718b6b26	cmd/internal/obj, cmd/link: record ABIs and aliases in Go obj files This repurposes the "version" field of a symbol reference in the Go object file format to be an ABI field. Currently, this is just 0 or 1 depending on whether the symbol is static (the linker turns it into a different internal version number), so it's already only tenuously a symbol version. We change this to be -1 for static symbols and otherwise the ABI number. This also adds a separate list of ABI alias symbols to be recorded in the object file. The ABI aliases must be a separate list and not just part of the symbol definitions because it's possible to have a symbol defined in one package and the alias "defined" in a different package. For example, this can happen if a symbol is defined in assembly in one package and stubbed in a different package. The stub triggers the generation of the ABI alias, but in a different package from the definition. For #27539. Change-Id: I015c9fe54690c027de6ef77e22b5585976a01587 Reviewed-on: https://go-review.googlesource.com/c/147157 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2018-11-12 20:46:48 +00:00
Austin Clements	97e4010fd4	cmd/compile: accept and parse symabis This doesn't yet do anything with this information. For #27539. Change-Id: Ia12c905812aa1ed425eedd6ab2f55ec75d81c0ce Reviewed-on: https://go-review.googlesource.com/c/147099 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-11-12 20:46:37 +00:00
Austin Clements	15265ec421	cmd/compile: avoid duplicate GC bitmap symbols Currently, liveness produces a distinct obj.LSym for each GC bitmap for each function. These are then named by content hash and only ultimately deduplicated by WriteObjFile. For various reasons (see next commit), we want to remove this deduplication behavior from WriteObjFile. Furthermore, it's inefficient to produce these duplicate symbols in the first place. GC bitmaps are the only source of duplicate symbols in the compiler. This commit eliminates these duplicate symbols by declaring them in the Ctxt symbol hash just like every other obj.LSym. As a result, all GC bitmaps with the same content now refer to the same obj.LSym. The next commit will remove deduplication from WriteObjFile. For #27539. Change-Id: I4f15e3d99530122cdf473b7a838c69ef5f79db59 Reviewed-on: https://go-review.googlesource.com/c/146557 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-11-03 15:12:34 +00:00
Lynn Boger	bdba55653f	cmd/asm/internal,cmd/internal/obj/ppc64: add alignment directive to asm for ppc64x This adds support for an alignment directive that can be used within Go asm to indicate preferred code alignment for ppc64x. This is intended to be used with loops to improve performance. This change only adds the directive and aligns the code based on it. Follow up changes will modify asm functions for ppc64x that benefit from preferred alignment. Fixes #14935 Here is one example of the improvement in memmove when the directive is used on the loops in the code: Memmove/64 8.74ns ± 0% 8.64ns ± 0% -1.19% (p=0.000 n=8+8) Memmove/128 11.5ns ± 0% 11.0ns ± 0% -4.35% (p=0.000 n=8+8) Memmove/256 23.0ns ± 0% 15.3ns ± 0% -33.48% (p=0.000 n=8+8) Memmove/512 31.7ns ± 0% 31.8ns ± 0% +0.32% (p=0.000 n=8+8) Memmove/1024 52.3ns ± 0% 43.9ns ± 0% -16.10% (p=0.000 n=8+8) Memmove/2048 93.2ns ± 0% 76.2ns ± 0% -18.24% (p=0.000 n=8+8) Memmove/4096 174ns ± 0% 141ns ± 0% -18.97% (p=0.000 n=8+8) Change-Id: I200d77e923dd5d78c22fe3f8eb142a8fbaff57bf Reviewed-on: https://go-review.googlesource.com/c/144218 Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2018-10-23 20:37:29 +00:00
Keith Randall	cbafcc55e8	cmd/compile,runtime: implement stack objects Rework how the compiler+runtime handles stack-allocated variables whose address is taken. Direct references to such variables work as before. References through pointers, however, use a new mechanism. The new mechanism is more precise than the old "ambiguously live" mechanism. It computes liveness at runtime based on the actual references among objects on the stack. Each function records all of its address-taken objects in a FUNCDATA. These are called "stack objects". The runtime then uses that information while scanning a stack to find all of the stack objects on a stack. It then does a mark phase on the stack objects, using all the pointers found on the stack (and ancillary structures, like defer records) as the root set. Only stack objects which are found to be live during this mark phase will be scanned and thus retain any heap objects they point to. A subsequent CL will remove all the "ambiguously live" logic from the compiler, so that the stack object tracing will be required. For this CL, the stack tracing is all redundant with the current ambiguously live logic. Update #22350 Change-Id: Ide19f1f71a5b6ec8c4d54f8f66f0e9a98344772f Reviewed-on: https://go-review.googlesource.com/c/134155 Reviewed-by: Austin Clements <austin@google.com>	2018-10-03 19:52:49 +00:00
Austin Clements	9f95c9db23	cmd/compile, cmd/internal/obj: record register maps in binary This adds FUNCDATA and PCDATA that records the register maps much like the existing live arguments maps and live locals maps. The register map is indexed independently from the argument and locals maps since changes in register liveness tend not to correlate with changes to argument and local liveness. This is the final CL toward adding safe-points everywhere. The following CLs will optimize liveness analysis to bring down the cost. The effect of this CL is: name old time/op new time/op delta Template 195ms ± 2% 197ms ± 1% ~ (p=0.136 n=9+9) Unicode 98.4ms ± 2% 99.7ms ± 1% +1.39% (p=0.004 n=10+10) GoTypes 685ms ± 1% 700ms ± 1% +2.06% (p=0.000 n=9+9) Compiler 3.28s ± 1% 3.34s ± 0% +1.71% (p=0.000 n=9+8) SSA 7.79s ± 1% 7.91s ± 1% +1.55% (p=0.000 n=10+9) Flate 133ms ± 2% 133ms ± 2% ~ (p=0.190 n=10+10) GoParser 161ms ± 2% 164ms ± 3% +1.83% (p=0.015 n=10+10) Reflect 450ms ± 1% 457ms ± 1% +1.62% (p=0.000 n=10+10) Tar 183ms ± 2% 185ms ± 1% +0.91% (p=0.008 n=9+10) XML 234ms ± 1% 238ms ± 1% +1.60% (p=0.000 n=9+9) [Geo mean] 411ms 417ms +1.40% name old exe-bytes new exe-bytes delta HelloSize 1.47M ± 0% 1.51M ± 0% +2.79% (p=0.000 n=10+10) Compared to just before "cmd/internal/obj: consolidate emitting entry stack map", the cumulative effect of adding stack maps everywhere and register maps is: name old time/op new time/op delta Template 185ms ± 2% 197ms ± 1% +6.42% (p=0.000 n=10+9) Unicode 96.3ms ± 3% 99.7ms ± 1% +3.60% (p=0.000 n=10+10) GoTypes 658ms ± 0% 700ms ± 1% +6.37% (p=0.000 n=10+9) Compiler 3.14s ± 1% 3.34s ± 0% +6.53% (p=0.000 n=9+8) SSA 7.41s ± 2% 7.91s ± 1% +6.71% (p=0.000 n=9+9) Flate 126ms ± 1% 133ms ± 2% +6.15% (p=0.000 n=10+10) GoParser 153ms ± 1% 164ms ± 3% +6.89% (p=0.000 n=10+10) Reflect 437ms ± 1% 457ms ± 1% +4.59% (p=0.000 n=10+10) Tar 178ms ± 1% 185ms ± 1% +4.18% (p=0.000 n=10+10) XML 223ms ± 1% 238ms ± 1% +6.39% (p=0.000 n=10+9) [Geo mean] 394ms 417ms +5.78% name old 
alloc/op new alloc/op delta Template 34.5MB ± 0% 38.0MB ± 0% +10.19% (p=0.000 n=10+10) Unicode 29.3MB ± 0% 30.3MB ± 0% +3.56% (p=0.000 n=8+9) GoTypes 113MB ± 0% 125MB ± 0% +10.89% (p=0.000 n=10+10) Compiler 510MB ± 0% 575MB ± 0% +12.79% (p=0.000 n=10+10) SSA 1.46GB ± 0% 1.64GB ± 0% +12.40% (p=0.000 n=10+10) Flate 23.9MB ± 0% 25.9MB ± 0% +8.56% (p=0.000 n=10+10) GoParser 28.0MB ± 0% 30.8MB ± 0% +10.08% (p=0.000 n=10+10) Reflect 77.6MB ± 0% 84.3MB ± 0% +8.63% (p=0.000 n=10+10) Tar 34.1MB ± 0% 37.0MB ± 0% +8.44% (p=0.000 n=10+10) XML 42.7MB ± 0% 47.2MB ± 0% +10.75% (p=0.000 n=10+10) [Geo mean] 76.0MB 83.3MB +9.60% name old allocs/op new allocs/op delta Template 321k ± 0% 337k ± 0% +4.98% (p=0.000 n=10+10) Unicode 337k ± 0% 340k ± 0% +1.04% (p=0.000 n=10+9) GoTypes 1.13M ± 0% 1.18M ± 0% +4.85% (p=0.000 n=10+10) Compiler 4.67M ± 0% 4.96M ± 0% +6.25% (p=0.000 n=10+10) SSA 11.7M ± 0% 12.3M ± 0% +5.69% (p=0.000 n=10+10) Flate 216k ± 0% 226k ± 0% +4.52% (p=0.000 n=10+9) GoParser 271k ± 0% 283k ± 0% +4.52% (p=0.000 n=10+10) Reflect 927k ± 0% 972k ± 0% +4.78% (p=0.000 n=10+10) Tar 318k ± 0% 333k ± 0% +4.56% (p=0.000 n=10+10) XML 376k ± 0% 395k ± 0% +5.04% (p=0.000 n=10+10) [Geo mean] 730k 764k +4.61% name old exe-bytes new exe-bytes delta HelloSize 1.46M ± 0% 1.51M ± 0% +3.66% (p=0.000 n=10+10) For #24543. Change-Id: I91e003dc64151916b384274884bf02a2d6862547 Reviewed-on: https://go-review.googlesource.com/109353 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2018-05-22 15:55:03 +00:00
isharipo	5437cde96c	cmd/asm: enable AVX512 - Uncomment tests for AVX512 encoder - Permit instruction suffixes for x86 - Permit limited reg list [reg-reg] syntax for x86 for multi-source ops - EVEX encoding support in obj/x86 (Z-cases, asmevex, etc.) - optabs and ytabs generated by x86avxgen (https://golang.org/cl/107216) Note: suffix formatting implemented with updated CConv function. Now arch asm backend should register formatting function by calling RegisterOpSuffix. Updates #22779 Change-Id: I076a167ee49582700e058c56ad74e6696710c8c8 Reviewed-on: https://go-review.googlesource.com/113315 Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-05-22 14:57:15 +00:00
Richard Musiol	3b137dd2df	cmd/compile: add wasm architecture This commit adds the wasm architecture to the compile command. A later commit will contain the corresponding linker changes. Design doc: https://docs.google.com/document/d/131vjr4DH6JFnb-blm_uRdaC0_Nv3OUwjEY5qVCxCup4 The following files are generated: - src/cmd/compile/internal/ssa/opGen.go - src/cmd/compile/internal/ssa/rewriteWasm.go - src/cmd/internal/obj/wasm/anames.go Updates #18892 Change-Id: Ifb4a96a3e427aac2362a1c97967d5667450fba3b Reviewed-on: https://go-review.googlesource.com/103295 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2018-05-04 17:56:12 +00:00
Wei Xiao	bd8a88729c	cmd/compile: intrinsify runtime.getcallerpc on arm64 Add a compiler intrinsic for getcallerpc on arm64 for better code generation. Change-Id: I897e670a2b8ffa1a8c2fdc638f5b2c44bda26318 Reviewed-on: https://go-review.googlesource.com/109276 Reviewed-by: Cherry Zhang <cherryyz@google.com> Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-04-30 13:29:14 +00:00
David Chase	dead03b794	cmd/link: process is_stmt data into dwarf line tables To improve debugging, instructions should be annotated with DWARF is_stmt. The DWARF default before was is_stmt=1, and to remove "jumpy" stepping the optimizer was tagging instructions with a no-position position, which interferes with the accuracy of profiling information. This allows that to be corrected, and also allows more "jumpy" positions to be annotated with is_stmt=0 (these changes were not made for 1.10 because of worries about further messing up profiling). The is_stmt values are placed in a pc-encoded table and passed through a symbol derived from the name of the function and processed in the linker alongside its processing of each function's pc/line tables. The only change in binary size is in the .debug_line tables measured with "objdump -h --section=.debug_line go1.test" For go1.test, these are 2614 bytes larger, or 0.72% of the size of .debug_line, or 0.025% of the file size. This will increase in proportion to how much the is_stmt flag is used (toggled). Change-Id: Ic1f1aeccff44591ad0494d29e1a0202a3c506a7a Reviewed-on: https://go-review.googlesource.com/93664 Run-TryBot: David Chase <drchase@google.com> Reviewed-by: Heschi Kreinick <heschi@google.com>	2018-04-04 22:14:29 +00:00
Daniel Martí	c15b7b2a54	cmd: re-generate all stringer files The tool has gotten better over time, so re-generating the files brings some advantages like fewer objects, dropping the use of fmt, and dropping unnecessary bounds checks. While at it, add the missing go:generate line for obj.AddrType. Change-Id: I120c9795ee8faddf5961ff0384b9dcaf58d831ff Reviewed-on: https://go-review.googlesource.com/100015 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-03-10 21:20:50 +00:00
Than McIntosh	88c2fb9d04	cmd/compile: fix bug in DWARF inl handling of unused autos The DWARF inline info generation hooks weren't properly handling unused auto vars in certain cases, triggering an assert (now fixed). Also with this change, introduce a new autom "flavor" to use for autom entries that are added to ensure that a specific auto type makes it into the linker (this is a follow-on to the fix for 22941). Fixes #22962. Change-Id: I7a2d8caf47f6ca897b12acb6a6de0eb25f5cac8f Reviewed-on: https://go-review.googlesource.com/81557 Run-TryBot: Than McIntosh <thanm@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2017-12-04 18:36:11 +00:00
Than McIntosh	4435fcfd6c	compiler,linker: support for DWARF inlined instances Compiler and linker changes to support DWARF inlined instances, see https://go.googlesource.com/proposal/+/HEAD/design/22080-dwarf-inlining.md for design details. This functionality is gated via the cmd/compile option -gendwarfinl=N, where N={0,1,2}, where a value of 0 disables dwarf inline generation, a value of 1 turns on dwarf generation without tracking of formal/local vars from inlined routines, and a value of 2 enables inlines with variable tracking. Updates #22080 Change-Id: I69309b3b815d9fed04aebddc0b8d33d0dbbfad6e Reviewed-on: https://go-review.googlesource.com/75550 Run-TryBot: Than McIntosh <thanm@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2017-11-30 14:39:19 +00:00
isharipo	1e83f883c5	cmd/internal/obj: make it possible to have all AVX1/2 insts Current AllowedOpCodes is 1024, which is not enough for modern x86. Changed limit to 2048 (though AVX512 will exceed this). Additional Z-cases and ytab tables are added to make it possible to handle missing AVX1 and AVX2 instructions. This CL is required by x86avxgen to work properly: https://go-review.googlesource.com/c/arch/+/66972 Change-Id: I290214bbda554d2cba53349f50dcd34014fe4cee Reviewed-on: https://go-review.googlesource.com/70650 Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>	2017-11-02 16:12:17 +00:00
Wei Xiao	531e6c06c4	cmd/asm: refine Go assembly for ARM64 Some ARM64-specific instructions (such as SIMD instructions) are not supported. This patch adds support for the following: 1. Extended register, e.g.: ADD Rm.<ext>[<<amount], Rn, Rd <ext> can have the following values: UXTB, UXTH, UXTW, UXTX, SXTB, SXTH, SXTW and SXTX 2. Arrangement for SIMD instructions, e.g.: VADDP Vm.<T>, Vn.<T>, Vd.<T> <T> can have the following values: B8, B16, H4, H8, S2, S4 and D2 3. Width specifier and element index for SIMD instructions, e.g.: VMOV Vn.<T>[index], Rd // MOV(to general register) <T> can have the following values: S and D 4. Register List, e.g.: VLD1 (Rn), [Vt1.<T>, Vt2.<T>, Vt3.<T>] 5. Register offset variant, e.g.: VLD1.P (Rn)(Rm), [Vt1.<T>, Vt2.<T>] // Rm is the post-index register 6. Go assembly for ARM64 reference manual: newly added instructions are required to have corresponding explanation items in the manual, and items for existing instructions will be added incrementally For more information about the refinement background, please refer to the discussion (https://groups.google.com/forum/#!topic/golang-dev/rWgDxCrL4GU) This patch only adds syntax and doesn't break any assembly that already exists. Change-Id: I34e90b7faae032820593a0e417022c354a882008 Reviewed-on: https://go-review.googlesource.com/41654 Run-TryBot: Cherry Zhang <cherryyz@google.com> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2017-10-13 13:41:19 +00:00
Keith Randall	e130dcf051	cmd/compile: abort earlier if stack frame too large If the stack frame is too large, abort immediately. We used to generate code first, then abort. In issue 22200, generating code raised a panic so we got an ICE instead of an error message. Change the max frame size to 1GB (from 2GB). Stack frames between 1.1GB and 2GB didn't use to work anyway; the pcln table generation would have failed and generated an ICE. Fixes #22200 Change-Id: I1d918ab27ba6ebf5c87ec65d1bccf973f8c8541e Reviewed-on: https://go-review.googlesource.com/69810 Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-10-11 18:24:13 +00:00
isharipo	8c67f210a1	cmd/internal/obj: change Prog.From3 to RestArgs ([]Addr) This change makes it easier to express instructions with arbitrary number of operands. Rationale: previous approach with operand "hiding" does not scale well, AVX and especially AVX512 have many instructions with 3+ operands. x86 asm backend is updated to handle up to 6 explicit operands. It also fixes issue with 4-th immediate operand type checks. All `ytab` tables are updated accordingly. Changes to non-x86 backends only include these patterns: `p.From3 = X` => `p.SetFrom3(X)` `p.From3.X = Y` => `p.GetFrom3().X = Y` Over time, other backends can adapt Prog.RestArgs and reduce the amount of workarounds. -- Performance -- x/benchmark/build: $ benchstat upstream.bench patched.bench name old time/op new time/op delta Build-48 21.7s ± 2% 21.8s ± 2% ~ (p=0.218 n=10+10) name old binary-size new binary-size delta Build-48 10.3M ± 0% 10.3M ± 0% ~ (all equal) name old build-time/op new build-time/op delta Build-48 21.7s ± 2% 21.8s ± 2% ~ (p=0.218 n=10+10) name old build-peak-RSS-bytes new build-peak-RSS-bytes delta Build-48 145MB ± 5% 148MB ± 5% ~ (p=0.218 n=10+10) name old build-user+sys-time/op new build-user+sys-time/op delta Build-48 21.0s ± 2% 21.2s ± 2% ~ (p=0.075 n=10+10) Microbenchmark shows a slight slowdown. name old time/op new time/op delta AMD64asm-4 49.5ms ± 1% 49.9ms ± 1% +0.67% (p=0.001 n=23+15) func BenchmarkAMD64asm(b *testing.B) { for i := 0; i < b.N; i++ { TestAMD64EndToEnd(nil) TestAMD64Encoder(nil) } } Change-Id: I4f1d37b5c2c966da3f2127705ccac9bff0038183 Reviewed-on: https://go-review.googlesource.com/63490 Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>	2017-09-15 21:05:03 +00:00
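The migration pattern quoted in 8c67f210a1 (`p.From3 = X` becomes `p.SetFrom3(X)`, `p.From3.X = Y` becomes `p.GetFrom3().X = Y`) can be illustrated with a reduced model. The sketch below is an assumption-laden simplification: `Addr` has a single made-up field and `Prog` carries only three members, so it shows the shape of the accessors rather than the real cmd/internal/obj types.

```go
package main

import "fmt"

// Addr is a heavily simplified, hypothetical operand; the real type
// has many more fields.
type Addr struct {
	Name string
}

// Prog models an instruction whose operands beyond From and To live
// in a slice, so any number of extra operands can be expressed.
type Prog struct {
	From     Addr
	RestArgs []Addr
	To       Addr
}

// SetFrom3 stores a as the only extra operand, mirroring the old
// assignment to a dedicated From3 field.
func (p *Prog) SetFrom3(a Addr) {
	p.RestArgs = []Addr{a}
}

// GetFrom3 returns a pointer to the first extra operand, or nil when
// there is none, so callers can still mutate it in place.
func (p *Prog) GetFrom3() *Addr {
	if len(p.RestArgs) == 0 {
		return nil
	}
	return &p.RestArgs[0]
}

func main() {
	p := &Prog{From: Addr{Name: "AX"}, To: Addr{Name: "BX"}}
	fmt.Println(p.GetFrom3() == nil) // no third operand yet

	p.SetFrom3(Addr{Name: "X"})  // was: p.From3 = X
	p.GetFrom3().Name = "Y"      // was: p.From3.X = Y
	fmt.Println(p.RestArgs[0].Name)
}
```

Backends that need only one extra operand keep using the two accessors, while a backend with more operands can work with the `RestArgs` slice directly.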

287 commits