go/src/cmd/compile/internal/ssa
Keith Randall ca66f907dd cmd/compile: use generated loops instead of DUFFCOPY on amd64
This reverts commit 4e182db5fc (CL 695196),
which is itself a revert of
ec9e1176c3 (CL 678620).

So this CL is exactly the same as CL 678620, but with a regalloc fix
(CL 696035) submitted first.

Change-Id: I743ab32fa3aa6ef3e1b2b6751a2ef4519139057c
Reviewed-on: https://go-review.googlesource.com/c/go/+/696016
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-08-13 15:57:33 -07:00
_gen cmd/compile: use generated loops instead of DUFFCOPY on amd64 2025-08-13 15:57:33 -07:00
testdata cmd/compile/internal/escape: improve DWARF .debug_line numbering for literal rewriting optimizations 2025-07-17 03:41:36 -07:00
addressingmodes.go cmd/compile: add indexed SET* opcodes for amd64 2023-07-26 17:19:57 +00:00
allocators.go cmd/compile: use the builtin clear 2025-04-18 04:21:12 -07:00
bench_test.go cmd/compile: optimize absorbing InvertFlags into Noov comparisons for arm64 2023-09-21 02:36:06 +00:00
biasedsparsemap.go cmd: remove dead code 2025-08-05 10:31:25 -07:00
block.go cmd/compile: use the builtin clear 2025-04-18 04:21:12 -07:00
branchelim.go cmd/compile: allow InlMark operations to be speculatively executed 2025-08-11 00:52:23 -07:00
branchelim_test.go cmd/compile/internal/ssa: add Op{SP,SB} type checks to check.go 2018-04-24 15:51:15 +00:00
cache.go cmd/compile: use the builtin clear 2025-04-18 04:21:12 -07:00
check.go cmd/compile,runtime: remember idx+len for bounds check failure with less code 2025-07-24 16:05:59 -07:00
checkbce.go cmd/compile: run checkbce after fuseLate pass 2024-07-23 23:50:30 +00:00
compile.go cmd/compile: add opt branchelim to rewrite some CondSelect into math 2025-07-24 14:42:10 -07:00
config.go cmd/compile/internal: optimize multiplication use new operation 'ADDshiftLLV' on loong64 2025-08-12 23:01:49 -07:00
copyelim.go cmd/compile: generalize struct load/store 2024-09-26 13:18:08 +00:00
copyelim_test.go cmd/compile: change ssa.Type into *types.Type 2017-05-09 23:01:51 +00:00
critical.go cmd/compile: call phiElimValue from removePhiArg 2023-05-16 21:40:11 +00:00
cse.go cmd/compile: modify CSE to remove redundant OpLocalAddrs 2024-11-22 00:12:03 +00:00
cse_test.go cmd/compile/internal/ssa: replace Frontend.Auto with Func.NewLocal 2023-09-08 19:09:14 +00:00
deadcode.go cmd/compile: use ,ok return idiom for sparsemap.get 2025-07-24 09:04:29 -07:00
deadcode_test.go cmd/compile: change ssa.Type into *types.Type 2017-05-09 23:01:51 +00:00
deadstore.go cmd/compile: use ,ok return idiom for sparsemap.get 2025-07-24 09:04:29 -07:00
deadstore_test.go cmd/compile: teach dse about equivalent LocalAddrs 2024-04-25 20:07:26 +00:00
debug.go cmd: remove dead code 2025-08-05 10:31:25 -07:00
debug_lines_test.go cmd/compile/internal/ssa: restrict architectures for TestDebugLines_74576 2025-07-19 05:33:40 -07:00
debug_test.go all: use strings.ReplaceAll where applicable 2025-04-16 12:26:29 -07:00
decompose.go cmd/compile: generalize struct load/store 2024-09-26 13:18:08 +00:00
dom.go cmd/compile/internal/ssa: remove linkedBlocks and its uses 2025-03-20 09:10:17 -07:00
dom_test.go all: fix a large number of comments 2024-03-26 19:58:28 +00:00
expand_calls.go Revert "cmd/compile: allow multi-field structs to be stored directly in interfaces" 2025-08-11 22:59:52 -07:00
export_test.go cmd/compile: add structs.HostLayout 2024-05-20 21:19:39 +00:00
flagalloc.go cmd/compile: add cache of sizeable objects so they can be reused 2022-10-31 21:41:20 +00:00
flags_amd64_test.s all: add //go:build lines to assembly files 2021-05-13 09:12:17 +00:00
flags_arm64_test.s all: add //go:build lines to assembly files 2021-05-13 09:12:17 +00:00
flags_test.go cmd/cgo, cmd/compile, cmd/link: remove old style build tags 2022-10-04 19:36:17 +00:00
fmahash_test.go cmd/compile/internal/ssa: print output on failure in TestFmaHash 2024-05-14 21:57:53 +00:00
func.go all: use strings.ReplaceAll where applicable 2025-04-16 12:26:29 -07:00
func_test.go cmd/compile/internal/ssa: simplify NewFunc API 2023-09-08 19:01:04 +00:00
fuse.go cmd/compile: ensure pointer arithmetic happens after the nil check 2023-10-31 20:45:54 +00:00
fuse_branchredirect.go cmd/compile: call phiElimValue from removePhiArg 2023-05-16 21:40:11 +00:00
fuse_comparisons.go all: gofmt main repo 2022-04-11 16:34:30 +00:00
fuse_test.go cmd/compile: ensure pointer arithmetic happens after the nil check 2023-10-31 20:45:54 +00:00
generate.go cmd/compile/internal/ssa: generate code via a //go:generate directive 2023-01-19 22:42:34 +00:00
generate_test.go cmd/compile: add up-to-date test for generated files 2025-06-11 10:11:53 -07:00
html.go all: use strings.ReplaceAll where applicable 2025-04-16 12:26:29 -07:00
id.go cmd/compile: in a Tarjan algorithm, DFS should really be DFS 2016-04-22 19:21:16 +00:00
layout.go cmd/compile: don't preload registers if destination already scheduled 2025-05-19 17:13:21 -07:00
lca.go cmd/compile/internal/ssa: add missing space in comment 2023-10-30 21:52:15 +00:00
lca_test.go all: fix printf(var) mistakes detected by latest printf checker 2024-09-04 18:16:59 +00:00
likelyadjust.go cmd/compile: remove no-longer-necessary call to calculateDepths 2025-07-25 17:43:10 -07:00
location.go cmd/compile: remove residual register GC map code 2025-02-20 12:51:47 -08:00
loopbce.go cmd/compile: check domination of loop return in both controls 2025-07-30 07:31:18 -07:00
loopreschedchecks.go cmd/compile: fix a premature-deallocation of state in loopreschedchecks 2024-12-03 16:22:55 +00:00
looprotate.go cmd/compile: remove no-longer-necessary call to calculateDepths 2025-07-25 17:43:10 -07:00
looprotate_test.go cmd/compile: improve loopRotate to handle nested loops 2025-07-24 12:40:00 -07:00
lower.go cmd/compile: detect write barrier completion differently 2023-02-16 00:16:13 +00:00
magic.go cmd/internal/ssa: fix typo in comment 2024-03-26 19:58:25 +00:00
magic_test.go cmd/compile: add signed divisibility rules 2019-04-30 22:02:07 +00:00
memcombine.go cmd/compile: fix offset calculation error in memcombine 2025-05-21 12:17:08 -07:00
nilcheck.go cmd/compile: use ,ok return idiom for sparsemap.get 2025-07-24 09:04:29 -07:00
nilcheck_test.go all: add missing copyright header 2023-11-17 23:34:11 +00:00
numberlines.go cmd/compile: generalize struct load/store 2024-09-26 13:18:08 +00:00
op.go cmd/compile: remove support for old-style bounds check calls 2025-08-05 08:59:28 -07:00
opGen.go cmd/compile: use generated loops instead of DUFFCOPY on amd64 2025-08-13 15:57:33 -07:00
opt.go all: add missing periods in comments 2022-11-18 17:59:44 +00:00
pair.go cmd/compile: use zero register instead of specialized *zero instructions 2025-04-04 15:26:24 -07:00
passbm_test.go all: fix a large number of comments 2024-03-26 19:58:28 +00:00
phiopt.go cmd/compile: add 2 phiopt cases 2025-05-08 10:18:37 -07:00
poset.go cmd/compile: use the builtin clear 2025-04-18 04:21:12 -07:00
poset_test.go cmd/compile: rip out constant handling in poset data structure 2024-08-07 16:08:28 +00:00
print.go cmd: do not use notsha256 2024-09-04 18:23:49 +00:00
prove.go cmd/compile: teach prove about len's & cap's max based on the element size 2025-08-13 07:21:20 -07:00
README.md cmd/compile: update GOSSAFUNC doc for printing CFG 2025-03-27 15:39:38 -07:00
regalloc.go cmd/compile: use generated loops instead of DUFFCOPY on amd64 2025-08-13 15:57:33 -07:00
regalloc_test.go cmd/compile: use generated loops instead of DUFFCOPY on amd64 2025-08-13 15:57:33 -07:00
rewrite.go cmd/compile: use generated loops instead of DUFFCOPY on amd64 2025-08-13 15:57:33 -07:00
rewrite386.go cmd/compile: move amd64 and 386 over to new bounds check strategy 2025-07-24 16:06:16 -07:00
rewrite386splitload.go cmd/compile/internal/ssa: generate code via a //go:generate directive 2023-01-19 22:42:34 +00:00
rewrite_test.go cmd/compile: add type-based alias analysis 2025-02-14 15:32:55 -08:00
rewriteAMD64.go cmd/compile: use generated loops instead of DUFFCOPY on amd64 2025-08-13 15:57:33 -07:00
rewriteAMD64latelower.go cmd/compile: fix sign/zero-extension removal 2024-03-12 19:38:41 +00:00
rewriteAMD64splitload.go cmd/compile/internal/ssa: generate code via a //go:generate directive 2023-01-19 22:42:34 +00:00
rewriteARM.go cmd/compile: move arm32 over to new bounds check strategy 2025-07-29 10:46:49 -07:00
rewriteARM64.go cmd/compile: use generated loops instead of DUFFZERO on arm64 2025-08-12 09:15:19 -07:00
rewriteARM64latelower.go cmd/compile: improve multiplication strength reduction 2025-05-01 09:33:31 -07:00
rewriteCond_test.go cmd/compile: optimize cmp to cmn under conditions < and >= on arm64 2023-03-24 01:19:09 +00:00
rewritedec.go Revert "cmd/compile: allow StructSelect [x] of interface data fields for x>0" 2025-08-11 22:36:26 -07:00
rewritedec64.go cmd/compile/internal/ssa: generate code via a //go:generate directive 2023-01-19 22:42:34 +00:00
rewritegeneric.go Revert "cmd/compile: allow multi-field structs to be stored directly in interfaces" 2025-08-11 22:59:52 -07:00
rewriteLOONG64.go cmd/compile: absorb NEGV into branch on loong64 2025-08-13 07:20:08 -07:00
rewriteLOONG64latelower.go cmd/compile/internal/ssa: use BEQ/BNE to optimize the combination of XOR and EQ/NE on loong64 2025-08-12 18:02:02 -07:00
rewriteMIPS.go cmd/compile: move mips32 over to new bounds check strategy 2025-07-30 08:33:08 -07:00
rewriteMIPS64.go cmd/compile: move mips64 over to new bounds check strategy 2025-07-30 08:33:02 -07:00
rewritePPC64.go cmd/compile: move ppc64 over to new bounds check strategy 2025-08-05 08:59:16 -07:00
rewritePPC64latelower.go cmd/compile: intrinsify math.MulUintptr on PPC64 2024-08-26 17:02:43 +00:00
rewriteRISCV64.go cmd/compile: optimise float <-> int register moves on riscv64 2025-08-05 08:27:15 -07:00
rewriteRISCV64latelower.go cmd/compile,cmd/internal/obj/riscv: always provide ANDN, ORN and XNOR for riscv64 2024-09-12 15:03:44 +00:00
rewriteS390X.go cmd/compile: move s390x over to new bounds check strategy 2025-08-04 10:08:22 -07:00
rewriteWasm.go cmd/compile: simplify intrinsification of BitLen16 and BitLen8 2025-02-26 02:02:07 -08:00
sccp.go cmd/compile/internal/ssa: fix typo in sccp 2024-01-22 16:54:50 +00:00
sccp_test.go cmd/compile: sparse conditional constant propagation 2023-09-12 21:01:50 +00:00
schedule.go cmd/compile: schedule induction variable increments late 2025-05-15 14:06:41 -07:00
schedule_test.go cmd/compile: enable carry chain scheduling for arm64 2022-10-08 01:46:00 +00:00
shift_test.go all: gofmt main repo 2022-04-11 16:34:30 +00:00
shortcircuit.go cmd/compile: match more patterns for shortcircuit 2025-03-27 12:30:03 -07:00
shortcircuit_test.go cmd/compile: change ssa.Type into *types.Type 2017-05-09 23:01:51 +00:00
sizeof_test.go [dev.regabi] cmd/compile: change LocalSlot.N to *ir.Name 2020-12-08 01:46:40 +00:00
softfloat.go [dev.typeparams] cmd/compile: make softfloat mode work with register ABI 2021-08-03 16:14:24 +00:00
sparsemap.go cmd/compile: use ,ok return idiom for sparsemap.get 2025-07-24 09:04:29 -07:00
sparsemappos.go cmd/compile: separate out sparsemaps that need position 2022-10-31 21:41:06 +00:00
sparseset.go all: add missing periods in comments 2022-11-18 17:59:44 +00:00
sparsetree.go all: add missing periods in comments 2022-11-18 17:59:44 +00:00
stackalloc.go cmd/compile: use the builtin clear 2025-04-18 04:21:12 -07:00
stmtlines_test.go cmd/compile/internal/ssa: skip EndSequence entries in TestStmtLines 2025-07-07 09:14:56 -07:00
tighten.go cmd/compile: fix containsUnavoidableCall computation 2025-07-25 13:52:00 -07:00
TODO cmd/compile: update SSA TODO file 2018-04-24 23:35:13 +00:00
trim.go cmd/compile: use the builtin clear 2025-04-18 04:21:12 -07:00
tuple.go cmd/compile: change StaticCall to return a "Results" 2021-02-26 02:52:33 +00:00
value.go cmd/compile: aggregate scalar allocations for heap escapes 2025-04-04 10:53:05 -07:00
writebarrier.go cmd/compile/internal/ssa: simplify with built-in min, max functions 2025-04-11 08:50:53 -07:00
writebarrier_test.go cmd/compile/internal/ssa: add Op{SP,SB} type checks to check.go 2018-04-24 15:51:15 +00:00
xposmap.go cmd/compile: use ,ok return idiom for sparsemap.get 2025-07-24 09:04:29 -07:00
zcse.go [dev.regabi] cmd/compile: add ssa.Aux tag interface for Value.Aux 2020-12-08 01:46:31 +00:00
zeroextension_test.go cmd/compile/internal/ssa: refactor zeroUpper32Bits 2018-02-27 20:38:32 +00:00

Introduction to the Go compiler's SSA backend

This package contains the compiler's Static Single Assignment form component. If you're not familiar with SSA, its Wikipedia article is a good starting point.

If you are not already familiar with the Go compiler, it is recommended that you first read cmd/compile/README.md. That document gives an overview of the compiler and explains SSA's part and purpose in it.

Key concepts

The names described below may be loosely related to their Go counterparts, but note that they are not equivalent. For example, a Go block statement has a variable scope, yet SSA has no notion of variables or variable scopes.

It may also be surprising that values and blocks are named after their unique sequential IDs. They rarely correspond to named entities in the original code, such as variables or function parameters. The sequential IDs also allow the compiler to avoid maps, and it is always possible to track back the values to Go code using debug and position information.

Values

Values are the basic building blocks of SSA. Per SSA's very definition, a value is defined exactly once, but it may be used any number of times. A value mainly consists of a unique identifier, an operator, a type, and some arguments.

An operator or Op describes the operation that computes the value. The semantics of each operator can be found in _gen/*Ops.go. For example, OpAdd8 takes two value arguments holding 8-bit integers and results in their addition. Here is a possible SSA representation of the addition of two uint8 values:

// var c uint8 = a + b
v4 = Add8 <uint8> v2 v3

A value's type will usually be a Go type. For example, the value in the example above has a uint8 type, and a constant boolean value will have a bool type. However, certain types don't come from Go and are special; below we will cover memory, the most common of them.

Some operators contain an auxiliary field. The aux fields are usually printed enclosed in [] or {}, and could be the constant op argument, argument type, etc. For example:

v13 (?) = Const64 <int> [1]

Here the aux field is the constant op argument; the op creates a Const64 value of 1. One more example:

v17 (361) = Store <mem> {int} v16 v14 v8

Here the aux field is the type of the value being stored, which is int.

See value.go and _gen/*Ops.go for more information.
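To make the parts of a value concrete, here is a minimal sketch of a toy model in Go. The Value struct and its String method below are hypothetical and much simpler than the real type in value.go; they only mirror the pieces named above (identifier, operator, type, aux field, and arguments) and print them in a form similar to the dumps shown earlier.

```go
package main

import (
	"fmt"
	"strings"
)

// Value is a toy model of an SSA value (an assumption for illustration);
// the real type lives in value.go and has more fields.
type Value struct {
	ID   int      // unique sequential ID, printed as v<ID>
	Op   string   // operator name, e.g. "Add8"
	Type string   // result type, e.g. "uint8"
	Aux  string   // optional aux field, already bracketed, e.g. "[1]"
	Args []*Value // argument values
}

// String renders the value in a form similar to the dumps shown above.
func (v *Value) String() string {
	var sb strings.Builder
	fmt.Fprintf(&sb, "v%d = %s <%s>", v.ID, v.Op, v.Type)
	if v.Aux != "" {
		sb.WriteString(" " + v.Aux)
	}
	for _, a := range v.Args {
		fmt.Fprintf(&sb, " v%d", a.ID)
	}
	return sb.String()
}

func main() {
	a := &Value{ID: 2, Op: "Arg", Type: "uint8", Aux: "{a}"}
	b := &Value{ID: 3, Op: "Arg", Type: "uint8", Aux: "{b}"}
	sum := &Value{ID: 4, Op: "Add8", Type: "uint8", Args: []*Value{a, b}}
	fmt.Println(sum) // v4 = Add8 <uint8> v2 v3
}
```

Arguments are stored as pointers to other values, so a value can be used any number of times while still being defined exactly once.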

Memory types

memory represents the global memory state. An Op that takes a memory argument depends on that memory state, and an Op which has the memory type impacts the state of memory. This ensures that memory operations are kept in the right order. For example:

// *a = 3
// *b = *a
v10 = Store <mem> {int} v6 v8 v1
v14 = Store <mem> {int} v7 v8 v10

Here, Store stores its second argument (of type int) into the first argument (of type *int). The last argument is the memory state; since the second store depends on the memory value defined by the first store, the two stores cannot be reordered.

See cmd/compile/internal/types/type.go for more information.
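The ordering argument can be sketched with a toy model. The Store type and order function below are hypothetical and not part of the compiler; they only show that, once each store names the memory state it consumes, the execution order can be recovered by walking that chain.

```go
package main

import "fmt"

// Store is a toy stand-in for a Store value (an assumption for
// illustration): it names the memory state it consumes and is itself
// a new memory state.
type Store struct {
	Name string
	Mem  *Store // memory state consumed; nil stands for the initial memory
}

// order walks the memory chain backwards from the last memory state and
// returns the stores in the order they must execute.
func order(last *Store) []string {
	var names []string
	for s := last; s != nil; s = s.Mem {
		names = append([]string{s.Name}, names...)
	}
	return names
}

func main() {
	v10 := &Store{Name: "v10"}           // *a = 3, uses the initial memory
	v14 := &Store{Name: "v14", Mem: v10} // *b = *a, uses v10's memory
	fmt.Println(order(v14))              // [v10 v14]
}
```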

Blocks

A block represents a basic block in the control flow graph of a function. It is, essentially, a list of values that define the operation of this block. Besides the list of values, blocks mainly consist of a unique identifier, a kind, and a list of successor blocks.

The simplest kind is a plain block; it simply hands the control flow to another block, thus its successors list contains one block.

Another common block kind is the exit block. These have a final value, called the control value, which must return a memory state. This is necessary for functions to return values, for example: the caller needs some memory state to depend on, to ensure that it receives those return values correctly.

The last important block kind we will mention is the if block. It has a single control value that must be a boolean value, and it has exactly two successor blocks. The control flow is handed to the first successor if the bool is true, and to the second otherwise.

Here is a sample if-else control flow represented with basic blocks:

// func(b bool) int {
// 	if b {
// 		return 2
// 	}
// 	return 3
// }
b1:
  v1 = InitMem <mem>
  v2 = SP <uintptr>
  v5 = Addr <*int> {~r1} v2
  v6 = Arg <bool> {b}
  v8 = Const64 <int> [2]
  v12 = Const64 <int> [3]
  If v6 -> b2 b3
b2: <- b1
  v10 = VarDef <mem> {~r1} v1
  v11 = Store <mem> {int} v5 v8 v10
  Ret v11
b3: <- b1
  v14 = VarDef <mem> {~r1} v1
  v15 = Store <mem> {int} v5 v12 v14
  Ret v15

See block.go for more information.
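The three block kinds can be sketched as a small toy graph in Go. The Block type, the kind constants, and the run function below are assumptions made for illustration, not the definitions in block.go; the Result field in particular is a shortcut that stands in for the Store feeding each Ret in the dump above.

```go
package main

import "fmt"

// BlockKind is a toy enumeration of the block kinds described above.
type BlockKind int

const (
	BlockPlain BlockKind = iota // one successor
	BlockIf                     // two successors, chosen by a boolean control
	BlockRet                    // exit block, no successors
)

// Block is a toy model of a basic block (hypothetical, for illustration).
type Block struct {
	ID     int
	Kind   BlockKind
	Succs  []*Block
	Result int // toy-only: the value an exit block stores as the result
}

// run follows control flow from entry until it reaches an exit block.
// cond plays the role of the control value of every if block.
func run(entry *Block, cond bool) int {
	b := entry
	for {
		switch b.Kind {
		case BlockRet:
			return b.Result
		case BlockIf:
			if cond {
				b = b.Succs[0] // first successor when the control is true
			} else {
				b = b.Succs[1] // second successor otherwise
			}
		default: // BlockPlain
			b = b.Succs[0]
		}
	}
}

func main() {
	b2 := &Block{ID: 2, Kind: BlockRet, Result: 2}
	b3 := &Block{ID: 3, Kind: BlockRet, Result: 3}
	b1 := &Block{ID: 1, Kind: BlockIf, Succs: []*Block{b2, b3}}
	fmt.Println(run(b1, true), run(b1, false)) // 2 3
}
```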

Functions

A function represents a function declaration along with its body. It mainly consists of a name, a type (its signature), a list of blocks that form its body, and the entry block within said list.

When a function is called, the control flow is handed to its entry block. If the function terminates, the control flow will eventually reach an exit block, thus ending the function call.

Note that a function may have zero or multiple exit blocks, just like a Go function can have any number of return points, but it must have exactly one entry point block.

Also note that some SSA functions are autogenerated, such as the hash functions for each type used as a map key.

For example, this is what an empty function can look like in SSA, with a single exit block that returns an uninteresting memory state:

foo func()
  b1:
    v1 = InitMem <mem>
    Ret v1

See func.go for more information.

Compiler passes

Having a program in SSA form is not very useful on its own. Its advantage lies in how easy it is to write optimizations that modify the program to make it better. The way the Go compiler accomplishes this is via a list of passes.

Each pass transforms an SSA function in some way. For example, a dead code elimination pass will remove blocks and values that it can prove will never be executed, and a nil check elimination pass will remove nil checks which it can prove to be redundant.

Compiler passes work on one function at a time, and by default run sequentially and exactly once.

The lower pass is special; it converts the SSA representation from being machine-independent to being machine-dependent. That is, some abstract operators are replaced with their non-generic counterparts, potentially reducing or increasing the final number of values.

See the passes list defined in compile.go for more information.
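As a rough illustration of running passes sequentially over one function, here is a toy pipeline in Go. All of the types and functions below (Value, Func, pass, deadcode, compile) are simplified assumptions written for this sketch; the real pass list and driver in compile.go are considerably more involved. The liveness table is a slice indexed by value ID, which is the kind of map-free bookkeeping that sequential IDs make possible.

```go
package main

import "fmt"

// Value and Func are toy models (assumptions for illustration).
type Value struct {
	ID   int
	Op   string
	Args []*Value
}

type Func struct {
	Values []*Value // all values of the function
	Ret    *Value   // returned value; must be in Values, keeps its inputs alive
}

// pass pairs a name with a transformation of a single function.
type pass struct {
	name string
	fn   func(*Func)
}

// deadcode keeps only values reachable from f.Ret through argument edges.
func deadcode(f *Func) {
	maxID := 0
	for _, v := range f.Values {
		if v.ID > maxID {
			maxID = v.ID
		}
	}
	live := make([]bool, maxID+1) // indexed by value ID instead of a map
	var mark func(v *Value)
	mark = func(v *Value) {
		if v == nil || live[v.ID] {
			return
		}
		live[v.ID] = true
		for _, a := range v.Args {
			mark(a)
		}
	}
	mark(f.Ret)
	kept := f.Values[:0]
	for _, v := range f.Values {
		if live[v.ID] {
			kept = append(kept, v)
		}
	}
	f.Values = kept
}

// compile runs every pass exactly once, in order, and records the names.
func compile(f *Func, passes []pass) []string {
	var ran []string
	for _, p := range passes {
		p.fn(f)
		ran = append(ran, p.name)
	}
	return ran
}

func main() {
	v1 := &Value{ID: 1, Op: "Const64"}
	v2 := &Value{ID: 2, Op: "Const64"}
	v3 := &Value{ID: 3, Op: "Add64", Args: []*Value{v1, v2}}
	v4 := &Value{ID: 4, Op: "Const64"} // never used
	f := &Func{Values: []*Value{v1, v2, v3, v4}, Ret: v3}
	ran := compile(f, []pass{{name: "deadcode", fn: deadcode}})
	fmt.Println(ran, len(f.Values)) // [deadcode] 3
}
```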

Playing with SSA

A good way to see and get used to the compiler's SSA in action is via GOSSAFUNC. For example, to see func Foo's initial SSA form and final generated assembly, one can run:

GOSSAFUNC=Foo go build

The generated ssa.html file will also contain the SSA func at each of the compile passes, making it easy to see what each pass does to a particular program. You can also click on values and blocks to highlight them, to help follow the control flow and values.
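If you do not have a function at hand, a tiny program such as the following sketch is enough to experiment with. The file contents and the name Foo are only an example chosen to match the command above; the //go:noinline directive is there so that Foo stays a separate function instead of being inlined into main.

```go
package main

import "fmt"

// Foo is deliberately tiny so that its SSA dump stays readable.
//
//go:noinline
func Foo(a, b uint8) uint8 {
	return a + b
}

func main() {
	fmt.Println(Foo(2, 3)) // 5
}
```

Building this with GOSSAFUNC=Foo set should show the addition as an Add8 value in the early columns, much like the example in the Values section.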

The value specified in GOSSAFUNC can also be a package-qualified function name, e.g.

GOSSAFUNC=blah.Foo go build

This will match any function named "Foo" within a package whose final suffix is "blah" (e.g. something/blah.Foo, anotherthing/extra/blah.Foo).

Users may also print the control flow graph (CFG) by specifying the GOSSAFUNC value in the following format:

GOSSAFUNC="$FunctionName:$PassName1,$PassName2,..." go build

For example, the following command will print SSA with CFGs attached to the sccp and generic deadcode pass columns:

GOSSAFUNC="blah.Foo:sccp,generic deadcode" go build

If non-HTML dumps are needed, append a "+" to the GOSSAFUNC value and dumps will be written to stdout:

GOSSAFUNC=Bar+ go build

Hacking on SSA

While most compiler passes are implemented directly in Go code, some others are code generated. This is currently done via rewrite rules, which have their own syntax and are maintained in _gen/*.rules. Simpler optimizations can be written easily and quickly this way, but rewrite rules are not suitable for more complex optimizations.

To read more on rewrite rules, have a look at the top comments in _gen/generic.rules and _gen/rulegen.go.
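To give a feel for what a rewrite rule does, the sketch below applies a constant-folding rewrite by hand in Go. The rule quoted in the comment is written in the style of those in _gen/generic.rules; the Value type and the rewriteAdd8 function are hypothetical and are not the code that rulegen actually produces.

```go
package main

import "fmt"

// Value is a toy SSA value (an assumption for illustration);
// AuxInt holds the constant of a Const8.
type Value struct {
	Op     string
	AuxInt int64
	Args   []*Value
}

// rewriteAdd8 mimics, by hand, the effect of a rule in the style of
//
//	(Add8 (Const8 [c]) (Const8 [d])) => (Const8 [c+d])
//
// It matches the pattern and, on success, overwrites v in place.
// It reports whether the rule fired.
func rewriteAdd8(v *Value) bool {
	if v.Op != "Add8" || len(v.Args) != 2 {
		return false
	}
	x, y := v.Args[0], v.Args[1]
	if x.Op != "Const8" || y.Op != "Const8" {
		return false
	}
	c, d := int8(x.AuxInt), int8(y.AuxInt)
	v.Op = "Const8"
	v.AuxInt = int64(c + d) // wraps around like 8-bit arithmetic
	v.Args = nil
	return true
}

func main() {
	v := &Value{Op: "Add8", Args: []*Value{
		{Op: "Const8", AuxInt: 2},
		{Op: "Const8", AuxInt: 3},
	}}
	fmt.Println(rewriteAdd8(v), v.Op, v.AuxInt) // true Const8 5
}
```

Writing the match and the replacement as a one-line rule is what makes simple optimizations quick to add; anything that needs information beyond the shape of a few values is a better fit for a hand-written pass.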

Similarly, the code to manage operators is also code generated from _gen/*Ops.go, as it is easier to maintain a few tables than a lot of code. After changing the rules or operators, run go generate cmd/compile/internal/ssa to generate the Go code again.