go/src/cmd/compile at 97252f620ff8718ca9f7fef0ddebef16c6993612 - Stowage/go

Stowage/go

mirror of https://github.com/golang/go.git synced 2025-12-08 06:10:04 +00:00

History

Josh Bleecher Snyder c1544ff906 cmd/compile: move phi tighten after critical The phi tighten pass moves rematerializable phi args to the immediate predecessor of the phis. This reduces value lifetimes for regalloc. However, the critical edge removal pass can introduce new blocks, which can change what a block's immediate precedessor is. This can result in tightened phi args being spilled unnecessarily. This change moves the phi tighten pass after the critical edge pass, when the block structure is stable. This improves the code generated for func f(s string) bool { return s == "abcde" } Before this change: "".f STEXT nosplit size=44 args=0x18 locals=0x0 0x0000 00000 (x.go:3) MOVQ "".s+16(SP), AX 0x0005 00005 (x.go:3) CMPQ AX, $5 0x0009 00009 (x.go:3) JNE 40 0x000b 00011 (x.go:3) MOVQ "".s+8(SP), AX 0x0010 00016 (x.go:3) CMPL (AX), $1684234849 0x0016 00022 (x.go:3) JNE 36 0x0018 00024 (x.go:3) CMPB 4(AX), $101 0x001c 00028 (x.go:3) SETEQ AL 0x001f 00031 (x.go:3) MOVB AL, "".~r1+24(SP) 0x0023 00035 (x.go:3) RET 0x0024 00036 (x.go:3) XORL AX, AX 0x0026 00038 (x.go:3) JMP 31 0x0028 00040 (x.go:3) XORL AX, AX 0x002a 00042 (x.go:3) JMP 31 Observe the duplicated blocks at the end. After this change: "".f STEXT nosplit size=40 args=0x18 locals=0x0 0x0000 00000 (x.go:3) MOVQ "".s+16(SP), AX 0x0005 00005 (x.go:3) CMPQ AX, $5 0x0009 00009 (x.go:3) JNE 36 0x000b 00011 (x.go:3) MOVQ "".s+8(SP), AX 0x0010 00016 (x.go:3) CMPL (AX), $1684234849 0x0016 00022 (x.go:3) JNE 36 0x0018 00024 (x.go:3) CMPB 4(AX), $101 0x001c 00028 (x.go:3) SETEQ AL 0x001f 00031 (x.go:3) MOVB AL, "".~r1+24(SP) 0x0023 00035 (x.go:3) RET 0x0024 00036 (x.go:3) XORL AX, AX 0x0026 00038 (x.go:3) JMP 31 Change-Id: I12c81aa53b89456cb5809aa5396378245f3beda9 Reviewed-on: https://go-review.googlesource.com/c/go/+/172597 Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>		2019-04-19 15:39:53 +00:00
..
internal	cmd/compile: move phi tighten after critical	2019-04-19 15:39:53 +00:00
doc.go	cmd/compile, cmd/link: document more flags	2019-01-25 04:57:20 +00:00
fmt_test.go	cmd/compile: provide updating mechanism for format test	2018-11-16 17:33:54 +00:00
fmtmap_test.go	cmd/compile: update compiler's format test (fix long test)	2019-02-12 01:09:52 +00:00
main.go	cmd/compile: add wasm architecture	2018-05-04 17:56:12 +00:00
README.md	cmd/compile: minor updates to the README	2018-07-04 10:47:06 +00:00

README.md

Introduction to the Go compiler

cmd/compile contains the main packages that form the Go compiler. The compiler may be logically split in four phases, which we will briefly describe alongside the list of packages that contain their code.

You may sometimes hear the terms "front-end" and "back-end" when referring to the compiler. Roughly speaking, these translate to the first two and last two phases we are going to list here. A third term, "middle-end", often refers to much of the work that happens in the second phase.

Note that the go/* family of packages, such as go/parser and go/types, have no relation to the compiler. Since the compiler was initially written in C, the go/* packages were developed to enable writing tools working with Go code, such as gofmt and vet.

It should be clarified that the name "gc" stands for "Go compiler", and has little to do with uppercase "GC", which stands for garbage collection.

1. Parsing

cmd/compile/internal/syntax (lexer, parser, syntax tree)

In the first phase of compilation, source code is tokenized (lexical analysis), parsed (syntax analysis), and a syntax tree is constructed for each source file.

Each syntax tree is an exact representation of the respective source file, with nodes corresponding to the various elements of the source such as expressions, declarations, and statements. The syntax tree also includes position information which is used for error reporting and the creation of debugging information.

2. Type-checking and AST transformations

cmd/compile/internal/gc (create compiler AST, type checking, AST transformations)

The gc package includes an AST definition carried over from when it was written in C. All of its code is written in terms of it, so the first thing that the gc package must do is convert the syntax package's syntax tree to the compiler's AST representation. This extra step may be refactored away in the future.

The AST is then type-checked. The first steps are name resolution and type inference, which determine which object belongs to which identifier, and what type each expression has. Type-checking includes certain extra checks, such as "declared and not used" as well as determining whether or not a function terminates.

Certain transformations are also done on the AST. Some nodes are refined based on type information, such as string additions being split from the arithmetic addition node type. Some other examples are dead code elimination, function call inlining, and escape analysis.

3. Generic SSA

cmd/compile/internal/gc (converting to SSA)
cmd/compile/internal/ssa (SSA passes and rules)

In this phase, the AST is converted into Static Single Assignment (SSA) form, a lower-level intermediate representation with specific properties that make it easier to implement optimizations and to eventually generate machine code from it.

During this conversion, function intrinsics are applied. These are special functions that the compiler has been taught to replace with heavily optimized code on a case-by-case basis.

Certain nodes are also lowered into simpler components during the AST to SSA conversion, so that the rest of the compiler can work with them. For instance, the copy builtin is replaced by memory moves, and range loops are rewritten into for loops. Some of these currently happen before the conversion to SSA due to historical reasons, but the long-term plan is to move all of them here.

Then, a series of machine-independent passes and rules are applied. These do not concern any single computer architecture, and thus run on all GOARCH variants.

Some examples of these generic passes include dead code elimination, removal of unneeded nil checks, and removal of unused branches. The generic rewrite rules mainly concern expressions, such as replacing some expressions with constant values, and optimizing multiplications and float operations.

4. Generating machine code

cmd/compile/internal/ssa (SSA lowering and arch-specific passes)
cmd/internal/obj (machine code generation)

The machine-dependent phase of the compiler begins with the "lower" pass, which rewrites generic values into their machine-specific variants. For example, on amd64 memory operands are possible, so many load-store operations may be combined.

Note that the lower pass runs all machine-specific rewrite rules, and thus it currently applies lots of optimizations too.

Once the SSA has been "lowered" and is more specific to the target architecture, the final code optimization passes are run. This includes yet another dead code elimination pass, moving values closer to their uses, the removal of local variables that are never read from, and register allocation.

Other important pieces of work done as part of this step include stack frame layout, which assigns stack offsets to local variables, and pointer liveness analysis, which computes which on-stack pointers are live at each GC safe point.

At the end of the SSA generation phase, Go functions have been transformed into a series of obj.Prog instructions. These are passed to the assembler (cmd/internal/obj), which turns them into machine code and writes out the final object file. The object file will also contain reflect data, export data, and debugging information.