Commit graph

56 commits

Author SHA1 Message Date
Keith Randall
f378f30034 undo CL 101570044 / 2c57aaea79c4
redo stack allocation.  This is mostly the same as
the original CL with a few bug fixes.

1. add racemalloc() for stack allocations
2. fix poolalloc/poolfree to terminate free lists correctly.
3. adjust span ref count correctly.
4. don't use cache for sizes >= StackCacheSize.

Should fix bugs and memory leaks in original changelist.

««« original CL description
undo CL 104200047 / 318b04f28372

Breaks windows and race detector.
TBR=rsc

««« original CL description
runtime: stack allocator, separate from mallocgc

In order to move malloc to Go, we need to have a
separate stack allocator.  If we run out of stack
during malloc, malloc will not be available
to allocate a new stack.

Stacks are the last remaining FlagNoGC objects in the
GC heap.  Once they are out, we can get rid of the
distinction between the allocated/blockboundary bits.
(This will be in a separate change.)

Fixes #7468
Fixes #7424

LGTM=rsc, dvyukov
R=golang-codereviews, dvyukov, khr, dave, rsc
CC=golang-codereviews
https://golang.org/cl/104200047
»»»

TBR=rsc
CC=golang-codereviews
https://golang.org/cl/101570044
»»»

LGTM=dvyukov
R=dvyukov, dave, khr, alex.brainman
CC=golang-codereviews
https://golang.org/cl/112240044
2014-07-17 14:41:46 -07:00
Keith Randall
3cf83c182a undo CL 104200047 / 318b04f28372
Breaks windows and race detector.
TBR=rsc

««« original CL description
runtime: stack allocator, separate from mallocgc

In order to move malloc to Go, we need to have a
separate stack allocator.  If we run out of stack
during malloc, malloc will not be available
to allocate a new stack.

Stacks are the last remaining FlagNoGC objects in the
GC heap.  Once they are out, we can get rid of the
distinction between the allocated/blockboundary bits.
(This will be in a separate change.)

Fixes #7468
Fixes #7424

LGTM=rsc, dvyukov
R=golang-codereviews, dvyukov, khr, dave, rsc
CC=golang-codereviews
https://golang.org/cl/104200047
»»»

TBR=rsc
CC=golang-codereviews
https://golang.org/cl/101570044
2014-06-30 19:48:08 -07:00
Keith Randall
7c13860cd0 runtime: stack allocator, separate from mallocgc
In order to move malloc to Go, we need to have a
separate stack allocator.  If we run out of stack
during malloc, malloc will not be available
to allocate a new stack.

Stacks are the last remaining FlagNoGC objects in the
GC heap.  Once they are out, we can get rid of the
distinction between the allocated/blockboundary bits.
(This will be in a separate change.)

Fixes #7468
Fixes #7424

LGTM=rsc, dvyukov
R=golang-codereviews, dvyukov, khr, dave, rsc
CC=golang-codereviews
https://golang.org/cl/104200047
2014-06-30 18:59:24 -07:00
Russ Cox
89f185fe8a all: remove 'extern register M *m' from runtime
The runtime has historically held two dedicated values g (current goroutine)
and m (current thread) in 'extern register' slots (TLS on x86, real registers
backed by TLS on ARM).

This CL removes the extern register m; code now uses g->m.

On ARM, this frees up the register that formerly held m (R9).
This is important for NaCl, because NaCl ARM code cannot use R9 at all.

The Go 1 macrobenchmarks (those with per-op times >= 10 µs) are unaffected:

BenchmarkBinaryTree17              5491374955     5471024381     -0.37%
BenchmarkFannkuch11                4357101311     4275174828     -1.88%
BenchmarkGobDecode                 11029957       11364184       +3.03%
BenchmarkGobEncode                 6852205        6784822        -0.98%
BenchmarkGzip                      650795967      650152275      -0.10%
BenchmarkGunzip                    140962363      141041670      +0.06%
BenchmarkHTTPClientServer          71581          73081          +2.10%
BenchmarkJSONEncode                31928079       31913356       -0.05%
BenchmarkJSONDecode                117470065      113689916      -3.22%
BenchmarkMandelbrot200             6008923        5998712        -0.17%
BenchmarkGoParse                   6310917        6327487        +0.26%
BenchmarkRegexpMatchMedium_1K      114568         114763         +0.17%
BenchmarkRegexpMatchHard_1K        168977         169244         +0.16%
BenchmarkRevcomp                   935294971      914060918      -2.27%
BenchmarkTemplate                  145917123      148186096      +1.55%

Minux previous reported larger variations, but these were caused by
run-to-run noise, not repeatable slowdowns.

Actual code changes by Minux.
I only did the docs and the benchmarking.

LGTM=dvyukov, iant, minux
R=minux, josharian, iant, dave, bradfitz, dvyukov
CC=golang-codereviews
https://golang.org/cl/109050043
2014-06-26 11:54:39 -04:00
Russ Cox
060a988011 runtime: revise CL 105140044 (defer nil) to work on Windows
It appears that something about Go on Windows
cannot handle the fault cause by a jump to address 0.
The way Go represents and calls functions, this
never happened at all, until CL 105140044.

This CL changes the code added in CL 105140044
to make jump to 0 impossible once again.

Fixes #8047. (again, on Windows)

TBR=bradfitz
R=golang-codereviews, dave
CC=adg, golang-codereviews, iant, r
https://golang.org/cl/105120044
2014-06-12 21:12:53 -04:00
Russ Cox
36207a91d3 runtime: fix defer of nil func
Fixes #8047.

LGTM=r, iant
R=golang-codereviews, r, iant
CC=dvyukov, golang-codereviews, khr
https://golang.org/cl/105140044
2014-06-12 16:34:36 -04:00
Russ Cox
14d2ee1d00 runtime: make continuation pc available to stack walk
The 'continuation pc' is where the frame will continue
execution, if anywhere. For a frame that stopped execution
due to a CALL instruction, the continuation pc is immediately
after the CALL. But for a frame that stopped execution due to
a fault, the continuation pc is the pc after the most recent CALL
to deferproc in that frame, or else 0. That is where execution
will continue, if anywhere.

The liveness information is only recorded for CALL instructions.
This change makes sure that we never look for liveness information
except for CALL instructions.

Using a valid PC fixes crashes when a garbage collection or
stack copying tries to process a stack frame that has faulted.

Record continuation pc in heapdump (format change).

Fixes #8048.

LGTM=iant, khr
R=khr, iant, dvyukov
CC=golang-codereviews, r
https://golang.org/cl/100870044
2014-05-31 10:10:12 -04:00
Keith Randall
3d1c3e1e26 runtime: stack copier should handle nil defers without faulting.
fixes #8047

LGTM=rsc
R=golang-codereviews, rsc
CC=golang-codereviews
https://golang.org/cl/101800043
2014-05-27 16:26:08 -07:00
Keith Randall
fc6753c7cd runtime: make sure associated defers are copyable before trying to copy a stack.
Defers generated from cgo lie to us about their argument layout.
Mark those defers as not copyable.

CL 83820043 contains an additional test for this code and should be
checked in (and enabled) after this change is in.

Fixes bug 7695.

LGTM=rsc
R=golang-codereviews, rsc
CC=golang-codereviews
https://golang.org/cl/84740043
2014-04-07 17:40:00 -07:00
Russ Cox
28f1868fed cmd/gc, runtime: make GODEBUG=gcdead=1 mode work with liveness
Trying to make GODEBUG=gcdead=1 work with liveness
and in particular ambiguously live variables.

1. In the liveness computation, mark all ambiguously live
variables as live for the entire function, except the entry.
They are zeroed directly after entry, and we need them not
to be poisoned thereafter.

2. In the liveness computation, compute liveness (and deadness)
for all parameters, not just pointer-containing parameters.
Otherwise gcdead poisons untracked scalar parameters and results.

3. Fix liveness debugging print for -live=2 to use correct bitmaps.
(Was not updated for compaction during compaction CL.)

4. Correct varkill during map literal initialization.
Was killing the map itself instead of the inserted value temp.

5. Disable aggressive varkill cleanup for call arguments if
the call appears in a defer or go statement.

6. In the garbage collector, avoid bug scanning empty
strings. An empty string is two zeros. The multiword
code only looked at the first zero and then interpreted
the next two bits in the bitmap as an ordinary word bitmap.
For a string the bits are 11 00, so if a live string was zero
length with a 0 base pointer, the poisoning code treated
the length as an ordinary word with code 00, meaning it
needed poisoning, turning the string into a poison-length
string with base pointer 0. By the same logic I believe that
a live nil slice (bits 11 01 00) will have its cap poisoned.
Always scan full multiword struct.

7. In the runtime, treat both poison words (PoisonGC and
PoisonStack) as invalid pointers that warrant crashes.

Manual testing as follows:

- Create a script called gcdead on your PATH containing:

        #!/bin/bash
        GODEBUG=gcdead=1 GOGC=10 GOTRACEBACK=2 exec "$@"
- Now you can build a test and then run 'gcdead ./foo.test'.
- More importantly, you can run 'go test -short -exec gcdead std'
   to run all the tests.

Fixes #7676.

While here, enable the precise scanning of slices, since that was
disabled due to bugs like these. That now works, both with and
without gcdead.

Fixes #7549.

LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/83410044
2014-04-03 20:33:25 -04:00
Russ Cox
4676fae525 cmd/gc, cmd/ld, runtime: compact liveness bitmaps
Reduce footprint of liveness bitmaps by about 5x.

1. Mark all liveness bitmap symbols as 4-byte aligned
(they were aligned to a larger size by default).

2. The bitmap data is a bitmap count n followed by n bitmaps.
Each bitmap begins with its own count m giving the number
of bits. All the m's are the same for the n bitmaps.
Emit this bitmap length once instead of n times.

3. Many bitmaps within a function have the same bit values,
but each call site was given a distinct bitmap. Merge duplicate
bitmaps so that no bitmap is written more than once.

4. Many functions end up with the same aggregate bitmap data.
We used to name the bitmap data funcname.gcargs and funcname.gclocals.
Instead, name it gclocals.<md5 of data> and mark it dupok so
that the linker coalesces duplicate sets. This cut the bitmap
data remaining after step 3 by 40%; I was not expecting it to
be quite so dramatic.

Applied to "go build -ldflags -w code.google.com/p/go.tools/cmd/godoc":

                bitmaps           pclntab           binary on disk
before this CL  1326600           1985854           12738268
4-byte align    1154288 (0.87x)   1985854 (1.00x)   12566236 (0.99x)
one bitmap len   782528 (0.54x)   1985854 (1.00x)   12193500 (0.96x)
dedup bitmap     414748 (0.31x)   1948478 (0.98x)   11787996 (0.93x)
dedup bitmap set 245580 (0.19x)   1948478 (0.98x)   11620060 (0.91x)

While here, remove various dead blocks of code from plive.c.

Fixes #6929.
Fixes #7568.

LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/83630044
2014-04-02 16:49:27 -04:00
Russ Cox
cfb347fc0a runtime: use correct pc to obtain liveness info during stack copy
The old code was using the PC of the instruction after the CALL.
Variables live during the call but not live when it returns would
not be seen as live during the stack copy, which might lead to
corruption. The correct PC to use is the one just before the
return address. After this CL the lookup matches what mgc0.c does.

The only time this matters is if you have back to back CALL instructions:

        CALL f1 // x live here
        CALL f2 // x no longer live

If a stack copy occurs during the execution of f1, the old code will
use the liveness bitmap intended for the execution of f2 and will not
treat x as live.

The only way this situation can arise and cause a problem in a stack copy
is if x lives on the stack has had its address taken but the compiler knows
enough about the context to know that x is no longer needed once f1
returns. The compiler has never known that much, so using the f2 context
cannot currently cause incorrect execution. For the same reason, it is not
possible to write a test for this today.

CL 83090046 will make the compiler precise enough in some cases
that this distinction will start mattering. The existing stack growth tests
in package runtime will fail if that CL is submitted without this one.

While we're here, print the frame PC in debug mode and update the
bitmap interpretation strings.

LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/83250043
2014-04-01 14:57:58 -04:00
Russ Cox
1ec4d5e9e7 runtime: adjust GODEBUG=allocfreetrace=1 and GODEBUG=gcdead=1
GODEBUG=allocfreetrace=1:

The allocfreetrace=1 mode prints a stack trace for each block
allocated and freed, and also a stack trace for each garbage collection.

It was implemented by reusing the heap profiling support: if allocfreetrace=1
then the heap profile was effectively running at 1 sample per 1 byte allocated
(always sample). The stack being shown at allocation was the stack gathered
for profiling, meaning it was derived only from the program counters and
did not include information about function arguments or frame pointers.
The stack being shown at free was the allocation stack, not the free stack.
If you are generating this log, you can find the allocation stack yourself, but
it can be useful to see exactly the sequence that led to freeing the block:
was it the garbage collector or an explicit free? Now that the garbage collector
runs on an m0 stack, the stack trace for the garbage collector was never interesting.

Fix all these problems:

1. Decouple allocfreetrace=1 from heap profiling.
2. Print the standard goroutine stack traces instead of a custom format.
3. Print the stack trace at time of allocation for an allocation,
   and print the stack trace at time of free (not the allocation trace again)
   for a free.
4. Print all goroutine stacks at garbage collection. Having all the stacks
   means that you can see the exact point at which each goroutine was
   preempted, which is often useful for identifying liveness-related errors.

GODEBUG=gcdead=1:

This mode overwrites dead pointers with a poison value.
Detect the poison value as an invalid pointer during collection,
the same way that small integers are invalid pointers.

LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/81670043
2014-04-01 13:30:10 -04:00
Russ Cox
5a23a7e52c runtime: enable 'bad pointer' check during garbage collection of Go stack frames
This is the same check we use during stack copying.
The check cannot be applied to C stack frames, even
though we do emit pointer bitmaps for the arguments,
because (1) the pointer bitmaps assume all arguments
are always live, not true of outputs during the prologue,
and (2) the pointer bitmaps encode interface values as
pointer pairs, not true of interfaces holding integers.

For the rest of the frames, however, we should hold ourselves
to the rule that a pointer marked live really is initialized.
The interface scanning already implicitly checks this
because it interprets the type word  as a valid type pointer.

This may slow things down a little because of the extra loads.
Or it may speed things up because we don't bother enqueuing
nil pointers anymore. Enough of the rest of the system is slow
right now that we can't measure it meaningfully.
Enable for now, even if it is slow, to shake out bugs in the
liveness bitmaps, and then decide whether to turn it off
for the Go 1.3 release (issue 7650 reminds us to do this).

The new m->traceback field lets us force printing of fp=
values on all goroutine stack traces when we detect a
bad pointer. This makes it easier to understand exactly
where in the frame the bad pointer is, so that we can trace
it back to a specific variable and determine what is wrong.

Update #7650

LGTM=khr
R=khr
CC=golang-codereviews
https://golang.org/cl/80860044
2014-03-27 14:06:15 -04:00
Dmitriy Vyukov
7c75a862b4 runtime: eliminate false retention due to m->moreargp/morebuf
m->moreargp/morebuf were not cleared in case of preemption and stack growing,
it can lead to persistent leaks of large memory blocks.

It seems to fix the sync.Pool finalizer failures. I've run the test 500'000 times
w/o a single failure; previously it would fail dozens of times.

Fixes #7633.
Fixes #7533.

LGTM=rsc
R=golang-codereviews
CC=golang-codereviews, khr, rsc
https://golang.org/cl/80480044
2014-03-26 19:06:15 +04:00
Keith Randall
1b45cc45e3 runtime: redo stack map entries to avoid false retention
Change two-bit stack map entries to encode:
0 = dead
1 = scalar
2 = pointer
3 = multiword

If multiword, the two-bit entry for the following word encodes:
0 = string
1 = slice
2 = iface
3 = eface

That way, during stack scanning we can check if a string
is zero length or a slice has zero capacity.  We can avoid
following the contained pointer in those cases.  It is safe
to do so because it can never be dereferenced, and it is
desirable to do so because it may cause false retention
of the following block in memory.

Slice feature turned off until issue 7564 is fixed.

Update #7549

LGTM=rsc
R=golang-codereviews, bradfitz, rsc
CC=golang-codereviews
https://golang.org/cl/76380043
2014-03-25 14:11:34 -07:00
Dmitriy Vyukov
8d321625fd runtime: fix spans corruption
The problem was that spans end up in wrong lists after split
(e.g. in h->busy instead of h->central->empty).
Also the span can be non-swept before split,
I don't know what it can cause, but it's safer to operate on swept spans.
Fixes #7544.

R=golang-codereviews, rsc
CC=golang-codereviews, khr
https://golang.org/cl/76160043
2014-03-14 23:25:48 +04:00
Dmitriy Vyukov
d8e6881166 runtime: report "out of memory" in efence mode
Currently processes crash with obscure message.
Say that it's "out of memory".

LGTM=rsc
R=golang-codereviews
CC=golang-codereviews, khr, rsc
https://golang.org/cl/75820045
2014-03-14 21:22:03 +04:00
Dmitriy Vyukov
b8d40172ce runtime: do not shrink stacks GOCOPYSTACK=0
LGTM=rsc
R=golang-codereviews
CC=golang-codereviews, khr, rsc
https://golang.org/cl/76070043
2014-03-14 21:11:04 +04:00
Dmitriy Vyukov
e678ab4e37 runtime: detect stack split after fork
This check would allowed to easily prevent issue 7511.
Update #7511

LGTM=rsc
R=rsc, aram
CC=golang-codereviews
https://golang.org/cl/75260043
2014-03-13 17:41:08 +04:00
Dmitriy Vyukov
5daffee17f runtime: fix stack size check
When we copy stack, we check only new size of the top segment.
This is incorrect, because we can have other segments below it.

LGTM=khr
R=golang-codereviews, khr
CC=golang-codereviews, rsc
https://golang.org/cl/73980045
2014-03-13 13:16:02 +04:00
Dmitriy Vyukov
1f4d2e79b0 runtime: efence support for growable stacks
1. Fix the bug that shrinkstack returns memory to heap.
   This causes growslice to misbehave (it manually initialized
   blocks, and in efence mode shrinkstack's free leads to
   partially-initialized blocks coming out of growslice.
   Which in turn causes GC to crash while treating the garbage
   as Eface/Iface.
2. Enable efence for stack segments.

LGTM=rsc
R=golang-codereviews, rsc
CC=golang-codereviews, khr
https://golang.org/cl/74080043
2014-03-12 10:21:34 +04:00
Dave Cheney
6431be3fe4 runtime: more Native Client fixes
Thanks to Ian for spotting these.

runtime.h: define uintreg correctly.
stack.c: address warning caused by the type of uintreg being 32 bits on amd64p32.

Commentary (mainly for my own use)

nacl/amd64p32 defines a machine with 64bit registers, but address space is limited to a 4gb window (the window is placed randomly inside the full 48 bit virtual address space of a process). To cope with this 6c defines _64BIT and _64BITREG.

_64BITREG is always defined by 6c, so both GOARCH=amd64 and GOARCH=amd64p32 use 64bit wide registers.

However _64BIT itself is only defined when 6c is compiling for amd64 targets. The definition is elided for amd64p32 environments causing int, uint and other arch specific types to revert to their 32bit definitions.

LGTM=iant
R=iant, rsc, remyoudompheng
CC=golang-codereviews
https://golang.org/cl/72860046
2014-03-11 14:43:10 +11:00
Shenghou Ma
84570aa9a1 runtime: round stack size to power of 2.
Fixes build on windows/386 and plan9/386.
Fixes #7487.

LGTM=mattn.jp, dvyukov, rsc
R=golang-codereviews, mattn.jp, dvyukov, 0intro, rsc
CC=golang-codereviews
https://golang.org/cl/72360043
2014-03-07 15:11:16 -05:00
Dmitriy Vyukov
1a89e6388c runtime: refactor and fix stack management code
There are at least 3 bugs:
1. g->stacksize accounting is broken during copystack/shrinkstack
2. stktop->free is not properly maintained during copystack/shrinkstack
3. stktop->free logic is broken:
        we can have stktop->free==FixedStack,
        and we will free it into stack cache,
        but it actually comes from heap as the result of non-copying segment shrink
This shows as at least spurious races on race builders (maybe something else as well I don't know).

The idea behind the refactoring is to consolidate stacksize and
segment origin logic in stackalloc/stackfree.

Fixes #7490.

LGTM=rsc, khr
R=golang-codereviews, rsc, khr
CC=golang-codereviews
https://golang.org/cl/72440043
2014-03-07 20:52:29 +04:00
Keith Randall
f4359afa7f runtime: shrink bigger stacks without any copying.
Instead, split the underlying storage in half and
free just half of it.

Shrinking without copying lets us reclaim storage used
by a previously profligate Go routine that has now blocked
inside some C code.

To shrink in place, we need all stacks to be a power of 2 in size.

LGTM=rsc
R=golang-codereviews, rsc
CC=golang-codereviews
https://golang.org/cl/69580044
2014-03-06 16:03:43 -08:00
Dmitriy Vyukov
8ca3372d7b runtime: fix bad g status after copystack
LGTM=khr
R=khr
CC=golang-codereviews, rsc
https://golang.org/cl/69870054
2014-03-06 21:33:19 +04:00
Keith Randall
1665b006a5 runtime: grow stack by copying
On stack overflow, if all frames on the stack are
copyable, we copy the frames to a new stack twice
as large as the old one.  During GC, if a G is using
less than 1/4 of its stack, copy the stack to a stack
half its size.

TODO
- Do something about C frames.  When a C frame is in the
  stack segment, it isn't copyable.  We allocate a new segment
  in this case.
  - For idempotent C code, we can abort it, copy the stack,
    then retry.  I'm working on a separate CL for this.
  - For other C code, we can raise the stackguard
    to the lowest Go frame so the next call that Go frame
    makes triggers a copy, which will then succeed.
- Pick a starting stack size?

The plan is that eventually we reach a point where the
stack contains only copyable frames.

LGTM=rsc
R=dvyukov, rsc
CC=golang-codereviews
https://golang.org/cl/54650044
2014-02-26 23:28:44 -08:00
Russ Cox
67c83db60d runtime: use goc2c as much as possible
Package runtime's C functions written to be called from Go
started out written in C using carefully constructed argument
lists and the FLUSH macro to write a result back to memory.

For some functions, the appropriate parameter list ended up
being architecture-dependent due to differences in alignment,
so we added 'goc2c', which takes a .goc file containing Go func
declarations but C bodies, rewrites the Go func declaration to
equivalent C declarations for the target architecture, adds the
needed FLUSH statements, and writes out an equivalent C file.
That C file is compiled as part of package runtime.

Native Client's x86-64 support introduces the most complex
alignment rules yet, breaking many functions that could until
now be portably written in C. Using goc2c for those avoids the
breakage.

Separately, Keith's work on emitting stack information from
the C compiler would require the hand-written functions
to add #pragmas specifying how many arguments are result
parameters. Using goc2c for those avoids maintaining #pragmas.

For both reasons, use goc2c for as many Go-called C functions
as possible.

This CL is a replay of the bulk of CL 15400047 and CL 15790043,
both of which were reviewed as part of the NaCl port and are
checked in to the NaCl branch. This CL is part of bringing the
NaCl code into the main tree.

No new code here, just reformatting and occasional movement
into .h files.

LGTM=r
R=dave, alex.brainman, r
CC=golang-codereviews
https://golang.org/cl/65220044
2014-02-20 15:58:47 -05:00
Russ Cox
00a757fb74 runtime: relax preemption assertion during stack split
The case can happen when starttheworld is calling acquirep
to get things moving again and acquirep gets preempted.
The stack trace is in golang.org/issue/6644.

It is difficult to build a short test case for this, but
the person who reported issue 6644 confirms that this
solves the problem.

Fixes #6644.

R=golang-dev, r
CC=golang-dev
https://golang.org/cl/18740044
2013-10-28 19:40:40 -04:00
Russ Cox
7276c02b41 runtime, cmd/gc, cmd/ld: ignore method wrappers in recover
Bug #1:

Issue 5406 identified an interesting case:
        defer iface.M()
may end up calling a wrapper that copies an indirect receiver
from the iface value and then calls the real M method. That's
two calls down, not just one, and so recover() == nil always
in the real M method, even during a panic.

[For the purposes of this entire discussion, a wrapper's
implementation is a function containing an ordinary call, not
the optimized tail call form that is somtimes possible. The
tail call does not create a second frame, so it is already
handled correctly.]

Fix this bug by introducing g->panicwrap, which counts the
number of bytes on current stack segment that are due to
wrapper calls that should not count against the recover
check. All wrapper functions must now adjust g->panicwrap up
on entry and back down on exit. This adds slightly to their
expense; on the x86 it is a single instruction at entry and
exit; on the ARM it is three. However, the alternative is to
make a call to recover depend on being able to walk the stack,
which I very much want to avoid. We have enough problems
walking the stack for garbage collection and profiling.
Also, if performance is critical in a specific case, it is already
faster to use a pointer receiver and avoid this kind of wrapper
entirely.

Bug #2:

The old code, which did not consider the possibility of two
calls, already contained a check to see if the call had split
its stack and so the panic-created segment was one behind the
current segment. In the wrapper case, both of the two calls
might split their stacks, so the panic-created segment can be
two behind the current segment.

Fix this by propagating the Stktop.panic flag forward during
stack splits instead of looking backward during recover.

Fixes #5406.

R=golang-dev, iant
CC=golang-dev
https://golang.org/cl/13367052
2013-09-12 14:00:16 -04:00
Dmitriy Vyukov
a33ef8d11b runtime: account for all sys memory in MemStats
Currently lots of sys allocations are not accounted in any of XxxSys,
including GC bitmap, spans table, GC roots blocks, GC finalizer blocks,
iface table, netpoll descriptors and more. Up to ~20% can unaccounted.
This change introduces 2 new stats: GCSys and OtherSys for GC metadata
and all other misc allocations, respectively.
Also ensures that all XxxSys indeed sum up to Sys. All sys memory allocation
functions require the stat for accounting, so that it's impossible to miss something.
Also fix updating of mcache_sys/inuse, they were not updated after deallocation.

test/bench/garbage/parser before:
Sys		670064344
HeapSys		610271232
StackSys	65536
MSpanSys	14204928
MCacheSys	16384
BuckHashSys	1439992

after:
Sys		670064344
HeapSys		610271232
StackSys	65536
MSpanSys	14188544
MCacheSys	16384
BuckHashSys	3194304
GCSys		39198688
OtherSys	3129656

Fixes #5799.

R=rsc, dave, alex.brainman
CC=golang-dev
https://golang.org/cl/12946043
2013-09-06 16:55:40 -04:00
Dmitriy Vyukov
dfdd1ba028 runtime: do not trigger GC on g0
GC acquires worldsema, which is a goroutine-level semaphore
which parks goroutines. g0 can not be parked.
Fixes #6193.

R=khr, khr
CC=golang-dev
https://golang.org/cl/12880045
2013-08-22 02:17:45 +04:00
Keith Randall
6401e0f83f runtime: don't run finalizers if we're still on the g0 stack.
R=golang-dev, rsc, dvyukov, khr
CC=golang-dev
https://golang.org/cl/11386044
2013-08-19 12:20:50 -07:00
Russ Cox
757e0de89f runtime: impose stack size limit
The goal is to stop only those programs that would keep
going and run the machine out of memory, but before they do that.
1 GB on 64-bit, 250 MB on 32-bit.
That seems implausibly large, and it can be adjusted.

Fixes #2556.
Fixes #4494.
Fixes #5173.

R=khr, r, dvyukov
CC=golang-dev
https://golang.org/cl/12541052
2013-08-15 22:34:06 -04:00
Keith Randall
9cd570680b runtime: reimplement reflect.call to not use stack splitting.
R=golang-dev, r, khr, rsc
CC=golang-dev
https://golang.org/cl/12053043
2013-08-02 13:03:14 -07:00
Russ Cox
ebc5513be6 runtime: in newstack, double-check calling goroutine
Checking this condition helped me find the arm problem last night.

R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/12267043
2013-08-02 13:51:28 -04:00
Dmitriy Vyukov
e84d9e1fb3 runtime: do not split stacks in syscall status
Split stack checks (morestack) corrupt g->sched,
but g->sched must be preserved consistent for GC/traceback.
The change implements runtime.notetsleepg function,
which does entersyscall/exitsyscall and is carefully arranged
to not call any split functions in between.

R=rsc
CC=golang-dev
https://golang.org/cl/11575044
2013-07-29 22:22:34 +04:00
Dmitriy Vyukov
f8a850b250 runtime: refactor mallocgc
Make it accept type, combine flags.
Several reasons for the change:
1. mallocgc and settype must be atomic wrt GC
2. settype is called from only one place now
3. it will help performance (eventually settype
functionality must be combined with markallocated)
4. flags are easier to read now (no mallocgc(sz, 0, 1, 0) anymore)

R=golang-dev, iant, nightlyone, rsc, dave, khr, bradfitz, r
CC=golang-dev
https://golang.org/cl/10136043
2013-07-26 21:17:24 +04:00
Dmitriy Vyukov
bc31bcccd3 runtime: preempt long-running goroutines
If a goroutine runs for more than 10ms, preempt it.
Update #543.

R=rsc
CC=golang-dev
https://golang.org/cl/10796043
2013-07-19 01:22:26 +04:00
Dmitriy Vyukov
5887f142a3 runtime: more reliable preemption
Currently preemption signal g->stackguard0==StackPreempt
can be lost if it is received when preemption is disabled
(e.g. m->lock!=0). This change duplicates the preemption
signal in g->preempt and restores g->stackguard0
when preemption is enabled.
Update #543.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/10792043
2013-07-17 12:52:37 -04:00
Russ Cox
031c107cad cmd/ld: fix large stack split for preempt check
If the stack frame size is larger than the known-unmapped region at the
bottom of the address space, then the stack split prologue cannot use the usual
condition:

        SP - size >= stackguard

because SP - size may wrap around to a very large number.
Instead, if the stack frame is large, the prologue tests:

        SP - stackguard >= size

(This ends up being a few instructions more expensive, so we don't do it always.)

Preemption requests register by setting stackguard to a very large value, so
that the first test (SP - size >= stackguard) cannot possibly succeed.
Unfortunately, that same very large value causes a wraparound in the
second test (SP - stackguard >= size), making it succeed incorrectly.

To avoid *that* wraparound, we have to amend the test:

        stackguard != StackPreempt && SP - stackguard >= size

This test is only used for functions with large frames, which essentially
always split the stack, so the cost of the few instructions is noise.

This CL and CL 11085043 together fix the known issues with preemption,
at the beginning of a function, so we will be able to try turning it on again.

R=ken2
CC=golang-dev
https://golang.org/cl/11205043
2013-07-12 12:12:56 -04:00
Dmitriy Vyukov
1e112cd59f runtime: preempt goroutines for GC
The last patch for preemptive scheduler,
with this change stoptheworld issues preemption
requests every 100us.
Update #543.

R=golang-dev, daniel.morsing, rsc
CC=golang-dev
https://golang.org/cl/10264044
2013-06-28 17:52:17 +04:00
Russ Cox
f0d73fbc7c runtime: use gp->sched.sp for stack overflow check
On x86 it is a few words lower on the stack than m->morebuf.sp
so it is a more precise check. Enabling the check requires recording
a valid gp->sched in reflect.call too. This is a good thing in general,
since it will make stack traces during reflect.call work better, and it
may be useful for preemption too.

R=dvyukov
CC=golang-dev
https://golang.org/cl/10709043
2013-06-27 16:51:06 -04:00
Dmitriy Vyukov
4eb17ecd1f runtime: fix goroutine status corruption
runtime.entersyscall() sets g->status = Gsyscall,
then calls runtime.lock() which causes stack split.
runtime.newstack() resets g->status to Grunning.
This will lead to crash during GC (world is not stopped) or GC will scan stack incorrectly.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/10696043
2013-06-28 00:49:53 +04:00
Russ Cox
6fa3c89b77 runtime: record proper goroutine state during stack split
Until now, the goroutine state has been scattered during the
execution of newstack and oldstack. It's all there, and those routines
know how to get back to a working goroutine, but other pieces of
the system, like stack traces, do not. If something does interrupt
the newstack or oldstack execution, the rest of the system can't
understand the goroutine. For example, if newstack decides there
is an overflow and calls throw, the stack tracer wouldn't dump the
goroutine correctly.

For newstack to save a useful state snapshot, it needs to be able
to rewind the PC in the function that triggered the split back to
the beginning of the function. (The PC is a few instructions in, just
after the call to morestack.) To make that possible, we change the
prologues to insert a jmp back to the beginning of the function
after the call to morestack. That is, the prologue used to be roughly:

        TEXT myfunc
                check for split
                jmpcond nosplit
                call morestack
        nosplit:
                sub $xxx, sp

Now an extra instruction is inserted after the call:

        TEXT myfunc
        start:
                check for split
                jmpcond nosplit
                call morestack
                jmp start
        nosplit:
                sub $xxx, sp

The jmp is not executed directly. It is decoded and simulated by
runtime.rewindmorestack to discover the beginning of the function,
and then the call to morestack returns directly to the start label
instead of to the jump instruction. So logically the jmp is still
executed, just not by the cpu.

The prologue thus repeats in the case of a function that needs a
stack split, but against the cost of the split itself, the extra few
instructions are noise. The repeated prologue has the nice effect of
making a stack split double-check that the new stack is big enough:
if morestack happens to return on a too-small stack, we'll now notice
before corruption happens.

The ability for newstack to rewind to the beginning of the function
should help preemption too. If newstack decides that it was called
for preemption instead of a stack split, it now has the goroutine state
correctly paused if rescheduling is needed, and when the goroutine
can run again, it can return to the start label on its original stack
and re-execute the split check.

Here is an example of a split stack overflow showing the full
trace, without any special cases in the stack printer.
(This one was triggered by making the split check incorrect.)

runtime: newstack framesize=0x0 argsize=0x18 sp=0x6aebd0 stack=[0x6b0000, 0x6b0fa0]
        morebuf={pc:0x69f5b sp:0x6aebd8 lr:0x0}
        sched={pc:0x68880 sp:0x6aebd0 lr:0x0 ctxt:0x34e700}
runtime: split stack overflow: 0x6aebd0 < 0x6b0000
fatal error: runtime: split stack overflow

goroutine 1 [stack split]:
runtime.mallocgc(0x290, 0x100000000, 0x1)
        /Users/rsc/g/go/src/pkg/runtime/zmalloc_darwin_amd64.c:21 fp=0x6aebd8
runtime.new()
        /Users/rsc/g/go/src/pkg/runtime/zmalloc_darwin_amd64.c:682 +0x5b fp=0x6aec08
go/build.(*Context).Import(0x5ae340, 0xc210030c71, 0xa, 0xc2100b4380, 0x1b, ...)
        /Users/rsc/g/go/src/pkg/go/build/build.go:424 +0x3a fp=0x6b00a0
main.loadImport(0xc210030c71, 0xa, 0xc2100b4380, 0x1b, 0xc2100b42c0, ...)
        /Users/rsc/g/go/src/cmd/go/pkg.go:249 +0x371 fp=0x6b01a8
main.(*Package).load(0xc21017c800, 0xc2100b42c0, 0xc2101828c0, 0x0, 0x0, ...)
        /Users/rsc/g/go/src/cmd/go/pkg.go:431 +0x2801 fp=0x6b0c98
main.loadPackage(0x369040, 0x7, 0xc2100b42c0, 0x0)
        /Users/rsc/g/go/src/cmd/go/pkg.go:709 +0x857 fp=0x6b0f80
----- stack segment boundary -----
main.(*builder).action(0xc2100902a0, 0x0, 0x0, 0xc2100e6c00, 0xc2100e5750, ...)
        /Users/rsc/g/go/src/cmd/go/build.go:539 +0x437 fp=0x6b14a0
main.(*builder).action(0xc2100902a0, 0x0, 0x0, 0xc21015b400, 0x2, ...)
        /Users/rsc/g/go/src/cmd/go/build.go:528 +0x1d2 fp=0x6b1658
main.(*builder).test(0xc2100902a0, 0xc210092000, 0x0, 0x0, 0xc21008ff60, ...)
        /Users/rsc/g/go/src/cmd/go/test.go:622 +0x1b53 fp=0x6b1f68
----- stack segment boundary -----
main.runTest(0x5a6b20, 0xc21000a020, 0x2, 0x2)
        /Users/rsc/g/go/src/cmd/go/test.go:366 +0xd09 fp=0x6a5cf0
main.main()
        /Users/rsc/g/go/src/cmd/go/main.go:161 +0x4f9 fp=0x6a5f78
runtime.main()
        /Users/rsc/g/go/src/pkg/runtime/proc.c:183 +0x92 fp=0x6a5fa0
runtime.goexit()
        /Users/rsc/g/go/src/pkg/runtime/proc.c:1266 fp=0x6a5fa8

And here is a seg fault during oldstack:

SIGSEGV: segmentation violation
PC=0x1b2a6

runtime.oldstack()
        /Users/rsc/g/go/src/pkg/runtime/stack.c:159 +0x76
runtime.lessstack()
        /Users/rsc/g/go/src/pkg/runtime/asm_amd64.s:270 +0x22

goroutine 1 [stack unsplit]:
fmt.(*pp).printArg(0x2102e64e0, 0xe5c80, 0x2102c9220, 0x73, 0x0, ...)
        /Users/rsc/g/go/src/pkg/fmt/print.go:818 +0x3d3 fp=0x221031e6f8
fmt.(*pp).doPrintf(0x2102e64e0, 0x12fb20, 0x2, 0x221031eb98, 0x1, ...)
        /Users/rsc/g/go/src/pkg/fmt/print.go:1183 +0x15cb fp=0x221031eaf0
fmt.Sprintf(0x12fb20, 0x2, 0x221031eb98, 0x1, 0x1, ...)
        /Users/rsc/g/go/src/pkg/fmt/print.go:234 +0x67 fp=0x221031eb40
flag.(*stringValue).String(0x2102c9210, 0x1, 0x0)
        /Users/rsc/g/go/src/pkg/flag/flag.go:180 +0xb3 fp=0x221031ebb0
flag.(*FlagSet).Var(0x2102f6000, 0x293d38, 0x2102c9210, 0x143490, 0xa, ...)
        /Users/rsc/g/go/src/pkg/flag/flag.go:633 +0x40 fp=0x221031eca0
flag.(*FlagSet).StringVar(0x2102f6000, 0x2102c9210, 0x143490, 0xa, 0x12fa60, ...)
        /Users/rsc/g/go/src/pkg/flag/flag.go:550 +0x91 fp=0x221031ece8
flag.(*FlagSet).String(0x2102f6000, 0x143490, 0xa, 0x12fa60, 0x0, ...)
        /Users/rsc/g/go/src/pkg/flag/flag.go:563 +0x87 fp=0x221031ed38
flag.String(0x143490, 0xa, 0x12fa60, 0x0, 0x161950, ...)
        /Users/rsc/g/go/src/pkg/flag/flag.go:570 +0x6b fp=0x221031ed80
testing.init()
        /Users/rsc/g/go/src/pkg/testing/testing.go:-531 +0xbb fp=0x221031edc0
strings_test.init()
        /Users/rsc/g/go/src/pkg/strings/strings_test.go:1115 +0x62 fp=0x221031ef70
main.init()
        strings/_test/_testmain.go:90 +0x3d fp=0x221031ef78
runtime.main()
        /Users/rsc/g/go/src/pkg/runtime/proc.c:180 +0x8a fp=0x221031efa0
runtime.goexit()
        /Users/rsc/g/go/src/pkg/runtime/proc.c:1269 fp=0x221031efa8

goroutine 2 [runnable]:
runtime.MHeap_Scavenger()
        /Users/rsc/g/go/src/pkg/runtime/mheap.c:438
runtime.goexit()
        /Users/rsc/g/go/src/pkg/runtime/proc.c:1269
created by runtime.main
        /Users/rsc/g/go/src/pkg/runtime/proc.c:166

rax     0x23ccc0
rbx     0x23ccc0
rcx     0x0
rdx     0x38
rdi     0x2102c0170
rsi     0x221032cfe0
rbp     0x221032cfa0
rsp     0x7fff5fbff5b0
r8      0x2102c0120
r9      0x221032cfa0
r10     0x221032c000
r11     0x104ce8
r12     0xe5c80
r13     0x1be82baac718
r14     0x13091135f7d69200
r15     0x0
rip     0x1b2a6
rflags  0x10246
cs      0x2b
fs      0x0
gs      0x0

Fixes #5723.

R=r, dvyukov, go.peter.90, dave, iant
CC=golang-dev
https://golang.org/cl/10360048
2013-06-27 11:32:01 -04:00
Dmitriy Vyukov
5caf762457 runtime: remove unused moreframesize_minalloc field
It was used to request large stack segment for GC
when it was running not on g0.
Now GC is running on g0 with large stack,
and it is not needed anymore.

R=golang-dev, dave
CC=golang-dev
https://golang.org/cl/10242045
2013-06-15 16:02:39 +04:00
Russ Cox
d67e7e3acf runtime: add lr, ctxt, ret to Gobuf
Add gostartcall and gostartcallfn.
The old gogocall = gostartcall + gogo.
The old gogocallfn = gostartcallfn + gogo.

R=dvyukov, minux.ma
CC=golang-dev
https://golang.org/cl/10036044
2013-06-12 15:22:26 -04:00
Russ Cox
e58f798c0c runtime: adjust traceback / garbage collector boundary
The garbage collection routine addframeroots is duplicating
logic in the traceback routine that calls it, sometimes correctly,
sometimes incorrectly, sometimes incompletely.
Pass necessary information to addframeroots instead of
deriving it anew.

Should make addframeroots significantly more robust.
It's certainly smaller.

Also try to standardize on uintptr for saved pc, sp values.

Will make CL 10036044 trivial.

R=golang-dev, dave, dvyukov
CC=golang-dev
https://golang.org/cl/10169045
2013-06-12 08:49:38 -04:00
Dmitriy Vyukov
f5becf4233 runtime: add stackguard0 to G
This is part of preemptive scheduler.
stackguard0 is checked in split stack checks and can be set to StackPreempt.
stackguard is not set to StackPreempt (holds the original value).

R=golang-dev, daniel.morsing, iant
CC=golang-dev
https://golang.org/cl/9875043
2013-06-03 12:28:24 +04:00