Profiling of multithreaded applications works correctly on OpenBSD
5.4-current, so enable the profiling test.
R=golang-codereviews, minux.ma
CC=golang-codereviews
https://golang.org/cl/50940043
Two bugs:
1. The first iteration of the traceback always uses LR when provided,
which it is (only) during a profiling signal, but in fact LR is correct
only if the stack frame has not been allocated yet. Otherwise an
intervening call may have changed LR, and the saved copy in the stack
frame should be used. Fix in traceback_arm.c.
2. The division runtime call adds 8 bytes to the stack. In order to
keep the traceback routines happy, it must copy the saved LR into
the new 0(SP). Change

	SUB $8, SP

into

	MOVW 0(SP), R11 // r11 is temporary, for use by linker
	MOVW.W R11, -8(SP)

to update SP and 0(SP) atomically, so that the traceback always
sees a saved LR at 0(SP).
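The rule from the first bug can be sketched as a small C model. This is only an illustration under stated assumptions: the SigState type, its fields, and first_return_pc are hypothetical names invented here, not the code in traceback_arm.c, and the real runtime has to work out from the function's metadata whether the frame has been allocated.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical, simplified model of the state seen by a profiling
 * signal handler on ARM. None of these names come from the Go runtime.
 */
typedef struct SigState SigState;
struct SigState {
	uintptr_t lr;        /* LR register when the signal arrived */
	const uintptr_t *sp; /* stack pointer when the signal arrived */
	int frame_allocated; /* nonzero once LR has been saved at 0(SP) */
};

/* Pick the return address for the first (innermost) frame of a traceback. */
static uintptr_t
first_return_pc(const SigState *s)
{
	if(s->frame_allocated)
		return s->sp[0]; /* LR may have been changed by a call; use the saved copy */
	return s->lr;        /* frame not set up yet; LR still holds the return address */
}

/* Usage check covering the two cases. */
static void
check_first_return_pc(void)
{
	uintptr_t caller_stack[1] = { 0xdead }; /* not a saved LR */
	uintptr_t callee_stack[1] = { 0x1000 }; /* saved LR at 0(SP) */
	SigState before = { 0x1000, caller_stack, 0 };
	SigState after = { 0x2468, callee_stack, 1 };

	assert(first_return_pc(&before) == 0x1000);
	assert(first_return_pc(&after) == 0x1000);
}
```

In both cases the traceback resumes at the same caller address, even though the LR register differs in the second case.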
Fixes #6681.
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/19910044
The CL causes misc/cgo/test to fail randomly.
I suspect that the problem is the use of a division instruction
in usleep, which can be called while trying to acquire an m
and therefore cannot store the denominator in m.
The solution to that would be to rewrite the code to use a
magic multiply instead of a divide, but now we're getting
pretty far off the original code.
Go back to the original in preparation for a different,
less efficient but simpler fix.
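For reference, a magic-multiply replacement for a divide could look like the sketch below. It assumes the division in question is a 32-bit unsigned divide by 1000000 (microseconds to seconds); the function names are hypothetical and this is not the runtime's usleep. The constant 0x431BDE83 is ceil(2^50 / 1000000), which gives the exact quotient for every 32-bit unsigned input.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Divide a 32-bit unsigned value by 1000000 with a multiply and a shift.
 * 0x431BDE83 == ceil(2^50 / 1000000). The 64-bit product cannot overflow
 * because both factors are below 2^32.
 */
static uint32_t
div1e6(uint32_t x)
{
	return (uint32_t)(((uint64_t)x * 0x431BDE83u) >> 50);
}

/* Split microseconds into whole seconds and leftover nanoseconds. */
static void
usec_split(uint32_t usec, uint32_t *sec, uint32_t *nsec)
{
	*sec = div1e6(usec);
	*nsec = (usec - *sec * 1000000u) * 1000u;
}

/* Usage check against the ordinary divide at the interesting boundaries. */
static void
check_div1e6(void)
{
	assert(div1e6(0) == 0);
	assert(div1e6(999999u) == 0);
	assert(div1e6(1000000u) == 1);
	assert(div1e6(1999999u) == 1);
	assert(div1e6(0xFFFFFFFFu) == 0xFFFFFFFFu / 1000000u);
}
```

No divide instruction or runtime division helper is needed, so nothing has to be stored in m.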
««« original CL description
cmd/5l, runtime: make ARM integer division profiler-friendly
The implementation of division constructed non-standard
stack frames that could not be handled by the traceback
routines.
CL 13239052 left the frames non-standard but fixed them
for the specific case of a divide-by-zero panic.
A profiling signal can arrive at any time, so that fix
is not sufficient.
Change the division to store the extra argument in the M struct
instead of in a new stack slot. That keeps the frames bog standard
at all times.
Also fix a related bug in the traceback code: when starting
a traceback, the LR register should be ignored if the current
function has already allocated its stack frame and saved the
original LR on the stack. The stack copy should be used, as the
LR register may have been modified.
Combined, these make the torture test from issue 6681 pass.
Fixes #6681.
R=golang-dev, r, josharian
CC=golang-dev
https://golang.org/cl/19810043
»»»
TBR=r
CC=golang-dev
https://golang.org/cl/20350043
The implementation of division constructed non-standard
stack frames that could not be handled by the traceback
routines.
CL 13239052 left the frames non-standard but fixed them
for the specific case of a divide-by-zero panic.
A profiling signal can arrive at any time, so that fix
is not sufficient.
Change the division to store the extra argument in the M struct
instead of in a new stack slot. That keeps the frames bog standard
at all times.
Also fix a related bug in the traceback code: when starting
a traceback, the LR register should be ignored if the current
function has already allocated its stack frame and saved the
original LR on the stack. The stack copy should be used, as the
LR register may have been modified.
Combined, these make the torture test from issue 6681 pass.
Fixes #6681.
R=golang-dev, r, josharian
CC=golang-dev
https://golang.org/cl/19810043
Makes build unnecessarily slower. Will fix the parser instead.
««« original CL description
runtime/pprof: run TestGoroutineSwitch for longer
Short test now takes about 0.5 second here.
Fixes #6417.
The failure was also seen on our builders.
R=golang-dev, minux.ma, r
CC=golang-dev
https://golang.org/cl/13321048
»»»
R=golang-dev, minux.ma
CC=golang-dev
https://golang.org/cl/13720048
Short test now takes about 0.5 second here.
Fixes #6417.
The failure was also seen on our builders.
R=golang-dev, minux.ma, r
CC=golang-dev
https://golang.org/cl/13321048
Because profiling signals can arrive at any time, we must
handle the case where a profiling signal arrives halfway
through a goroutine switch. Luckily, although there is much
to think through, very little needs to change.
Fixes #6000.
Fixes #6015.
R=golang-dev, dvyukov
CC=golang-dev
https://golang.org/cl/13421048
The NetBSD and OpenBSD failures are apparently real,
not due to the test bug fixed in 100b9fc0c46f.
««« original CL description
runtime/pprof: test netbsd and openbsd again
Maybe these will work now.
R=golang-dev, dvyukov, bradfitz
CC=golang-dev
https://golang.org/cl/12787044
»»»
R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/12873043
NetBSD and OpenBSD are broken like OS X is. Good to know.
Drop required count from avg/2 to avg/3, because the
Plan 9 builder just barely missed avg/2 in one of its runs.
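The threshold can be illustrated with a short C sketch. It assumes the test counts samples per expected function and requires each count to reach a fraction of the average; the function name and data layout are hypothetical, not the test's own code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Report whether every function received at least avg/3 samples,
 * where avg is the mean sample count over the n expected functions.
 * Hypothetical helper for illustration only.
 */
static int
enough_samples(const unsigned long *have, size_t n)
{
	unsigned long total = 0, min;
	size_t i;

	if(n == 0)
		return 0;
	for(i = 0; i < n; i++)
		total += have[i];
	min = total / n / 3; /* was avg/2 before this change */
	for(i = 0; i < n; i++)
		if(have[i] < min)
			return 0;
	return 1;
}

/* Usage check: 20 of 100 samples passes avg/3 (16) but would fail avg/2 (25). */
static void
check_enough_samples(void)
{
	unsigned long close_call[2] = { 20, 80 };
	unsigned long starved[2] = { 10, 90 };

	assert(enough_samples(close_call, 2) == 1);
	assert(enough_samples(starved, 2) == 0);
}
```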
R=golang-dev, dvyukov
CC=golang-dev
https://golang.org/cl/12548043
This means that pprof will no longer report profiles on OS X.
That's unfortunate, but the profiles were often wrong and, worse,
it was difficult to tell whether the profile was wrong or not.
The workarounds made the scheduler more complex,
possibly caused a deadlock (see issue 5519), and did not actually
deliver reliable results.
It may be possible for adventurous users to apply a patch to
their kernels to get working results, or perhaps having no results
will encourage someone to do the work of creating a profiling
thread like on Windows. Issue 6047 has details.
Fixes #5519.
Fixes #6047.
R=golang-dev, bradfitz, r
CC=golang-dev
https://golang.org/cl/12429045
until we decide what to do with issues 5659/5736.
Profiling with the race detector is not very useful in general,
and now it makes the race builders red.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/10523043
Work around profiling kernel bug with signal masks.
Still broken on 64-bit Snow Leopard kernel,
but I think we can ignore that one and let people
upgrade to Lion.
Add new trivial tools addr2line and objdump to take
the place of the GNU tools of the same name, since
those are not installed on OS X.
Adapt pprof to invoke 'go tool addr2line' and
'go tool objdump' if the system tools do not exist.
Clean up disassembly of base register on amd64.
Fixes #2008.
R=golang-dev, bradfitz, mikioh.mikioh, r, iant
CC=golang-dev
https://golang.org/cl/5697066
Fixes #1641.
Actually it sidesteps the real issue, which is that the
setitimer(2) implementation on OS X is not useful for
profiling of multi-threaded programs. I filed the report below
using the Apple Bug Reporter.
/*
Filed as Apple Bug Report #9177434.
This program creates a new pthread that loops, wasting cpu time.
In the main pthread, it sleeps on a condition that will never come true.
Before doing so it sets up an interval timer using ITIMER_PROF.
The handler prints a message saying which thread it is running on.
POSIX does not specify which thread should receive the signal, but
in order to be useful in a user-mode self-profiler like pprof or gprof
http://code.google.com/p/google-perftools
http://www.delorie.com/gnu/docs/binutils/gprof_25.html
it is important that the thread that receives the signal is the one
whose execution caused the timer to expire.
Linux and FreeBSD handle this by sending the signal to the process's
queue but delivering it to the current thread if possible:
http://lxr.linux.no/linux+v2.6.38/kernel/signal.c#L802
807 /*
808 * Now find a thread we can wake up to take the signal off the queue.
809 *
810 * If the main thread wants the signal, it gets first crack.
811 * Probably the least surprising to the average bear.
812 * /
http://fxr.watson.org/fxr/source/kern/kern_sig.c?v=FREEBSD8;im=bigexcerpts#L1907
1914 /*
1915 * Check if current thread can handle the signal without
1916 * switching context to another thread.
1917 * /
On those operating systems, this program prints:
$ ./a.out
signal on cpu-chewing looper thread
signal on cpu-chewing looper thread
signal on cpu-chewing looper thread
signal on cpu-chewing looper thread
signal on cpu-chewing looper thread
signal on cpu-chewing looper thread
signal on cpu-chewing looper thread
signal on cpu-chewing looper thread
signal on cpu-chewing looper thread
signal on cpu-chewing looper thread
$
The OS X kernel does not have any such preference. Its get_signalthread
does not prefer current_thread(), in contrast to the other two systems,
so the signal gets delivered to the first thread in the list that is able to
handle it, which ends up being the main thread in this experiment.
http://fxr.watson.org/fxr/source/bsd/kern/kern_sig.c?v=xnu-1456.1.26;im=excerpts#L1666
$ ./a.out
signal on sleeping main thread
signal on sleeping main thread
signal on sleeping main thread
signal on sleeping main thread
signal on sleeping main thread
signal on sleeping main thread
signal on sleeping main thread
signal on sleeping main thread
signal on sleeping main thread
signal on sleeping main thread
$
The fix is to make get_signalthread use the same heuristic as
Linux and FreeBSD, namely to use current_thread() if possible
before scanning the process thread list.
*/
#include <sys/time.h>
#include <sys/signal.h>
#include <pthread.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

static void handler(int);
static void* looper(void*);

static pthread_t pmain, ploop;

int
main(void)
{
	struct itimerval it;
	struct sigaction sa;
	pthread_cond_t cond;
	pthread_mutex_t mu;

	memset(&sa, 0, sizeof sa);
	sa.sa_handler = handler;
	sa.sa_flags = SA_RESTART;
	memset(&sa.sa_mask, 0xff, sizeof sa.sa_mask);
	sigaction(SIGPROF, &sa, 0);

	pmain = pthread_self();
	pthread_create(&ploop, 0, looper, 0);

	memset(&it, 0, sizeof it);
	it.it_interval.tv_usec = 10000;
	it.it_value = it.it_interval;
	setitimer(ITIMER_PROF, &it, 0);

	pthread_mutex_init(&mu, 0);
	pthread_mutex_lock(&mu);

	pthread_cond_init(&cond, 0);
	for(;;)
		pthread_cond_wait(&cond, &mu);

	return 0;
}

static void
handler(int sig)
{
	static int nsig;
	pthread_t p;

	p = pthread_self();
	if(p == pmain)
		printf("signal on sleeping main thread\n");
	else if(p == ploop)
		printf("signal on cpu-chewing looper thread\n");
	else
		printf("signal on %p\n", (void*)p);
	if(++nsig >= 10)
		exit(0);
}

static void*
looper(void *v)
{
	for(;;);
}
R=r
CC=golang-dev
https://golang.org/cl/4273113