There are 3 issues:
1. Skip argument of callers is off by 3,
so that all allocations are deep inside of memory profiler.
2. Memory profiling statistics are not updated after runtime.GC.
3. Testing package does not update memory profiling statistics
before capturing the profile.
Also add an end-to-end test.
Fixes#8867.
LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/148710043
For -mode=atomic, we need to read the counters
using an atomic load to avoid a race. Not worth worrying
about when -mode=atomic is set during generation
of the profile, so we use atomic loads always.
Fixes#8630.
LGTM=rsc
R=dvyukov, rsc
CC=golang-codereviews
https://golang.org/cl/141800043