Currently when we compute the trigger for the next GC, we do it based
on an estimate of the reachable heap size at the start of the GC
cycle, which is itself based on an estimate of the floating garbage.
This was introduced by 4655aad to fix a bad feedback loop that allowed
the heap to grow to many times the true reachable size.
However, this estimate gets easily confused by rapidly allocating
applications and, worse, it's different from the heap size the trigger
controller uses to compute the trigger itself. This results in the
trigger controller often thinking that GC finished before it started.
Since this would be a pretty great outcome from its perspective, it
sets the trigger for the next cycle as close to the next goal as
possible (which is limited to 95% of the goal).
Furthermore, the bad feedback loop this estimate originally fixed
seems not to happen any more, suggesting it was fixed more correctly
by some other change in the meantime. Finally, with the change to
allocate black, it shouldn't even be theoretically possible for this
bad feedback loop to occur.
Hence, eliminate the floating garbage estimate and simply consider the
reachable heap to be the marked heap. This harms overall throughput
slightly for allocation-heavy benchmarks, but significantly improves
mutator availability.
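
As a rough illustration of the arithmetic described above (not the
runtime's actual controller), the sketch below treats the marked heap
as the reachable heap and caps the trigger at 95% of the allowed
growth. The helper name nextCycle and the use of integer percentages
are assumptions made to keep the example exact and self-contained.

```go
package main

import "fmt"

// nextCycle is a hypothetical helper, not the runtime's code. It takes the
// marked heap as the reachable heap and returns the heap goal and the heap
// trigger for the next cycle. Ratios are integer percentages of the marked
// heap so that the arithmetic is exact.
func nextCycle(marked, gogc, triggerPct uint64) (goal, trigger uint64) {
	// Assumption: the trigger may use at most 95% of the allowed growth.
	if limit := gogc * 95 / 100; triggerPct > limit {
		triggerPct = limit
	}
	goal = marked + marked*gogc/100
	trigger = marked + marked*triggerPct/100
	return goal, trigger
}

func main() {
	// A confused controller asks for a huge trigger ratio and hits the cap.
	goal, trigger := nextCycle(1000, 100, 200)
	fmt.Println(goal, trigger)

	// A better behaved controller settles on a lower trigger.
	goal, trigger = nextCycle(1000, 100, 70)
	fmt.Println(goal, trigger)
}
```

With GOGC=100 the cap corresponds to a trigger ratio of 0.95, which is
the value the benchmark below was stuck at before this change.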
Fixes #12204. This brings the average trigger in this benchmark from
0.95 (the cap) to 0.7 and the active GC utilization from ~90% to ~45%.
Updates #14951. This makes the trigger controller much better behaved,
so it pulls the trigger lower if assists are consuming a lot of CPU
like it's supposed to, increasing mutator availability.
name              old time/op  new time/op  delta
XBenchGarbage-12  2.21ms ± 1%  2.28ms ± 3%  +3.29%  (p=0.000 n=17+17)
Some of this slow down we paid for in earlier commits. Relative to the
start of the series to switch to allocate-black (the parent of "count
black allocations toward scan work"), the garbage benchmark is 2.62%
slower.
name                      old time/op  new time/op  delta
BinaryTree17-12           2.53s ± 3%   2.53s ± 3%   ~       (p=0.708 n=20+19)
Fannkuch11-12             2.08s ± 0%   2.08s ± 0%   -0.22%  (p=0.002 n=19+18)
FmtFprintfEmpty-12        45.3ns ± 2%  45.2ns ± 3%  ~       (p=0.505 n=20+20)
FmtFprintfString-12       129ns ± 0%   131ns ± 2%   +1.80%  (p=0.000 n=16+19)
FmtFprintfInt-12          121ns ± 2%   121ns ± 2%   ~       (p=0.768 n=19+19)
FmtFprintfIntInt-12       186ns ± 1%   188ns ± 3%   +0.99%  (p=0.000 n=19+19)
FmtFprintfPrefixedInt-12  188ns ± 1%   188ns ± 1%   ~       (p=0.947 n=18+16)
FmtFprintfFloat-12        254ns ± 1%   255ns ± 1%   +0.30%  (p=0.002 n=19+17)
FmtManyArgs-12            763ns ± 0%   770ns ± 0%   +0.92%  (p=0.000 n=18+18)
GobDecode-12              7.00ms ± 1%  7.04ms ± 1%  +0.61%  (p=0.049 n=20+20)
GobEncode-12              5.88ms ± 1%  5.88ms ± 0%  ~       (p=0.641 n=18+19)
Gzip-12                   214ms ± 1%   215ms ± 1%   +0.43%  (p=0.002 n=18+19)
Gunzip-12                 37.6ms ± 0%  37.6ms ± 0%  +0.11%  (p=0.015 n=17+18)
HTTPClientServer-12       76.9µs ± 2%  78.1µs ± 2%  +1.44%  (p=0.000 n=20+18)
JSONEncode-12             15.2ms ± 2%  15.1ms ± 1%  ~       (p=0.271 n=19+18)
JSONDecode-12             53.1ms ± 1%  53.3ms ± 0%  +0.49%  (p=0.000 n=18+19)
Mandelbrot200-12          4.04ms ± 1%  4.03ms ± 0%  -0.33%  (p=0.005 n=18+18)
GoParse-12                3.29ms ± 1%  3.28ms ± 1%  ~       (p=0.146 n=16+17)
RegexpMatchEasy0_32-12    69.9ns ± 3%  69.5ns ± 1%  ~       (p=0.785 n=20+19)
RegexpMatchEasy0_1K-12    237ns ± 0%   237ns ± 0%   ~       (p=1.000 n=18+18)
RegexpMatchEasy1_32-12    69.5ns ± 1%  69.2ns ± 1%  -0.44%  (p=0.020 n=16+19)
RegexpMatchEasy1_1K-12    372ns ± 1%   371ns ± 2%   ~       (p=0.086 n=20+19)
RegexpMatchMedium_32-12   108ns ± 3%   107ns ± 1%   -1.00%  (p=0.004 n=19+14)
RegexpMatchMedium_1K-12   34.2µs ± 4%  34.0µs ± 2%  ~       (p=0.380 n=19+20)
RegexpMatchHard_32-12     1.77µs ± 4%  1.76µs ± 3%  ~       (p=0.558 n=18+20)
RegexpMatchHard_1K-12     53.4µs ± 4%  52.8µs ± 2%  -1.10%  (p=0.020 n=18+20)
Revcomp-12                359ms ± 4%   377ms ± 0%   +5.19%  (p=0.000 n=20+18)
Template-12               63.7ms ± 2%  62.9ms ± 2%  -1.27%  (p=0.005 n=18+20)
TimeParse-12              316ns ± 2%   313ns ± 1%   ~       (p=0.059 n=20+16)
TimeFormat-12             329ns ± 0%   331ns ± 0%   +0.39%  (p=0.000 n=16+18)
[Geo mean]                51.6µs       51.7µs       +0.18%
Change-Id: I1dce4640c8205d41717943b021039fffea863c57
Reviewed-on: https://go-review.googlesource.com/21324
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Currently we allocate white for most of concurrent marking. This is
based on the classical argument that it produces less floating
garbage, since allocations during GC may not get linked into the heap
and allocating white lets us reclaim these. However, it's not clear
how often this actually happens, especially since our write barrier
shades any pointer as soon as it's installed in the heap regardless of
the color of the slot.
On the other hand, allocating black has several advantages that seem
to significantly outweigh this downside.
1) It naturally bounds the total scan work to the live heap size at
the start of a GC cycle. Allocating white does not, and thus depends
entirely on assists to prevent the heap from growing faster than it
can be scanned.
2) It reduces the total amount of scan work per GC cycle by the size
of newly allocated objects that are linked into the heap graph, since
objects allocated black never need to be scanned.
3) It reduces total write barrier work since more objects will already
be black when they are linked into the heap graph.
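
The first two advantages can be seen in a toy tri-color marker. This
is only a sketch under simplifying assumptions (a single chain of live
objects, one allocation per scan step, a write barrier that shades the
installed pointer); the types and the scanWork function are
hypothetical and unrelated to the runtime's real data structures.

```go
package main

import "fmt"

type object struct {
	marked bool
	refs   []*object
}

type collector struct {
	allocBlack bool
	grey       []*object // objects that are marked but not yet scanned
	scanned    int       // scan work, counted in objects
}

// shade marks an unmarked object and queues it for scanning.
func (c *collector) shade(o *object) {
	if !o.marked {
		o.marked = true
		c.grey = append(c.grey, o)
	}
}

// alloc creates an object during marking. A black object is marked without
// ever being queued, so it is never scanned.
func (c *collector) alloc() *object {
	return &object{marked: c.allocBlack}
}

// link installs a pointer; the write barrier shades the new target first.
func (c *collector) link(parent, child *object) {
	c.shade(child)
	parent.refs = append(parent.refs, child)
}

// scanOne scans one grey object and reports whether there was any work.
func (c *collector) scanOne() bool {
	if len(c.grey) == 0 {
		return false
	}
	o := c.grey[0]
	c.grey = c.grey[1:]
	c.scanned++
	for _, r := range o.refs {
		c.shade(r)
	}
	return true
}

// scanWork marks a chain of live objects while the mutator allocates and
// links one new object per scan step, up to allocs objects in total.
func scanWork(allocBlack bool, live, allocs int) int {
	c := &collector{allocBlack: allocBlack}
	root := &object{}
	tail := root
	for i := 1; i < live; i++ {
		next := &object{}
		tail.refs = append(tail.refs, next)
		tail = next
	}
	c.shade(root)
	for c.scanOne() {
		if allocs > 0 {
			allocs--
			c.link(root, c.alloc())
		}
	}
	return c.scanned
}

func main() {
	fmt.Println("allocate white:", scanWork(false, 10, 5))
	fmt.Println("allocate black:", scanWork(true, 10, 5))
}
```

In this model, allocating black keeps scan work equal to the heap that
was live at the start of the cycle, while allocating white adds every
newly linked object to it.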
This gives a slight overall improvement in benchmarks.
name              old time/op  new time/op  delta
XBenchGarbage-12  2.24ms ± 0%  2.21ms ± 1%  -1.32%  (p=0.000 n=18+17)
name                      old time/op  new time/op  delta
BinaryTree17-12           2.60s ± 3%   2.53s ± 3%   -2.56%  (p=0.000 n=20+20)
Fannkuch11-12             2.08s ± 1%   2.08s ± 0%   ~       (p=0.452 n=19+19)
FmtFprintfEmpty-12        45.1ns ± 2%  45.3ns ± 2%  ~       (p=0.367 n=19+20)
FmtFprintfString-12       131ns ± 3%   129ns ± 0%   -1.60%  (p=0.000 n=20+16)
FmtFprintfInt-12          122ns ± 0%   121ns ± 2%   -0.86%  (p=0.000 n=16+19)
FmtFprintfIntInt-12       187ns ± 1%   186ns ± 1%   ~       (p=0.514 n=18+19)
FmtFprintfPrefixedInt-12  189ns ± 0%   188ns ± 1%   -0.54%  (p=0.000 n=16+18)
FmtFprintfFloat-12        256ns ± 0%   254ns ± 1%   -0.43%  (p=0.000 n=17+19)
FmtManyArgs-12            769ns ± 0%   763ns ± 0%   -0.72%  (p=0.000 n=18+18)
GobDecode-12              7.08ms ± 2%  7.00ms ± 1%  -1.22%  (p=0.000 n=20+20)
GobEncode-12              5.88ms ± 0%  5.88ms ± 1%  ~       (p=0.406 n=18+18)
Gzip-12                   214ms ± 0%   214ms ± 1%   ~       (p=0.103 n=17+18)
Gunzip-12                 37.6ms ± 0%  37.6ms ± 0%  ~       (p=0.563 n=17+17)
HTTPClientServer-12       77.2µs ± 3%  76.9µs ± 2%  ~       (p=0.606 n=20+20)
JSONEncode-12             15.1ms ± 1%  15.2ms ± 2%  ~       (p=0.138 n=19+19)
JSONDecode-12             53.3ms ± 1%  53.1ms ± 1%  -0.33%  (p=0.000 n=19+18)
Mandelbrot200-12          4.04ms ± 1%  4.04ms ± 1%  ~       (p=0.075 n=19+18)
GoParse-12                3.30ms ± 1%  3.29ms ± 1%  -0.57%  (p=0.000 n=18+16)
RegexpMatchEasy0_32-12    69.5ns ± 1%  69.9ns ± 3%  ~       (p=0.822 n=18+20)
RegexpMatchEasy0_1K-12    237ns ± 1%   237ns ± 0%   ~       (p=0.398 n=19+18)
RegexpMatchEasy1_32-12    69.8ns ± 2%  69.5ns ± 1%  ~       (p=0.090 n=20+16)
RegexpMatchEasy1_1K-12    371ns ± 1%   372ns ± 1%   ~       (p=0.178 n=19+20)
RegexpMatchMedium_32-12   108ns ± 2%   108ns ± 3%   ~       (p=0.124 n=20+19)
RegexpMatchMedium_1K-12   33.9µs ± 2%  34.2µs ± 4%  ~       (p=0.309 n=20+19)
RegexpMatchHard_32-12     1.75µs ± 2%  1.77µs ± 4%  +1.28%  (p=0.018 n=19+18)
RegexpMatchHard_1K-12     52.7µs ± 1%  53.4µs ± 4%  +1.23%  (p=0.013 n=15+18)
Revcomp-12                354ms ± 1%   359ms ± 4%   +1.27%  (p=0.043 n=20+20)
Template-12               63.6ms ± 2%  63.7ms ± 2%  ~       (p=0.654 n=20+18)
TimeParse-12              313ns ± 1%   316ns ± 2%   +0.80%  (p=0.014 n=17+20)
TimeFormat-12             332ns ± 0%   329ns ± 0%   -0.66%  (p=0.000 n=16+16)
[Geo mean]                51.7µs       51.6µs       -0.09%
Change-Id: I2214a6a0e4f544699ea166073249a8efdf080dc0
Reviewed-on: https://go-review.googlesource.com/21323
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Currently allocating black switches to the system stack (which is
probably a historical accident) and atomically updates the global
bytes marked stat. Since we're about to depend on this much more,
optimize it a bit by putting it back on the regular stack and updating
the per-P bytes marked stat, which gets lazily folded into the global
bytes marked stat.
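
The per-P counter pattern described above can be sketched roughly as
follows. The perP type, its methods, and the simulate driver are
hypothetical names for illustration; the point is only that the hot
path touches unshared memory and the atomic update happens once per
flush rather than once per allocation.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// globalBytesMarked stands in for the global stat; it is only updated
// atomically, and only when a per-P counter is flushed.
var globalBytesMarked uint64

// perP holds the counter owned by a single worker, so updating it needs no
// synchronization.
type perP struct {
	bytesMarked uint64
}

// allocBlack records a black allocation in the local counter.
func (p *perP) allocBlack(size uint64) {
	p.bytesMarked += size
}

// flush lazily folds the local counter into the global one.
func (p *perP) flush() {
	atomic.AddUint64(&globalBytesMarked, p.bytesMarked)
	p.bytesMarked = 0
}

// simulate runs nP workers that each perform allocs black allocations of the
// given size and returns the resulting global total.
func simulate(nP, allocs int, size uint64) uint64 {
	atomic.StoreUint64(&globalBytesMarked, 0)
	var wg sync.WaitGroup
	for i := 0; i < nP; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			var p perP
			for j := 0; j < allocs; j++ {
				p.allocBlack(size)
			}
			p.flush()
		}()
	}
	wg.Wait()
	return atomic.LoadUint64(&globalBytesMarked)
}

func main() {
	fmt.Println(simulate(4, 1000, 16))
}
```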
Change-Id: Ibbe16e5382d3fd2256e4381f88af342bf7020b04
Reviewed-on: https://go-review.googlesource.com/22170
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Currently we count black allocations toward the scannable heap size,
but not toward the scan work we've done so far. This is clearly
inconsistent (we have, in effect, scanned these allocations and since
they're already black, we're not going to scan them again). Worse, it
means we don't count black allocations toward the scannable heap size
as of the *next* GC because this is based on the amount of scan work
we did in this cycle.
Fix this by counting black allocations as scan work. Currently the GC
spends very little time in allocate-black mode, so this probably
hasn't been a problem, but this will become important when we switch
to always allocating black.
Change-Id: If6ff693b070c385b65b6ecbbbbf76283a0f9d990
Reviewed-on: https://go-review.googlesource.com/22119
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Consistently use type int for the size argument of
runtime.newarray, runtime.reflect_unsafe_NewArray
and reflect.unsafe_NewArray.
Change-Id: Ic77bf2dde216c92ca8c49462f8eedc0385b6314e
Reviewed-on: https://go-review.googlesource.com/22311
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Martin Möhrmann <martisch@uos.de>
TryBot-Result: Gobot Gobot <gobot@golang.org>
mapaccess{1,2} returns a pointer to the value. When the key
is not in the map, it returns a pointer to zeroed memory.
Currently, for large map values we have a complicated scheme which
dynamically allocates zeroed memory for this purpose. It is ugly
code and requires an atomic.Load in a bunch of places we'd rather
not have it.
Switch to a scheme where callsites of mapaccess{1,2} which expect
large return values pass in a pointer to zeroed memory that
mapaccess can return if the key is not found. This avoids the
atomic.Load on all map accesses with a few extra instructions only
for the large value accesses, plus a bit of bss space.
There was a time (1.4 & 1.5?) where we did something like this but
all the tricks to make the right size zero value were done by the
linker. That scheme broke in the presence of dynamic linking.
The scheme in this CL works even when dynamic linking.
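
In ordinary Go terms, the idea resembles the sketch below: the caller
owns a statically allocated zero value and hands its address to the
lookup, which returns it on a miss. The access function and bigValue
type are hypothetical stand-ins, not the compiler-generated call or
the runtime's mapaccess signature.

```go
package main

import "fmt"

// bigValue stands in for a map value type large enough to need its own
// zeroed storage.
type bigValue [1024]byte

// zeroBigValue is statically allocated zeroed memory, supplied by the
// call site rather than created dynamically by the lookup.
var zeroBigValue bigValue

// access returns a pointer to the stored value, or the caller-supplied zero
// pointer when the key is absent. No allocation or atomic load is needed on
// the miss path.
func access(m map[string]*bigValue, key string, zero *bigValue) *bigValue {
	if v, ok := m[key]; ok {
		return v
	}
	return zero
}

func main() {
	m := map[string]*bigValue{"a": &bigValue{1, 2, 3}}
	fmt.Println(access(m, "a", &zeroBigValue)[0])
	fmt.Println(access(m, "missing", &zeroBigValue) == &zeroBigValue)
}
```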
Fixes #12337
Change-Id: Ic2d0319944af33bbb59785938d9ab80958d1b4b1
Reviewed-on: https://go-review.googlesource.com/22221
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
mallocgc can calculate noscan itself. The only remaining
flag argument is needzero, so we just make that a boolean arg.
Fixes #15379
Change-Id: I839a70790b2a0c9dbcee2600052bfbd6c8148e20
Reviewed-on: https://go-review.googlesource.com/22290
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
No point in passing the slice type to these functions.
All they need is the element type. One less indirection,
maybe a few less []T type descriptors in the binary.
Change-Id: Ib0b83b5f14ca21d995ecc199ce8ac00c4eb375e6
Reviewed-on: https://go-review.googlesource.com/22275
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
The extra checks provided by newarray are
redundant in these cases.
This shrinks by one frame the call stack expected
by the pprof test.
name            old time/op  new time/op  delta
MakeSlice-8     34.3ns ± 2%  30.5ns ± 3%  -11.03%  (p=0.000 n=24+22)
GrowSlicePtr-8  134ns ± 2%   129ns ± 3%   -3.25%   (p=0.000 n=25+24)
Change-Id: Icd828655906b921c732701fd9d61da3fa217b0af
Reviewed-on: https://go-review.googlesource.com/22276
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
On GNU/Linux, SIGSYS is specified to cause the process to terminate
without a core dump. In https://codereview.appspot.com/3749041 , it
appears that Golang accidentally introduced incorrect behavior for
this signal, which caused Golang processes to keep running after
receiving SIGSYS. This change reverts it to the old/correct behavior.
Updates #15204
Change-Id: I3aa48a9499c1bc36fa5d3f40c088fdd7599e0db5
Reviewed-on: https://go-review.googlesource.com/22202
Reviewed-by: Ian Lance Taylor <iant@golang.org>
We now inline type to interface conversions when the type
is pointer-shaped. No need to keep code to handle that in
convT2{I,E}.
Change-Id: I3a6668259556077cbb2986a9e8fe42a625d506c9
Reviewed-on: https://go-review.googlesource.com/22249
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michel Lespinasse <walken@google.com>
Introduce and start using nameOff for two encoded names. This pair
of changes is best done together because the linker's method decoder
expects the method layouts to match.
Precursor to converting all existing name and *string fields to
nameOff.
linux/amd64:
cmd/go: -45KB (0.5%)
jujud: -389KB (0.6%)
linux/amd64 PIE:
cmd/go: -170KB (1.4%)
jujud: -1.5MB (1.8%)
For #6853.
Change-Id: Ia044423f010fb987ce070b94c46a16fc78666ff6
Reviewed-on: https://go-review.googlesource.com/21396
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Currently the scavenger marks memory unused in multiples of the
allocator page size (8K). This is safe as long as the true physical
page size is 4K (or 8K), as it is on many platforms. However, on
ARM64, PPC64x, and MIPS64, the physical page size is larger than 8K,
so if we attempt to mark memory unused, the kernel will round the
boundaries of the region *out* to all pages covered by the requested
region, and we'll release a larger region of memory than intended. As
a result, the scavenger is currently disabled on these platforms.
Fix this by first rounding the region to be marked unused *in* to
multiples of the physical page size, so that when we ask the kernel to
mark it unused, it releases exactly the requested region.
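
A minimal sketch of the inward rounding, assuming the physical page
size is a power of two. The function name roundIn is illustrative and
is not taken from the runtime.

```go
package main

import "fmt"

// roundIn shrinks the half-open region [start, end) so that both boundaries
// fall on physical page boundaries. It reports ok=false when no whole
// physical page fits inside the region. physPageSize must be a power of two.
func roundIn(start, end, physPageSize uintptr) (s, e uintptr, ok bool) {
	s = (start + physPageSize - 1) &^ (physPageSize - 1) // round start up
	e = end &^ (physPageSize - 1)                        // round end down
	return s, e, s < e
}

func main() {
	// An 8K-aligned region on a system with 64K physical pages.
	s, e, ok := roundIn(0x2000, 0x24000, 0x10000)
	fmt.Printf("%#x %#x %v\n", s, e, ok)

	// A region too small to contain any whole 64K page.
	_, _, ok = roundIn(0x2000, 0x4000, 0x10000)
	fmt.Println(ok)
}
```

Rounding the start up and the end down means the kernel is never asked
to release a page that is only partially covered by the free region.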
Fixes #9993.
Change-Id: I96d5fdc2f77f9d69abadcea29bcfe55e68288cb1
Reviewed-on: https://go-review.googlesource.com/22066
Reviewed-by: Rick Hudson <rlh@golang.org>
If sysUnused is passed an address or length that is not aligned to the
physical page boundary, the kernel will unmap more memory than the
caller wanted. Add a check for this.
For #9993.
Change-Id: I68ff03032e7b65cf0a853fe706ce21dc7f2aaaf8
Reviewed-on: https://go-review.googlesource.com/22065
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Dave Cheney <dave@cheney.net>
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
The runtime hard-codes an assumed physical page size. If this is
smaller than the kernel's page size or not a multiple of it, sysUnused
may incorrectly release more memory to the system than intended.
Add a runtime startup check that the runtime's assumed physical page
size is compatible with the kernel's physical page size.
For #9993.
Change-Id: Ida9d07f93c00ca9a95dd55fc59bf0d8a607f6728
Reviewed-on: https://go-review.googlesource.com/22064
Reviewed-by: Rick Hudson <rlh@golang.org>
archauxv no longer does anything on 386, so remove it.
Change-Id: I94545238e40fa6a6832a7c3b40aedfc6c1f6a97b
Reviewed-on: https://go-review.googlesource.com/22063
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
The Linux kernel provides 16 bytes of random data via the auxv vector
at startup. Currently we consume this separately on 386, amd64, arm,
and arm64. Now that we have a common auxv parser, handle _AT_RANDOM in
the common path.
Change-Id: Ib69549a1d37e2d07a351cf0f44007bcd24f0d20d
Reviewed-on: https://go-review.googlesource.com/22062
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Currently several different Linux architectures have separate copies
of the auxv parser. Bring these all together into a single copy of the
parser that calls out to a per-arch handler for each tag/value pair.
This is in preparation for handling common auxv tags in one place.
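
The shape of such a parser might look like the sketch below. The tag
values are the standard Linux ones (AT_NULL=0, AT_PAGESZ=6,
AT_HWCAP=16, AT_RANDOM=25); the auxInfo type, the parseAuxv name, and
the choice of which tags the common path records are assumptions for
illustration only.

```go
package main

import "fmt"

// Standard Linux auxv tags.
const (
	atNull   = 0  // terminates the vector
	atPageSz = 6  // system physical page size
	atHwcap  = 16 // hardware capability bits, typically arch-specific
	atRandom = 25 // address of 16 random bytes
)

// auxInfo collects the values handled by the common path.
type auxInfo struct {
	pageSize   uintptr
	randomAddr uintptr
}

// parseAuxv walks the tag/value pairs once. Common tags are handled here,
// and every pair is also passed to the per-architecture handler.
func parseAuxv(auxv []uintptr, archHandler func(tag, val uintptr)) auxInfo {
	var info auxInfo
	for i := 0; i+1 < len(auxv) && auxv[i] != atNull; i += 2 {
		tag, val := auxv[i], auxv[i+1]
		switch tag {
		case atPageSz:
			info.pageSize = val
		case atRandom:
			info.randomAddr = val
		}
		archHandler(tag, val)
	}
	return info
}

func main() {
	// A made-up vector; real values come from the kernel at startup.
	auxv := []uintptr{atPageSz, 4096, atHwcap, 0xabc, atRandom, 0x1000, atNull, 0}
	var hwcap uintptr
	info := parseAuxv(auxv, func(tag, val uintptr) {
		if tag == atHwcap {
			hwcap = val
		}
	})
	fmt.Printf("%d %#x %#x\n", info.pageSize, info.randomAddr, hwcap)
}
```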
For #9993.
Change-Id: Iceebc3afad6b4133b70fca7003561ae370445c10
Reviewed-on: https://go-review.googlesource.com/22061
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
https://golang.org/cl/10173 introduced msigsave, ensureSigM and
_SigUnblock but didn't enable the new signal save/restore mechanism for
SIG{HUP,INT,QUIT,ABRT,TERM} on DragonFly BSD, FreeBSD and OpenBSD.
At present, it looks like they have the implementation. This change
enables the new mechanism on DragonFly BSD, FreeBSD and OpenBSD, the same
as on Darwin and NetBSD.
Change-Id: Ifb4b4743b3b4f50bfcdc7cf1fe1b59c377fa2a41
Reviewed-on: https://go-review.googlesource.com/18657
Run-TryBot: Mikio Hara <mikioh.mikioh@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
sync/atomic.StorePointer (which is implemented in
runtime/atomic_pointer.go) writes the pointer twice (through two
completely different code paths, no less). Fix it to only write once.
Change-Id: Id3b2aef9aa9081c2cf096833e001b93d3dd1f5da
Reviewed-on: https://go-review.googlesource.com/21999
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Michael Matloob <matloob@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
SwapPointer is declared as
func SwapPointer(addr *unsafe.Pointer, new unsafe.Pointer) (old unsafe.Pointer)
in sync/atomic, but defined in the runtime (where it's actually
implemented) as
func sync_atomic_SwapPointer(ptr unsafe.Pointer, new unsafe.Pointer) unsafe.Pointer
Make ptr a *unsafe.Pointer in the runtime definition to match the type
in sync/atomic.
Change-Id: I99bab651b995001bbe54f9e790fdef2417ef0e9e
Reviewed-on: https://go-review.googlesource.com/21998
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Michael Matloob <matloob@golang.org>
Use deBruijn sequences to count low-order zeros.
Reorg bswap to not use &^, since it takes another instruction on x86.
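
For readers unfamiliar with the technique, here is a sketch of
counting trailing zeros with a de Bruijn sequence. It uses the widely
published 32-bit constant 0x077CB531 and builds the lookup table at
startup; the names and the 32-bit width are choices made for this
example and need not match the runtime's implementation.

```go
package main

import (
	"fmt"
	"math/bits"
)

// deBruijn32 is a 32-bit de Bruijn sequence: every 5-bit window of its
// cyclic bit pattern is distinct.
const deBruijn32 uint32 = 0x077CB531

// deBruijnIdx32 maps the top 5 bits of deBruijn32<<i back to i.
var deBruijnIdx32 [32]byte

func init() {
	for i := uint(0); i < 32; i++ {
		deBruijnIdx32[(deBruijn32<<i)>>27] = byte(i)
	}
}

// ctz32 returns the number of trailing zero bits in x, or 32 if x is zero.
func ctz32(x uint32) int {
	if x == 0 {
		return 32
	}
	// x & -x isolates the lowest set bit, 1<<i. Multiplying by it shifts
	// the sequence left by i, leaving a unique 5-bit pattern on top.
	return int(deBruijnIdx32[((x&-x)*deBruijn32)>>27])
}

func main() {
	for _, x := range []uint32{1, 8, 12, 0x80000000, 0xdeadbee0} {
		fmt.Println(x, ctz32(x), ctz32(x) == bits.TrailingZeros32(x))
	}
}
```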
Change-Id: I4a5ed9fd16ee6a279d88c067e8a2ba11de821156
Reviewed-on: https://go-review.googlesource.com/22084
Reviewed-by: David Chase <drchase@google.com>
These comments were left behind after runtime.h was converted
from C to Go. I examined the original code and tried to move these
to the places that made the most sense.
Change-Id: I8769d60234c0113d682f9de3bd8d6c34c450c188
Reviewed-on: https://go-review.googlesource.com/21969
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
By replacing the *string used to represent pkgPath with a
reflect.name everywhere, the embedded *string for package paths
inside the reflect.name can be replaced by an offset, nameOff.
This reduces the number of pointers in the type information.
This also moves all reflect.name types into the same section, making
it possible to use nameOff more widely in later CLs.
No significant binary size change for normal binaries, but:
linux/amd64 PIE:
cmd/go: -440KB (3.7%)
jujud: -2.6MB (3.2%)
For #6853.
Change-Id: I3890b132a784a1090b1b72b32febfe0bea77eaee
Reviewed-on: https://go-review.googlesource.com/21395
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Merge them together into os1_darwin.go. A future CL will rename it.
Change-Id: Ia4380d3296ebd5ce210908ce3582ff184566f692
Reviewed-on: https://go-review.googlesource.com/22004
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Make it clear that the point of this function is to store a pointer
*without* a write barrier.
sed -i -e 's/Storep1/StorepNoWB/' $(git grep -l Storep1)
Updates #15270.
Change-Id: Ifad7e17815e51a738070655fe3b178afdadaecf6
Reviewed-on: https://go-review.googlesource.com/21994
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Michael Matloob <matloob@golang.org>
atomic.Storep1 is not supposed to invoke a write barrier (that's what
atomicstorep is for), but currently does on s390x. This causes a panic
in runtime.mapzero when it tries to use atomic.Storep1 to store what's
actually a scalar.
Fix this by eliminating the write barrier from atomic.Storep1 on
s390x. Also add some documentation to atomicstorep to explain the
difference between these.
Fixes #15270.
Change-Id: I291846732d82f090a218df3ef6351180aff54e81
Reviewed-on: https://go-review.googlesource.com/21993
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Michael Munday <munday@ca.ibm.com>
This CL introduces the typeOff type and a lookup method of the same
name that can turn a typeOff offset into an *rtype.
In a typical Go binary (built with buildmode=exe, pie, c-archive, or
c-shared), there is one moduledata and all typeOff values are offsets
relative to firstmoduledata.types. This makes computing the pointer
cheap in typical programs.
With buildmode=shared (and one day, buildmode=plugin) there are
multiple modules whose relative offset is determined at runtime.
We identify a type in the general case by the pair of the original
*rtype that references it and its typeOff value. We determine
the module from the original pointer, and then use the typeOff from
there to compute the final *rtype.
To ensure there is only one *rtype representing each type, the
runtime initializes a typemap for each module, using any identical
type from an earlier module when resolving that offset. This means
that types computed from an offset match the type mapped by the
pointer dynamic relocations.
A series of followup CLs will replace other *rtype values with typeOff
(and name/*string with nameOff).
For types created at runtime by reflect, type offsets are treated as
global IDs and reference into a reflect offset map kept by the runtime.
darwin/amd64:
cmd/go: -57KB (0.6%)
jujud: -557KB (0.8%)
linux/amd64 PIE:
cmd/go: -361KB (3.0%)
jujud: -3.5MB (4.2%)
For #6853.
Change-Id: Icf096fd884a0a0cb9f280f46f7a26c70a9006c96
Reviewed-on: https://go-review.googlesource.com/21285
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
No need to acquire the M just to change G's paniconfault flag, and the
original C implementation of SetPanicOnFault did not. The M
acquisition logic is an artifact of golang.org/cl/131010044, which was
started before golang.org/cl/123640043 (which introduced the current
"getg" function) was submitted.
Change-Id: I6d1939008660210be46904395cf5f5bbc2c8f754
Reviewed-on: https://go-review.googlesource.com/21935
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
This is the first in a series of CLs to replace the use of pointers
in binary read-only data with offsets.
In standard Go binaries these CLs have a small effect, shrinking
8-byte pointers to 4-bytes. In position-independent code, it also
saves the dynamic relocation for the pointer. This has a significant
effect on the binary size when building as PIE, c-archive, or
c-shared.
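
The general idea of replacing a pointer with an offset can be sketched
as follows. The nameOff type here is a 4-byte offset into a single
blob of name data; the module type and the one-byte length prefix are
assumptions for illustration and are not the encoding used by the
linker or the runtime.

```go
package main

import "fmt"

// nameOff is a 4-byte offset into a module's name data. Unlike a pointer it
// needs no relocation when the binary is loaded at a different address.
type nameOff int32

// module is a hypothetical container for read-only name data.
type module struct {
	names []byte
}

// add appends a name (at most 255 bytes in this sketch) and returns its
// offset.
func (m *module) add(s string) nameOff {
	off := nameOff(len(m.names))
	m.names = append(m.names, byte(len(s)))
	m.names = append(m.names, s...)
	return off
}

// resolve turns an offset back into the name it refers to.
func (m *module) resolve(off nameOff) string {
	n := int(m.names[off])
	start := int(off) + 1
	return string(m.names[start : start+n])
}

func main() {
	m := &module{}
	a := m.add("main")
	b := m.add("fmt")
	fmt.Println(a, m.resolve(a))
	fmt.Println(b, m.resolve(b))
}
```

Resolving an offset costs one addition relative to the module base,
which is why this is cheap when there is only one module.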
darwin/amd64:
cmd/go: -12KB (0.1%)
jujud: -82KB (0.1%)
linux/amd64 PIE:
cmd/go: -86KB (0.7%)
jujud: -569KB (0.7%)
For #6853.
Change-Id: Iad5625bbeba58dabfd4d334dbee3fcbfe04b2dcf
Reviewed-on: https://go-review.googlesource.com/21284
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Load and store instructions are atomic on the s390x.
Change-Id: I0031ed2fba43f33863bca114d0fdec2e7d1ce807
Reviewed-on: https://go-review.googlesource.com/20938
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The previous cleanup was done with a buggy tool, missing some potential
rewrites.
Change-Id: I333467036e355f999a6a493e8de87e084f374e26
Reviewed-on: https://go-review.googlesource.com/21378
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Merge the amd64 lfstack implementation into the general 64 bit
implementation.
Change-Id: Id9ed61b90d2e3bc3b0246294c03eb2c92803b6ca
Reviewed-on: https://go-review.googlesource.com/21707
Run-TryBot: Dave Cheney <dave@cheney.net>
Reviewed-by: Minux Ma <minux@golang.org>
After mdempsky's recent changes, these are the only references to
"TheChar" left in the Go tree. Without the context, and without
knowing the history, this is confusing.
Also rename sys.TheGoos and sys.TheGoarch to sys.GOOS
and sys.GOARCH.
Also change the heap dump format to include sys.GOARCH
rather than TheChar, which is no longer a concept.
Updates #15169 (changes heapdump format)
Change-Id: I3e99eeeae00ed55d7d01e6ed503d958c6e931dca
Reviewed-on: https://go-review.googlesource.com/21647
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Only compute the number of maximum allowed elements per slice once.
name         old time/op  new time/op  delta
MakeSlice-2  55.5ns ± 1%  45.6ns ± 2%  -17.88%  (p=0.000 n=99+100)
Change-Id: I951feffda5d11910a75e55d7e978d306d14da2c5
Reviewed-on: https://go-review.googlesource.com/21801
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
This reverts commit ab4c9298b8.
Sysmon critically depends on system timer resolution for retaking
of Ps blocked in system calls. See #14790 for an example
of a program where execution time goes from 2ms to 30ms if
timeBeginPeriod(1) is not used.
We can remove timeBeginPeriod(1) when we support UMS (#7876).
Update #14790
Change-Id: I362b56154359b2c52d47f9f2468fe012b481cf6d
Reviewed-on: https://go-review.googlesource.com/20834
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
This makes traces self-contained and simplifies trace workflow
in modern cloud environments where it is simpler to reach
a service via HTTP than to obtain the binary.
Change-Id: I6ff3ca694dc698270f1e29da37d5efaf4e843a0d
Reviewed-on: https://go-review.googlesource.com/21732
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
This broke solaris, which apparently does use the upper 17 bits of the address space.
This reverts commit 3b02c5b1b6.
Change-Id: Iedfe54abd0384960845468205f20191a97751c0b
Reviewed-on: https://go-review.googlesource.com/21652
Reviewed-by: Dave Cheney <dave@cheney.net>
The test for profiling of channel blocking is timing dependent,
and in particular the blockSelectRecvAsync case can fail on a
slow builder (plan9_arm) when many tests are run in parallel.
The child goroutine sleeps for a fixed period so the parent
can be observed to block in a select call reading from the
child; but if the OS process running the parent goroutine is
delayed long enough, the child may wake again before the
parent has reached the blocking point. By repeating the test
three times, the likelihood of a blocking event is increased.
Fixes #15096
Change-Id: I2ddb9576a83408d06b51ded682bf8e71e53ce59e
Reviewed-on: https://go-review.googlesource.com/21604
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Merge the remaining lfstack{Pack,Unpack} implementations into one file.
unsafe.Sizeof(uintptr(0)) == 4 is a constant comparison so this branch
folds away at compile time.
Dmitry confirmed that the upper 17 bits of an address will be zero for a
user mode pointer, so there is no need to sign extend on amd64 during
unpack, so we can reuse the same implementation as all other 64 bit
archs.
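
A sketch of the packing scheme on 64-bit systems, working on plain
integers instead of real pointers. It assumes 48 usable address bits
and 8-byte-aligned nodes; the constants and function names are chosen
for this example and may differ from the runtime's.

```go
package main

import "fmt"

const (
	// Assumption: addresses fit in 48 bits.
	addrBits = 48
	// Nodes are assumed 8-byte aligned, so the low 3 address bits are zero
	// and can be given to the counter as well.
	cntBits = 64 - addrBits + 3
)

// pack stores the address in the high bits and the counter in the low bits.
func pack(addr, cnt uint64) uint64 {
	return addr<<(64-addrBits) | cnt&(1<<cntBits-1)
}

// unpack recovers the address without sign extension, which is valid only
// when the upper address bits are known to be zero.
func unpack(val uint64) uint64 {
	return val >> cntBits << 3
}

func main() {
	addr := uint64(0x00007f1234567ff8)
	v := pack(addr, 12345)
	fmt.Printf("%#x %#x %d\n", v, unpack(v), v&(1<<cntBits-1))
}
```

If a platform hands out addresses with the upper bits set, unpack as
written returns the wrong address, so the zero-upper-bits assumption
has to hold on every supported system.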
Change-Id: I99f589416d8b181ccde5364c9c2e78e4a5efc7f1
Reviewed-on: https://go-review.googlesource.com/21597
Run-TryBot: Dave Cheney <dave@cheney.net>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
So that all Go processes do not die on startup on a system with >256 CPUs.
I tested this by hacking osinit to set ncpu to 1000.
Updates #15131
Change-Id: I52e061a0de97be41d684dd8b748fa9087d6f1aef
Reviewed-on: https://go-review.googlesource.com/21599
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Neither of the two places that call lfstackUnpack uses the second argument.
This simplifies a followup CL that merges the lfstack{Pack,Unpack}
implementations.
Change-Id: I3c93f6259da99e113d94f8c8027584da79c1ac2c
Reviewed-on: https://go-review.googlesource.com/21595
Run-TryBot: Dave Cheney <dave@cheney.net>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Flaky tests are a distraction and cover up real problems.
File bugs instead and mark them as flaky.
This moves the net/http flaky test flagging mechanism to internal/testenv.
Updates #15156
Updates #15157
Updates #15158
Change-Id: I0e561cd2a09c0dec369cd4ed93bc5a2b40233dfe
Reviewed-on: https://go-review.googlesource.com/21614
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Merge all the 64bit lfstack impls into one file, adjust build tags to
match.
Merge all the comments on the various lfstack implementations for
posterity.
lfstack_amd64.go can probably be merged, but it is slightly different so
that will happen in a followup.
Change-Id: I5362d5e127daa81c9cb9d4fa8a0cc5c5e5c2707c
Reviewed-on: https://go-review.googlesource.com/21591
Run-TryBot: Dave Cheney <dave@cheney.net>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
A future CL will rename os1_windows.go to os_windows.go.
Change-Id: I223e76002dd1e9c9d1798fb0beac02c7d3bf4812
Reviewed-on: https://go-review.googlesource.com/21564
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
Missed a case for closure calls (OCALLFUNC && indirect) in
esc.go:esccall.
Cleanup to runtime code for windows to more thoroughly hide
a technical escape. Also made code pickier about failing
to load non-optional kernel32.dll.
Fixes #14409.
Change-Id: Ie75486a2c8626c4583224e02e4872c2875f7bca5
Reviewed-on: https://go-review.googlesource.com/20102
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Usleep(100) in runqgrab negatively affects latency and throughput
of parallel applications. We are sleeping instead of doing useful work.
This effect is particularly visible on windows where minimal
sleep duration is 1-15ms.
Reduce sleep from 100us to 3us and use osyield on windows.
Sync chan send/recv takes ~50ns, so 3us gives us ~50x overshoot.
benchmark                 old ns/op  new ns/op  delta
BenchmarkChanSync-12      216        217        +0.46%
BenchmarkChanSyncWork-12  27213      25816      -5.13%
CPU consumption goes up from 106% to 108% in the first case,
and from 107% to 125% in the second case.
Test case from #14790 on windows:
BenchmarkDefaultResolution-8  4583372  29720  -99.35%
Benchmark1ms-8                992056   30701  -96.91%
99-th latency percentile for HTTP request serving is improved by up to 15%
(see http://golang.org/cl/20835 for details).
The following benchmarks are from the change that originally added this sleep
(see https://golang.org/s/go15gomaxprocs):
name        old time/op  new time/op  delta
Chain       22.6µs ± 2%  22.7µs ± 6%  ~       (p=0.905 n=9+10)
ChainBuf    22.4µs ± 3%  22.5µs ± 4%  ~       (p=0.780 n=9+10)
Chain-2     23.5µs ± 4%  24.9µs ± 1%  +5.66%  (p=0.000 n=10+9)
ChainBuf-2  23.7µs ± 1%  24.4µs ± 1%  +3.31%  (p=0.000 n=9+10)
Chain-4     24.2µs ± 2%  25.1µs ± 3%  +3.70%  (p=0.000 n=9+10)
ChainBuf-4  24.4µs ± 5%  25.0µs ± 2%  +2.37%  (p=0.023 n=10+10)
Powser      2.37s ± 1%   2.37s ± 1%   ~       (p=0.423 n=8+9)
Powser-2    2.48s ± 2%   2.57s ± 2%   +3.74%  (p=0.000 n=10+9)
Powser-4    2.66s ± 1%   2.75s ± 1%   +3.40%  (p=0.000 n=10+10)
Sieve       13.3s ± 2%   13.3s ± 2%   ~       (p=1.000 n=10+9)
Sieve-2     7.00s ± 2%   7.44s ±16%   ~       (p=0.408 n=8+10)
Sieve-4     4.13s ±21%   3.85s ±22%   ~       (p=0.113 n=9+9)
Fixes #14790
Change-Id: Ie7c6a1c4f9c8eb2f5d65ab127a3845386d6f8b5d
Reviewed-on: https://go-review.googlesource.com/20835
Reviewed-by: Austin Clements <austin@google.com>
When we grow the heap, we create a temporary "in use" span for the
memory acquired from the OS and then free that span to link it into
the heap. Hence, we (1) increase pagesInUse when we make the temporary
span so that (2) freeing the span will correctly decrease it.
However, currently step (1) increases pagesInUse by the number of
pages requested from the heap, while step (2) decreases it by the
number of pages requested from the OS (the size of the temporary
span). These aren't necessarily the same, since we round up the number
of pages we request from the OS, so steps 1 and 2 don't necessarily
cancel out like they're supposed to. Over time, this can add up and
cause pagesInUse to underflow and wrap around to 2^64. The garbage
collector computes the sweep ratio from this, so if this happens, the
sweep ratio becomes effectively infinite, causing the first allocation
on each P in a sweep cycle to sweep the entire heap. This makes
sweeping effectively STW.
Fix this by increasing pagesInUse in step 1 by the number of pages
requested from the OS, so that the two steps correctly cancel out. We
add a test that checks that the running total matches the actual state
of the heap.
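The mismatch can be illustrated with a small model. This is only a sketch: pagesPerChunk, roundUp, and grow are hypothetical names, and the chunk size of 16 pages is an assumption chosen for illustration, not the runtime's actual rounding.

```go
package main

import "fmt"

// pagesPerChunk is a hypothetical OS allocation granularity, in pages.
const pagesPerChunk = 16

// roundUp rounds n up to a multiple of pagesPerChunk.
func roundUp(n uint64) uint64 {
	return (n + pagesPerChunk - 1) / pagesPerChunk * pagesPerChunk
}

// grow models one heap growth. With fixed == false, step 1 adds the
// number of pages the heap asked for (the old behavior). With
// fixed == true, step 1 adds the rounded-up number obtained from the OS.
// Step 2 always subtracts the size of the temporary span.
func grow(pagesInUse, npage uint64, fixed bool) uint64 {
	ask := roundUp(npage)
	if fixed {
		pagesInUse += ask // step 1, corrected
	} else {
		pagesInUse += npage // step 1, mismatched
	}
	pagesInUse -= ask // step 2: free the temporary span
	return pagesInUse
}

func main() {
	var buggy, fixed uint64
	for i := 0; i < 3; i++ {
		buggy = grow(buggy, 10, false)
		fixed = grow(fixed, 10, true)
	}
	// The mismatched counter wraps around; the corrected one stays at 0.
	fmt.Println(buggy, fixed)
}
```

Because the counter is unsigned, a single unbalanced growth from zero is already enough to wrap it in this model.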
Fixes #15022. For 1.6.x.
Change-Id: Iefd9d6abe37d0d447cbdbdf9941662e4f18eeffc
Reviewed-on: https://go-review.googlesource.com/21280
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
It appears that Windows osyield is just a 15ms sleep on my computer
(see benchmarks below). Replace NtWaitForSingleObject in osyield
with SwitchToThread (as suggested by Dmitry).
Also add issue #14790 related benchmarks, so we can track performance
changes in CL 20834 and CL 20835 and beyond.
Update #14790
benchmark old ns/op new ns/op delta
BenchmarkChanToSyscallPing1ms 1953200 1953000 -0.01%
BenchmarkChanToSyscallPing15ms 31562904 31248400 -1.00%
BenchmarkSyscallToSyscallPing1ms 5247 4202 -19.92%
BenchmarkSyscallToSyscallPing15ms 5260 4374 -16.84%
BenchmarkChanToChanPing1ms 474 494 +4.22%
BenchmarkChanToChanPing15ms 468 489 +4.49%
BenchmarkOsYield1ms 980018 75.5 -99.99%
BenchmarkOsYield15ms 15625200 75.8 -100.00%
Change-Id: I1b4cc7caca784e2548ee3c846ca07ef152ebedce
Reviewed-on: https://go-review.googlesource.com/21294
Run-TryBot: Alex Brainman <alex.brainman@gmail.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Add supporting code for runtime initialization, including both
32- and 64-bit x86 architectures.
Add .ctors section on Windows to PE .o files, and INITENTRY to .ctors
section to plug in to the GCC C/C++ startup initialization mechanism.
This allows the Go runtime to initialize itself. Add .text section
symbol for .ctor relocations. Note: This is unlikely to be useful for
MSVC-based toolchains.
Fixes #13494
Change-Id: I4286a96f70e5f5228acae88eef46e2bed95813f3
Reviewed-on: https://go-review.googlesource.com/18057
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Change-Id: I91873aaebf79bdf1c00d38aacc1a1fb8d79656a7
Reviewed-on: https://go-review.googlesource.com/21433
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Make sure that for any DLL that Go uses itself, we only look for the
DLL in the Windows System32 directory, guarding against DLL preloading
attacks.
(Unless the Windows version is ancient and LoadLibraryEx is
unavailable, in which case the user probably has bigger security
problems anyway.)
This does not change the behavior of syscall.LoadLibrary or NewLazyDLL
if the DLL name is something unused by Go itself.
This change also intentionally does not add any new API surface. Instead,
x/sys is updated with a LoadLibraryEx function and LazyDLL.Flags in:
https://golang.org/cl/21388
Updates #14959
Change-Id: I8d29200559cc19edf8dcf41dbdd39a389cd6aeb9
Reviewed-on: https://go-review.googlesource.com/21140
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Fixes a problem when using the external linker on Solaris. The Solaris
external linker still doesn't work due to issue #14957.
The problem is, for example, with `go test cmd/objdump`:
objdump_test.go:71: go build fmthello.go: exit status 2
# command-line-arguments
/var/gcc/iant/go/pkg/tool/solaris_amd64/link: running gcc failed: exit status 1
Undefined first referenced
symbol in file
x_cgo_callers /tmp/go-link-355600608/go.o
ld: fatal: symbol referencing errors
collect2: error: ld returned 1 exit status
Change-Id: I54917cfd5c288ee77ea25c439489bd2c9124fe73
Reviewed-on: https://go-review.googlesource.com/21392
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
The new function runtime.SetCgoTraceback may be used to register stack
traceback and symbolizer functions, written in C, to do a stack
traceback from cgo code.
There is a sample symbolizer for use with runtime.SetCgoTraceback at
github.com/ianlancetaylor/cgosymbolizer. Just importing that package is
sufficient to get symbolic C backtraces.
Currently only supported on linux/amd64.
Change-Id: If96ee2eb41c6c7379d407b9561b87557bfe47341
Reviewed-on: https://go-review.googlesource.com/17761
Reviewed-by: Austin Clements <austin@google.com>
Previously, cmd/compile rejected constant int->string conversions if
the integer value did not fit into an "int" value. Also, runtime
incorrectly truncated 64-bit values to 32-bit before checking if
they're a valid Unicode code point. According to the Go spec, both of
these cases should instead yield "\uFFFD".
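The intended rule can be sketched as a range check performed on the full-width value before any narrowing. The helper intToString below is hypothetical and only illustrates the semantics; it is not the compiler or runtime code.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// intToString is a hypothetical helper that mirrors the spec rule for
// integer-to-string conversion: values outside the range of valid
// Unicode code points yield "\uFFFD". The check uses the full 64-bit
// value, so nothing is truncated before it is validated.
func intToString(v int64) string {
	if v < 0 || v > utf8.MaxRune {
		return string(utf8.RuneError)
	}
	// Surrogate halves are in range but still convert to "\uFFFD".
	return string(rune(v))
}

func main() {
	fmt.Printf("%q\n", intToString(65))
	fmt.Printf("%q\n", intToString(1<<32+65)) // would be 'A' if truncated to 32 bits
	fmt.Printf("%q\n", intToString(-1))
}
```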
Fixes #15039.
Change-Id: I3c8a3ad9a0780c0a8dc1911386a523800fec9764
Reviewed-on: https://go-review.googlesource.com/21344
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Only use REP;MOVSB if:
1) The CPUID flag says it is fast, and
2) The pointers are unaligned
Otherwise, use REP;MOVSQ.
Update #14630
Change-Id: I946b28b87880c08e5eed1ce2945016466c89db66
Reviewed-on: https://go-review.googlesource.com/21300
Reviewed-by: Nigel Tao <nigeltao@golang.org>
See #14874
This change tells the linker to collect all the itablink symbols so
that moduledata can have a slice of all compiler-generated itabs.
The logic is shamelessly adapted from what is done with typelink symbols.
Change-Id: Ie93b59acf0fcba908a876d506afbf796f222dbac
Reviewed-on: https://go-review.googlesource.com/20889
Reviewed-by: Keith Randall <khr@golang.org>
See #14874
This change tells the compiler to emit itab and itablink symbols in
situations where they could be useful; however the compiled code does
not actually make use of the new symbols yet.
Change-Id: I0db3e6ec0cb1f3b7cebd4c60229e4a48372fe586
Reviewed-on: https://go-review.googlesource.com/20888
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Run-TryBot: Michel Lespinasse <walken@google.com>
See #14874
This change makes the runtime register all compiler generated itabs
(as obtained from the moduledata) during init.
Change-Id: I9969a0985b99b8bda820a631f7fe4c78f1174cdf
Reviewed-on: https://go-review.googlesource.com/20900
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Michel Lespinasse <walken@google.com>
linux/386 depends on modify_ldt system call, but recent Linux kernels
can disable this system call. Any Go programs built as linux/386
crash with the message 'Trace/breakpoint trap'.
The kernel config CONFIG_MODIFY_LDT_SYSCALL, which controls whether
modify_ldt is enabled, is disabled on Amazon Linux 2016.03.
Fix this problem by using set_thread_area instead of modify_ldt
on linux/386.
Fixes #14795.
Change-Id: I0cc5139e40e9e5591945164156a77b6bdff2c7f1
Reviewed-on: https://go-review.googlesource.com/21190
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Minux Ma <minux@golang.org>
One intrinsic was needed to help get the very best
performance out of a future GC; as long as that one was
being added, I also added Bswap since that is sometimes
a handy thing to have. I had intended to fill out the
bit-scan intrinsic family, but the mismatch between the
"scan forward" instruction and "count leading zeroes"
was large enough to cause me to leave it out -- it poses
a dilemma that I'd rather dodge right now.
These intrinsics are not exposed for general use.
That's a separate issue requiring an API proposal change
( https://github.com/golang/proposal )
All intrinsics are tested, both that they are substituted
on the appropriate architecture, and that they produce the
expected result.
Change-Id: I5848037cfd97de4f75bdc33bdd89bba00af4a8ee
Reviewed-on: https://go-review.googlesource.com/20564
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
For #14876.
Change-Id: I0992859264cbaf9c9b691fad53345bbb01b4cf3b
Reviewed-on: https://go-review.googlesource.com/21085
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
For #14876.
Change-Id: I33947f74e8058437a784862f1f064974afc99250
Reviewed-on: https://go-review.googlesource.com/21084
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
This is a follow-up of https://go-review.googlesource.com/#/c/20653/
Special case computation for slices with elements of byte size or
pointer size.
name old time/op new time/op delta
GrowSliceBytes-4 86.2ns ± 3% 75.4ns ± 2% -12.50% (p=0.000 n=20+20)
GrowSliceInts-4 161ns ± 3% 136ns ± 3% -15.59% (p=0.000 n=19+19)
GrowSlicePtr-4 239ns ± 2% 233ns ± 2% -2.52% (p=0.000 n=20+20)
GrowSliceStruct24Bytes-4 258ns ± 3% 256ns ± 3% ~ (p=0.134 n=20+20)
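A rough sketch of the idea: when the element size is 1 or the pointer size, the byte count for a new capacity can be computed without a general multiply. The names capBytes, ptrSize, and ptrShift are hypothetical, and the sketch assumes uint and uintptr have the same width; it is not the runtime's growslice code.

```go
package main

import (
	"fmt"
	"math/bits"
)

// ptrSize is the pointer size in bytes on this platform (assumed to
// match the width of uint).
const ptrSize = bits.UintSize / 8

// ptrShift is log2(ptrSize).
var ptrShift = uint(bits.TrailingZeros(ptrSize))

// capBytes returns the number of bytes needed for newcap elements of
// elemSize bytes each, and whether that computation overflowed.
func capBytes(newcap, elemSize uintptr) (mem uintptr, overflow bool) {
	switch elemSize {
	case 1:
		// Byte-sized elements: the capacity is already the byte count.
		return newcap, false
	case ptrSize:
		// Pointer-sized elements: a shift replaces the multiply.
		return newcap << ptrShift, newcap > ^uintptr(0)>>ptrShift
	default:
		// General case: full-width multiply with overflow detection.
		hi, lo := bits.Mul(uint(newcap), uint(elemSize))
		return uintptr(lo), hi != 0
	}
}

func main() {
	fmt.Println(capBytes(10, 1))
	fmt.Println(capBytes(10, ptrSize))
	fmt.Println(capBytes(10, 24))
}
```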
Change-Id: Ice5fa648058fe9d7fa89dee97ca359966f671128
Reviewed-on: https://go-review.googlesource.com/21101
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
There's a race between runtime.goexitsall killing all OS processes
of a Go program in order to exit, and runtime.newosproc forking a
new one. If the new process has been created but has not yet stored
its pid in m.procid, it will not be killed by goexitsall, and a
deadlock results.
This CL prevents the race by making the newly forked process
check whether the program is exiting. It also prevents a
potential "shoot-out" if multiple goroutines call Exit at
the same time, which could possibly lead to two processes
killing each other and leaving the rest deadlocked.
Change-Id: I3170b4a62d2461f6b029b3d6aad70373714ed53e
Reviewed-on: https://go-review.googlesource.com/21135
Run-TryBot: David du Colombier <0intro@gmail.com>
Reviewed-by: Marvin Stenger <marvin.stenger94@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David du Colombier <0intro@gmail.com>
During random stealing we steal 4*GOMAXPROCS times from random procs.
One would expect that most of the time we check all procs this way,
but due to low quality PRNG we actually miss procs with frightening
probability. Below are modelling experiment results for 1e6 tries:
GOMAXPROCS = 2 : missed 1 procs 7944 times
GOMAXPROCS = 3 : missed 1 procs 101620 times
GOMAXPROCS = 3 : missed 2 procs 3571 times
GOMAXPROCS = 4 : missed 1 procs 63916 times
GOMAXPROCS = 4 : missed 2 procs 61 times
GOMAXPROCS = 4 : missed 3 procs 16 times
GOMAXPROCS = 5 : missed 1 procs 133136 times
GOMAXPROCS = 5 : missed 2 procs 1025 times
GOMAXPROCS = 5 : missed 3 procs 101 times
GOMAXPROCS = 5 : missed 4 procs 15 times
GOMAXPROCS = 8 : missed 1 procs 151765 times
GOMAXPROCS = 8 : missed 2 procs 5057 times
GOMAXPROCS = 8 : missed 3 procs 1726 times
GOMAXPROCS = 8 : missed 4 procs 68 times
GOMAXPROCS = 12 : missed 1 procs 199081 times
GOMAXPROCS = 12 : missed 2 procs 27489 times
GOMAXPROCS = 12 : missed 3 procs 3113 times
GOMAXPROCS = 12 : missed 4 procs 233 times
GOMAXPROCS = 12 : missed 5 procs 9 times
GOMAXPROCS = 16 : missed 1 procs 237477 times
GOMAXPROCS = 16 : missed 2 procs 30037 times
GOMAXPROCS = 16 : missed 3 procs 9466 times
GOMAXPROCS = 16 : missed 4 procs 1334 times
GOMAXPROCS = 16 : missed 5 procs 192 times
GOMAXPROCS = 16 : missed 6 procs 5 times
GOMAXPROCS = 16 : missed 7 procs 1 times
GOMAXPROCS = 16 : missed 8 procs 1 times
A missed proc won't lead to underutilization because we check all procs
again after dropping P. But it can lead to an unpleasant situation
when we miss a proc, drop P, check all procs, discover work, acquire P,
miss the proc again, repeat.
Improve stealing logic to cover all procs.
Also don't enter spinning mode and try to steal when there is nobody around.
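One way to visit every proc exactly once in a pseudo-random order is to start at a random position and advance by a stride that is coprime with the number of procs. The sketch below illustrates that technique only; stealOrder and gcd are hypothetical names, and nothing here is taken from the scheduler source.

```go
package main

import (
	"fmt"
	"math/rand"
)

// gcd returns the greatest common divisor of a and b.
func gcd(a, b int) int {
	for b != 0 {
		a, b = b, a%b
	}
	return a
}

// stealOrder returns the indices 0..n-1 in an order derived from rnd.
// Because the stride is coprime with n, the walk cannot revisit an
// index before it has covered all n of them. n must be positive.
func stealOrder(n int, rnd uint32) []int {
	// Collect every stride in 1..n that is coprime with n.
	var coprimes []int
	for i := 1; i <= n; i++ {
		if gcd(i, n) == 1 {
			coprimes = append(coprimes, i)
		}
	}
	pos := int(rnd % uint32(n))
	inc := coprimes[int(rnd/uint32(n))%len(coprimes)]
	order := make([]int, 0, n)
	for i := 0; i < n; i++ {
		order = append(order, pos)
		pos = (pos + inc) % n
	}
	return order
}

func main() {
	// Each call yields a permutation of 0..7.
	fmt.Println(stealOrder(8, rand.Uint32()))
}
```

Unlike drawing 4*GOMAXPROCS independent random victims, this walk covers all procs in exactly n probes regardless of the quality of the random number.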
Change-Id: Ibb6b122cc7fb836991bad7d0639b77c807aab4c2
Reviewed-on: https://go-review.googlesource.com/20836
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Marvin Stenger <marvin.stenger94@gmail.com>
Create a byte encoding designed for static Go names.
It is intended to be a compact representation of a name
and optional tag data that can be turned into a Go string
without allocating, and that records whether or not the name is
exported without needing a Unicode table lookup.
The encoding is described in reflect/type.go:
// The first byte is a bit field containing:
//
// 1<<0 the name is exported
// 1<<1 tag data follows the name
// 1<<2 pkgPath *string follow the name and tag
//
// The next two bytes are the data length:
//
// l := uint16(data[1])<<8 | uint16(data[2])
//
// Bytes [3:3+l] are the string data.
//
// If tag data follows then bytes 3+l and 3+l+1 are the tag length,
// with the data following.
//
// If the import path follows, then ptrSize bytes at the end of
// the data form a *string. The import path is only set for concrete
// methods that are defined in a different package than their type.
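The layout above can be exercised with a small round-trip sketch. encodeName and decodeName are hypothetical helpers written for illustration; they omit the pkgPath pointer, and unlike the runtime they allocate when building the strings.

```go
package main

import "fmt"

const (
	flagExported = 1 << 0 // the name is exported
	flagHasTag   = 1 << 1 // tag data follows the name
)

// encodeName builds the flag byte, the 2-byte big-endian name length,
// the name, and (if present) the 2-byte tag length and the tag.
func encodeName(name, tag string, exported bool) []byte {
	var flags byte
	if exported {
		flags |= flagExported
	}
	if tag != "" {
		flags |= flagHasTag
	}
	b := []byte{flags, byte(len(name) >> 8), byte(len(name))}
	b = append(b, name...)
	if tag != "" {
		b = append(b, byte(len(tag)>>8), byte(len(tag)))
		b = append(b, tag...)
	}
	return b
}

// decodeName reverses encodeName.
func decodeName(data []byte) (name, tag string, exported bool) {
	exported = data[0]&flagExported != 0
	l := int(uint16(data[1])<<8 | uint16(data[2]))
	name = string(data[3 : 3+l])
	if data[0]&flagHasTag != 0 {
		tl := int(uint16(data[3+l])<<8 | uint16(data[3+l+1]))
		tag = string(data[3+l+2 : 3+l+2+tl])
	}
	return name, tag, exported
}

func main() {
	b := encodeName("Foo", `json:"foo"`, true)
	fmt.Println(decodeName(b))
}
```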
Shrinks binary sizes:
cmd/go: 164KB (1.6%)
jujud: 1.0MB (1.5%)
For #6853.
Change-Id: I46b6591015b17936a443c9efb5009de8dfe8b609
Reviewed-on: https://go-review.googlesource.com/20968
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
The current runtime attempts to forward signals generated by non-Go
code to the original signal handler. If it can't call the original
handler directly, it currently attempts to re-raise the signal after
resetting the handler. In this case, the original context is lost.
This fix prevents that problem by simply returning from the go signal
handler after resetting the original handler. It only does this when
the original handler is the system default handler, which in all cases
is known to not recover. The signal is not reset, so it is retriggered
and the original handler takes over with the proper context.
Fixes #14899
Change-Id: Ib1c19dfa4b50d9732d7a453de3784c8141e1cbb3
Reviewed-on: https://go-review.googlesource.com/21006
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Fixes #14938.
Additionally some simplifications along the way.
Change-Id: I2c5fb7e32dcc6fab68fff36a49cb72e715756abe
Reviewed-on: https://go-review.googlesource.com/21046
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
For darwin/arm{,64} a non-Go thread is created to convert
EXC_BAD_ACCESS to panics. However, the Go signal handler refuses to
handle signals that would otherwise be ignored if they arrive at
non-Go threads.
Block all (posix) signals to that thread, making sure that
no unexpected signals arrive at it. At least one test, TestStop in
os/signal, depends on signals not arriving on any non-Go threads.
For #14318
Change-Id: I901467fb53bdadb0d03b0f1a537116c7f4754423
Reviewed-on: https://go-review.googlesource.com/21047
Reviewed-by: David Crawshaw <crawshaw@golang.org>
The existing implementation for Equal and similar
functions in the bytes package operates on one byte at
a time. This performs poorly on ppc64/ppc64le especially
when the byte buffers are large. This change improves
those functions by loading and comparing double words where
possible. The common code has been moved to a function
that can be shared by the other functions in this
file which perform the same type of comparison.
Further optimizations are done for the case where
>= 32 bytes are being compared. The new function
memeqbody is used by memeq_varlen, Equal, and eqstring.
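The word-at-a-time idea can be sketched in portable Go. This is an illustration only: equalWords is a hypothetical name, it uses 8-byte loads via encoding/binary, and it does not model the assembly or the extra optimization for comparisons of 32 bytes or more.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// equalWords reports whether a and b hold the same bytes, comparing
// eight bytes per step while at least eight remain and falling back
// to a byte loop for the tail.
func equalWords(a, b []byte) bool {
	if len(a) != len(b) {
		return false
	}
	for len(a) >= 8 {
		if binary.LittleEndian.Uint64(a) != binary.LittleEndian.Uint64(b) {
			return false
		}
		a, b = a[8:], b[8:]
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

func main() {
	x := []byte("0123456789abcdefXYZ")
	y := []byte("0123456789abcdefXYZ")
	fmt.Println(equalWords(x, y))
}
```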
When running the bytes test with -test.bench=Equal
benchmark old MB/s new MB/s speedup
BenchmarkEqual1 164.83 129.49 0.79x
BenchmarkEqual6 563.51 445.47 0.79x
BenchmarkEqual9 656.15 1099.00 1.67x
BenchmarkEqual15 591.93 1024.30 1.73x
BenchmarkEqual16 613.25 1914.12 3.12x
BenchmarkEqual20 682.37 1687.04 2.47x
BenchmarkEqual32 807.96 3843.29 4.76x
BenchmarkEqual4K 1076.25 23280.51 21.63x
BenchmarkEqual4M 1079.30 13120.14 12.16x
BenchmarkEqual64M 1073.28 10876.92 10.13x
It was determined that the degradation in the smaller byte tests
was due to unfavorable code alignment of the single byte loop.
Fixes #14368
Change-Id: I0dd87382c28887c70f4fbe80877a8ba03c31d7cd
Reviewed-on: https://go-review.googlesource.com/20249
Reviewed-by: Minux Ma <minux@golang.org>