Actually working to stay within the limit could cause subtle deadlocks.
Crashing avoids the subtlety.
Fixes#4056.
R=golang-dev, r, dvyukov
CC=golang-dev
https://golang.org/cl/13037043
Fixes#6107.
race: output goroutine 1 as main goroutine
Fixes#6130.
race: option to abort program on first detected error
R=golang-dev, mikioh.mikioh
CC=golang-dev
https://golang.org/cl/12968044
The goal is to stop only those programs that would keep
going and run the machine out of memory, but before they do that.
1 GB on 64-bit, 250 MB on 32-bit.
That seems implausibly large, and it can be adjusted.
Fixes#2556.
Fixes#4494.
Fixes#5173.
R=khr, r, dvyukov
CC=golang-dev
https://golang.org/cl/12541052
The baseline architecture had been left to the GCC configured
default which can be more accomodating than the rest of the Go
toolchain. This prevented instructions used by the 5g compiler,
like BLX, from being used in GCC compiled assembler code.
R=golang-dev, dave, rsc, elias.naur, cshapiro
CC=golang-dev
https://golang.org/cl/12954043
It doughtily misses all possible corner cases.
In particular on machines with <1GHz processors,
SetBlockProfileRate(1) disables profiling.
Fixes#6114.
R=golang-dev, bradfitz, rsc
CC=golang-dev
https://golang.org/cl/12936043
Originally the requirement was f(x) where f's argument is
exactly x's type.
CL 11858043 relaxed the requirement in a non-standard
way: f's argument must be exactly x's type or interface{}.
If we're going to relax the requirement, it should be done
in a way consistent with the rest of Go. This CL allows f's
argument to have any type for which x is assignable;
that's the same requirement the compiler would impose
if compiling f(x) directly.
Fixes#5368.
R=dvyukov, bradfitz, pieter
CC=golang-dev
https://golang.org/cl/12895043
The ARM external linking CL used BLX instructions in gcc assembler. Replace with BL to retain support on older ARM processors.
R=rsc
CC=golang-dev
https://golang.org/cl/12938043
The ARM external linking CL left missed changes to sys_freebsd_arm.s and sys_netbsd_arm.s already done to sys_linux_arm.s.
R=rsc
CC=golang-dev
https://golang.org/cl/12842044
This CL is an aggregate of 10271047, 10499043, 9733044. Descriptions of each follow:
10499043
runtime,cmd/ld: Merge TLS symbols and teach 5l about ARM TLS
This CL prepares for external linking support to ARM.
The pseudo-symbols runtime.g and runtime.m are merged into a single
runtime.tlsgm symbol. When external linking, the offset of a thread local
variable is stored at a memory location instead of being embedded into a offset
of a ldr instruction. With a single runtime.tlsgm symbol for both g and m, only
one such offset is needed.
The larger part of this CL moves TLS code from gcc compiled to internally
compiled. The TLS code now uses the modern MRC instruction, and 5l is taught
about TLS fallbacks in case the instruction is not available or appropriate.
10271047
This CL adds support for -linkmode external to 5l.
For 5l itself, use addrel to allow for D_CALL relocations to be handled by the
host linker. Of the cases listed in rsc's comment in issue 4069, only case 5 and
63 needed an update. One of the TODO: addrel cases was since replaced, and the
rest of the cases are either covered by indirection through addpool (cases with
LTO or LFROM flags) or stubs (case 74). The addpool cases are covered because
addpool emits AWORD instructions, which in turn are handled by case 11.
In the runtime, change the argv argument in the rt0* functions slightly to be a
pointer to the argv list, instead of relying on a particular location of argv.
9733044
The -shared flag to 6l outputs a shared library, implemented in Go
and callable from non-Go programs such as C.
The main part of this CL change the thread local storage model.
Go uses the fastest and least general mode, local exec. TLS data in shared
libraries normally requires at least the local dynamic mode, however, this CL
instead opts for using the initial exec mode. Initial exec mode is faster than
local dynamic mode and can be used in linux since the linker has reserved a
limited amount of TLS space for performance sensitive TLS code.
Initial exec mode requires an extra load from the GOT table to determine the
TLS offset. This penalty will not be paid if ld is not in -shared mode, since
TLS accesses will be reduced to local exec.
The elf sections .init_array and .rela.init_array are added to register the Go
runtime entry with cgo at library load time.
The "hidden" attribute is added to Cgo functions called from Go, since Go
does not generate call through the GOT table, and adding non-GOT relocations for
a global function is not supported by gcc. Cgo symbols don't need to be global
and avoiding the GOT table is also faster.
The changes to 8l are only removes code relevant to the old -shared mode where
internal linking was used.
This CL only address the low level linker work. It can be submitted by itself,
but to be useful, the runtime changes in CL 9738047 is also needed.
Design discussion at
https://groups.google.com/forum/?fromgroups#!topic/golang-nuts/zmjXkGrEx6QFixes#5590.
R=rsc
CC=golang-dev
https://golang.org/cl/12871044
Breaks the build. Old bucket arrays kept by iterators
still need to be scanned.
««« original CL description
runtime: tell GC not to scan internal hashmap structures.
We'll do it ourselves via hash_gciter, thanks.
Fixes bug 6119.
R=golang-dev, dvyukov, cookieo9, rsc
CC=golang-dev
https://golang.org/cl/12840043
»»»
R=golang-dev
CC=golang-dev
https://golang.org/cl/12884043
The NetBSD and OpenBSD failures are apparently real,
not due to the test bug fixed in 100b9fc0c46f.
««« original CL description
runtime/pprof: test netbsd and openbsd again
Maybe these will work now.
R=golang-dev, dvyukov, bradfitz
CC=golang-dev
https://golang.org/cl/12787044
»»»
R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/12873043
Currently it's possible that a goroutine
that periodically executes non-blocking
cgo/syscalls is never preempted.
This change splits scheduler and syscall
ticks to prevent such situation.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/12658045
Currently we lose lots of profiling signals.
Most notably, GC is not accounted at all.
But stack splits, scheduler, syscalls, etc are lost as well.
This creates seriously misleading profile.
With this change all profiling signals are accounted.
Now I see these additional entries that were previously absent:
161 29.7% 29.7% 164 30.3% syscall.Syscall
12 2.2% 50.9% 12 2.2% scanblock
11 2.0% 55.0% 11 2.0% markonly
10 1.8% 58.9% 10 1.8% sweepspan
2 0.4% 85.8% 2 0.4% runtime.newstack
It is still impossible to understand what causes stack splits,
but at least it's clear how many time is spent on them.
Update #2197.
Update #5659.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/12179043
If the timer goroutine is wakeup by timeout,
other goroutines will still notewakeup because sleeping is still set.
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/12763043
The original plan was to collect allocation stacks
for all memory blocks. But it was never implemented
and it's not in near plans and it's unclear how to do it at all.
R=golang-dev, dave, bradfitz
CC=golang-dev
https://golang.org/cl/12724044
Prior to this change, pointer maps encoded the disposition of
a word using a single bit. A zero signaled a non-pointer
value and a one signaled a pointer value. Interface values,
which are a effectively a union type, were conservatively
labeled as a pointer.
This change widens the logical element size of the pointer map
to two bits per word. As before, zero signals a non-pointer
value and one signals a pointer value. Additionally, a two
signals an iface pointer and a three signals an eface pointer.
Following other changes to the runtime, values two and three
will allow a type information to drive interpretation of the
subsequent word so only those interface values containing a
pointer value will be scanned.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/12689046
I've placed net.runtime_Semacquire into netpoll.goc,
but netbsd does not yet use netpoll.goc.
R=golang-dev, bradfitz, iant
CC=golang-dev
https://golang.org/cl/12699045
The mutex, fdMutex, handles locking and lifetime of sysfd,
and serializes Read and Write methods.
This allows to strip 2 sync.Mutex.Lock calls,
2 sync.Mutex.Unlock calls, 1 defer and some amount
of misc overhead from every network operation.
On linux/amd64, Intel E5-2690:
benchmark old ns/op new ns/op delta
BenchmarkTCP4Persistent 9595 9454 -1.47%
BenchmarkTCP4Persistent-2 8978 8772 -2.29%
BenchmarkTCP4ConcurrentReadWrite 4900 4625 -5.61%
BenchmarkTCP4ConcurrentReadWrite-2 2603 2500 -3.96%
In general it strips 70-500 ns from every network operation depending
on processor model. On my relatively new E5-2690 it accounts to ~5%
of network op cost.
Fixes#6074.
R=golang-dev, bradfitz, alex.brainman, iant, mikioh.mikioh
CC=golang-dev
https://golang.org/cl/12418043
Introduce freezetheworld function that is a best-effort attempt to stop any concurrently running goroutines. Call it during crash.
Fixes#5873.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/12054044
GetQueuedCompletionStatusEx allows to dequeue a batch of completion
notifications, which is more efficient than dequeueing one by one.
benchmark old ns/op new ns/op delta
BenchmarkClientServerParallel4 100605 90945 -9.60%
BenchmarkClientServerParallel4-2 90225 74504 -17.42%
R=golang-dev, alex.brainman
CC=golang-dev
https://golang.org/cl/12436044
The test takes up to 64 seconds on windows builders.
I've tried to reduce number of iterations in the test,
but it does not affect run time.
Fixes#6054.
R=golang-dev, alex.brainman
CC=golang-dev
https://golang.org/cl/12531043