The stack scanner for not started goroutines ignored the arguments
area when its size was unknown. With this change, the distance
between the stack pointer and the stack base will be used instead.
Fixes#5486
R=golang-dev, bradfitz, iant, dvyukov
CC=golang-dev
https://golang.org/cl/9440043
If a slice points to an array embedded in a struct,
the whole struct can be incorrectly scanned as the slice buffer.
Fixes#5443.
R=cshapiro, iant, r, cshapiro, minux.ma
CC=bradfitz, gobot, golang-dev
https://golang.org/cl/9372044
Allocs of size 16 can bypass atomic set of the allocated bit, while allocs of size 8 can not.
Allocs with and w/o type info hit different paths inside of malloc.
Current results on linux/amd64:
BenchmarkMalloc8 50000000 43.6 ns/op
BenchmarkMalloc16 50000000 46.7 ns/op
BenchmarkMallocTypeInfo8 50000000 61.3 ns/op
BenchmarkMallocTypeInfo16 50000000 63.5 ns/op
R=golang-dev, remyoudompheng, minux.ma, bradfitz, iant
CC=golang-dev
https://golang.org/cl/9090045
for checking for page boundary. Also avoid boundary check
when >=16 bytes are hashed.
benchmark old ns/op new ns/op delta
BenchmarkHashStringSpeed 23 22 -0.43%
BenchmarkHashBytesSpeed 44 42 -3.61%
BenchmarkHashStringArraySpeed 71 68 -4.05%
R=iant, khr
CC=gobot, golang-dev, google
https://golang.org/cl/9123046
Finer-grained transfers were relevant with per-M caches,
with per-P caches they are not relevant and harmful for performance.
For few small size classes where it makes difference,
it's fine to grab the whole span (4K).
benchmark old ns/op new ns/op delta
BenchmarkMalloc 42 40 -4.45%
R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/9374043
This is needed for preemptive scheduler,
it will preempt only when m->locks==0,
and we do not want to be preempted while
we have not completely unlocked the lock.
R=golang-dev, khr, iant
CC=golang-dev
https://golang.org/cl/9196047
Also change table type from int32[] to int8[] to save space in L1$.
benchmark old ns/op new ns/op delta
BenchmarkMalloc 42 40 -4.68%
R=golang-dev, bradfitz, r
CC=golang-dev
https://golang.org/cl/9199044
Move the documentation from race.go to doc.go, because
race.go uses +build race, so it's not normally parsed by go doc.
Rephrase the documentation for end users, provide link to race
detector manual.
Fixes#5444.
R=golang-dev, minux.ma, adg, r
CC=golang-dev
https://golang.org/cl/9144050
runtime.park() can access freed select descriptor
due to a racing free in another thread.
See the comment for details.
Slightly modified version of dvyukov's CL 9259045.
No test yet. Before this CL, the test described in issue 5422
would fail about every 40 times for me. With this CL, I ran
the test 5900 times with no failures.
Fixes#5422.
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/9311043
The linker can generate split stack prolog when a textflag 7 function
makes an indirect function call. If it happens, badsignal() crashes
trying to dereference g.
Fixes#5337.
R=bradfitz, dave, adg, iant, r, minux.ma
CC=adonovan, golang-dev
https://golang.org/cl/9226043
runtime.setmg() calls another function (cgo_save_gm), so it must save
LR onto stack.
Re-enabled TestCthread test in misc/cgo/test.
Fixes#4863.
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/9019043
It works on i386, but fails on amd64 and arm.
««« original CL description
runtime: prevent the GC from seeing the content of a frame in runfinq()
Fixes#5348.
R=golang-dev, dvyukov
CC=golang-dev
https://golang.org/cl/8954044
»»»
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/8695051
This will let us ask people to rebuild the Go system without
precise GC, and then rebuild and retest their program, to see
if precise GC is causing whatever problem they are having.
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/8700043
UMTX_OP_WAIT expects that the address points to a uintptr, but
the code in lock_futex.c uses a uint32. UMTX_OP_WAIT_UINT is
just like UMTX_OP_WAIT, but the address points to a uint32.
This almost certainly makes no difference on a little-endian
system, but since the kernel supports it we should do the
right thing. And, who knows, maybe it matters.
R=golang-dev, bradfitz, r, ality
CC=golang-dev
https://golang.org/cl/8699043
The race detector uses a global lock to analyze atomic
operations. A panic in the middle of the code leaves the
lock acquired.
Similarly, the sync package may leave the race detectro
inconsistent when methods are called on nil pointers.
R=golang-dev, r, minux.ma, dvyukov, rsc, adg
CC=golang-dev
https://golang.org/cl/7981043
It's not trivial to make a comprehensive check
due to inferior pointers, reflect, gob, etc.
But this is essentially what I've used to debug
the GC issues.
Update #5193.
R=golang-dev, iant, 0xe2.0x9a.0x9b, r
CC=golang-dev
https://golang.org/cl/8455043
Use atomic operations on flags field to make sure we aren't
losing a flag update during parallel map operations.
R=golang-dev, dave, r
CC=golang-dev
https://golang.org/cl/8377046
The invariant is that there must be at least one running P or a thread polling network.
It was broken.
Fixes#5216.
R=golang-dev, bradfitz, r
CC=golang-dev
https://golang.org/cl/8459043
This makes it an unsafe.Pointer in Go so the garbage collector
will treat it as a pointer to untyped data, not a pointer to
bytes.
R=golang-dev, dvyukov
CC=golang-dev
https://golang.org/cl/8286045
If for whatever reason seh points into Go heap region,
the dangling pointer will cause memory corruption during GC.
Update #5193.
R=golang-dev, alex.brainman, iant
CC=golang-dev
https://golang.org/cl/8402045
Fixes#5175.
Race detector runtime expects values passed to MapShadow() to be page-aligned,
because they are used in mmap() call. If they are not aligned mmap() trims
either beginning or end of the mapping.
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/8325043
This changes the map lookup behavior for string maps with 2-8 keys.
There was already previously a fastpath for 0 items and 1 item.
Now, if a string-keyed map has <= 8 items, first check all the
keys for length first. If only one has the right length, then
just check it for equality and avoid hashing altogether. Once
the map has more than 8 items, always hash like normal.
I don't know why some of the other non-string map benchmarks
got faster. This was with benchtime=2s, multiple times. I haven't
anything else getting slower, though.
benchmark old ns/op new ns/op delta
BenchmarkHashStringSpeed 37 34 -8.20%
BenchmarkHashInt32Speed 32 29 -10.67%
BenchmarkHashInt64Speed 31 27 -12.82%
BenchmarkHashStringArraySpeed 105 99 -5.43%
BenchmarkMegMap 274206 255153 -6.95%
BenchmarkMegOneMap 27 23 -14.80%
BenchmarkMegEqMap 148332 116089 -21.74%
BenchmarkMegEmptyMap 4 3 -12.72%
BenchmarkSmallStrMap 22 22 -0.89%
BenchmarkMapStringKeysEight_32 42 23 -43.71%
BenchmarkMapStringKeysEight_64 55 23 -56.96%
BenchmarkMapStringKeysEight_1M 279688 24 -99.99%
BenchmarkIntMap 16 15 -10.18%
BenchmarkRepeatedLookupStrMapKey32 40 37 -8.15%
BenchmarkRepeatedLookupStrMapKey1M 287918 272980 -5.19%
BenchmarkNewEmptyMap 156 130 -16.67%
R=golang-dev, khr
CC=golang-dev
https://golang.org/cl/7641057
The expected precision setting for the x87 on Win32 is 53-bit
but MinGW resets the floating point unit to 64-bit. Win32
object code generally expects values to be rounded to double,
not double extended, precision.
R=golang-dev, khr
CC=golang-dev
https://golang.org/cl/8175044
Doing grow work on reads is not multithreaded safe.
Changed code to do grow work only on inserts & deletes.
This is a short-term fix, eventually we'll want to do
grow work in parallel to recover the space of the old
table.
Fixes#5120.
R=bradfitz, khr
CC=golang-dev
https://golang.org/cl/8242043
Motivated by garbage profiling in HTTP benchmarks. This
changes means new empty maps are just one small allocation
(the HMap) instead the HMap + the relatively larger h->buckets
allocation. This helps maps which remain empty throughout
their life.
benchmark old ns/op new ns/op delta
BenchmarkNewEmptyMap 196 107 -45.41%
benchmark old allocs new allocs delta
BenchmarkNewEmptyMap 2 1 -50.00%
benchmark old bytes new bytes delta
BenchmarkNewEmptyMap 195 50 -74.36%
R=khr, golang-dev, r
CC=golang-dev
https://golang.org/cl/7722046
A HMUL node appears in some constant divisions, but
to observe a false negative in race detector the divisor must be
suitably chosen to make sure the only memory access is
done for HMUL.
R=dvyukov
CC=golang-dev
https://golang.org/cl/7935045
For Go 1.1, stop checking the rlimit, because it broke now
that mheap is allocated using SysAlloc. See issue 5049.
R=r
CC=golang-dev
https://golang.org/cl/7741050
The arm gentraceback mishandled frame linkage values pointing
to the assembly return function. This function is special as
its frame size is zero and it contains only one instruction.
These conditions would preserve the frame pointer and result
in an off by one error when unwinding the caller.
Fixes#5124
R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/8023043
Prevents storm of error messages if something goes wrong.
In the case of issue 5073 the epoll fd was closed by the test.
Update #5073.
R=golang-dev, r, rsc
CC=golang-dev
https://golang.org/cl/7966043
Handle interface comparison correctly,
add a few more tests, mark more nodes as impossible.
R=dvyukov, golang-dev
CC=golang-dev
https://golang.org/cl/7942045
This keeps the logic about how to set the thread-local variables
m and g in code compiled and linked by the gc toolchain,
an important property for upcoming cgo changes.
It's also just a nice cleanup: one less place to update when
these details change.
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/7560048
The right operand of a && and || is only executed conditionnally,
so the instrumentation must be more careful. In particular
it should not turn nodes assumed to be cheap after walk into
expensive ones.
Update #4228
R=dvyukov, golang-dev
CC=golang-dev
https://golang.org/cl/7986043
The ARM implementation of runtime.cgocallback_gofunc diverged
from the calling convention by leaving a word of garbage at
the top of the stack and storing the return PC above the
locals. This change stores the return PC at the top of the
stack and removes the save area above the locals.
Update #5124
This CL fixes first part of the ARM issues and added the unwind test.
R=golang-dev, bradfitz, minux.ma, cshapiro, rsc
CC=golang-dev
https://golang.org/cl/7728045
Adds the new debugging constant 'checkgc'. If its value is non-zero
all calls to mallocgc() from hashmap.c will start a garbage collection.
Fixes#5074.
R=golang-dev, khr
CC=golang-dev, rsc
https://golang.org/cl/7663051
Fixes performance of the current windows network poller
with the new scheduler.
Gives runtime a hint when GetQueuedCompletionStatus() will block.
Fixes#5068.
benchmark old ns/op new ns/op delta
BenchmarkTCP4Persistent 4004000 33906 -99.15%
BenchmarkTCP4Persistent-2 21790 17513 -19.63%
BenchmarkTCP4Persistent-4 44760 34270 -23.44%
BenchmarkTCP4Persistent-6 45280 43000 -5.04%
R=golang-dev, alex.brainman, coocood, rsc
CC=golang-dev
https://golang.org/cl/7612045
I'm not sure how to write a test for this. The change in
behaviour is that if you somehow get a SIGBUS signal for an
address >= 0x1000, the program will now crash rather than
calling panic. As far as I know, on x86 GNU/Linux, the only
way to get a SIGBUS (rather than a SIGSEGV) is to set the
stack pointer to an invalid value.
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/7906045
On Darwin and FreeBSD, the mmap syscall return value is returned
unmodified. This means that the return value will either be a
valid address or a positive error number.
Also check return value from mmap in SysReserve - the callers of
SysReserve expect nil to be returned if the allocation failed.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/7871043
Rather than just checking for ENOMEM, check for a return value of less
than 4096, so that we catch other errors such as EACCES and EINVAL.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/7942043
Fixes#5061.
Current code relies on the fact that fd's are automatically removed from epoll set when closed. However, it is not true. Underlying file description is removed from epoll set only when *all* fd's referring to it are closed.
There are 2 bad consequences:
1. Kernel delivers notifications on already closed fd's.
2. The following sequence of events leads to error:
- add fd1 to epoll
- dup fd1 = fd2
- close fd1 (not removed from epoll since we've dup'ed the fd)
- dup fd2 = fd1 (get the same fd as fd1)
- add fd1 to epoll = EEXIST
So, if fd can be potentially dup'ed of fork'ed, it's necessary to explicitly remove the fd from epoll set.
R=golang-dev, bradfitz, dave
CC=golang-dev
https://golang.org/cl/7870043
Add missing CLOSUREVAR in switch.
Mark MAKE, string conversion nodes as impossible.
Control statements do not need instrumentation.
Instrument COM and LROT nodes.
Instrument map length.
Update #4228
R=dvyukov, golang-dev
CC=golang-dev
https://golang.org/cl/7504047
Hashtable is arranged as an array of
8-entry buckets with chained overflow.
Each bucket has 8 extra hash bits
per key to provide quick lookup within
a bucket. Table is grown incrementally.
Update #3885
Go time drops from 0.51s to 0.34s.
R=r, rsc, m3b, dave, bradfitz, khr, ugorji, remyoudompheng
CC=golang-dev
https://golang.org/cl/7504044
Inserting a key-value pair into a hashmap storing keys or values
indirectly can cause the garbage collector to find the hashmap in
an inconsistent state.
Fixes#5074.
R=golang-dev, minux.ma, rsc
CC=golang-dev
https://golang.org/cl/7913043
On NetBSD tv_sec is already an int64 so no need for a test.
On OpenBSD, semasleep expects a Unix time as argument,
and 1<<30 is in 2004.
R=golang-dev, iant
CC=golang-dev
https://golang.org/cl/7810044
The current SysAlloc implementation suffers from a signed vs unsigned
comparision bug. Since the error code from mmap is negated, the
unsigned comparision of v < 4096 is always false on error. Fix this
by switching to the darwin/freebsd/linux mmap model and leave the mmap
return value unmodified.
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/7870044
This provides a way to generate core dumps when people need them.
The settings are:
GOTRACEBACK=0 no traceback on panic, just exit
GOTRACEBACK=1 default - traceback on panic, then exit
GOTRACEBACK=2 traceback including runtime frames on panic, then exit
GOTRACEBACK=crash traceback including runtime frames on panic, then crash
Fixes#3257.
R=golang-dev, devon.odell, r, daniel.morsing, ality
CC=golang-dev
https://golang.org/cl/7666044
NEGL does a negation of the bottom 32 bits and then zero-extends to 64 bits,
resulting in a negative 32-bit number but a positive 64-bit number.
NEGQ does a full 64-bit negation, so that the result is negative both as
a 32-bit and as a 64-bit number.
This doesn't matter for the functions that are declared to return int32.
It only matters for the ones that return int64 or void* [sic].
This will fix the current incorrect error in the OpenBSD/amd64 build.
The build will still be broken, but it won't report a bogus error.
R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/7536046
Bring net/fd_linux.go back (it was deleted this morning)
because it is still needed for ARM.
Fix a few typos in the runtime reorg.
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/7759046
thread_GOOS.c becomes os_GOOS.c.
signal_GOOS_GOARCH.c becomes os_GOOS_GOARCH.c,
but with non-GOARCH-specific code moved into os_GOOS.c.
The actual arch-specific signal handler moves into signal_GOARCH.c
to avoid per-GOOS duplication.
New files signal_GOOS_GOARCH.h provide macros for
accessing fields of the very system-specific signal info structs.
Lots moving, but nothing changing.
This is a preliminarly cleanup so I can work on the signal
handling code to fix some open issues without having to
make each change 13 times.
Tested on Linux and OS X, 386 and amd64.
Will fix Plan 9, Windows, and ARM after the fact if necessary.
(Plan 9 and Windows should be fine; ARM will probably have some typos.)
Net effect: -1081 lines of code.
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/7565048
With the global redefinition of runtime·open by CL 7543043,
we need to provide a third argument and remove the cast
to the string.
Fixes build on 386 version of Plan 9.
R=khr, rsc, rminnich, ality
CC=golang-dev
https://golang.org/cl/7644047
Uses AES hardware instructions on 386/amd64 to implement
a fast hash function. Incorporates a random key to
thwart hash collision DOS attacks.
Depends on CL#7548043 for new assembly instructions.
Update #3885
Helps some by making hashing faster. Go time drops from
0.65s to 0.51s.
R=rsc, r, bradfitz, remyoudompheng, khr, dsymonds, minux.ma, elias.naur
CC=golang-dev
https://golang.org/cl/7543043
The issue was that scvg is assigned *after* the scavenger goroutine is started,
so when the scavenger calls entersyscall() the g==scvg check can fail.
Fixes#5025.
R=golang-dev, iant
CC=golang-dev
https://golang.org/cl/7629045
The problem is that there are lots of dead G's from previous tests,
each dead G consumes 1 stack segment.
Fixes#5034.
R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/7749043
The call to the C function runtime.findnull() requires
that we provide the argument at 0(SP).
R=rsc, rminnich, ality
CC=golang-dev
https://golang.org/cl/7559047
Otherwise the next goroutine run on the m
can get inadvertently locked if it executes a cgo call
that turns on the internal lock.
While we're here, fix the cgo panic unwind to
decrement m->ncgo like the non-panic unwind does.
Fixes#4971.
R=golang-dev, iant, dvyukov
CC=golang-dev
https://golang.org/cl/7627043
Now the default startup is that the program begins at _rt0_386_$GOOS,
which behaves as if calling main(argc, argv). Main jumps to _rt0_386.
This makes the _rt0_386 entry match the expected semantics for
the standard C "main" function, which we can now provide for use when
linking against a standard C library.
386 analogue of https://golang.org/cl/7525043
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/7551045
Change 231af8ac63aa (CL 7314062) made runtime.enteryscall()
set m->mcache = nil, which means that we can no longer use
syscall.errstr in syscall.Syscall and syscall.Syscall6, since it
requires a new buffer to be allocated for holding the error string.
Instead, we use pre-allocated per-M storage to hold error strings
from syscalls made while in entersyscall mode, and call
runtime.findnull to calculate the lengths.
Fixes#4994.
R=rsc, rminnich, ality, dvyukov, rminnich, r
CC=golang-dev
https://golang.org/cl/7567043
The deadlock episodically occurs on misc/cgo/test/TestCthread.
The problem is that starttheworld() leaves some P's with local work
without M's. Then all active M's enter into syscalls, but reject to
wake another M's due to the following check (both in entersyscallblock() and in retake()):
if(p->runqhead == p->runqtail &&
runtime·atomicload(&runtime·sched.nmspinning) +
runtime·atomicload(&runtime·sched.npidle) > 0)
continue;
R=rsc
CC=golang-dev
https://golang.org/cl/7424054
Still to do: non-linux and non-amd64.
It may work on other ELF-based amd64 systems too, but untested.
"go test -ldflags -hostobj $GOROOT/misc/cgo/test" passes.
Much may yet change, but this seems a reasonable checkpoint.
R=iant
CC=golang-dev
https://golang.org/cl/7369057
Now the default startup is that the program begins at _rt0_amd64_$GOOS,
which sets DI = argc, SI = argv and jumps to _rt0_amd64.
This makes the _rt0_amd64 entry match the expected semantics for
the standard C "main" function, which we can now provide for use when
linking against a standard C library.
R=golang-dev, devon.odell, minux.ma
CC=golang-dev
https://golang.org/cl/7525043
broke arm garbage collector
traceback_arm fails with a missing pc. It needs CL 7494043.
But that only makes the build break later, this time with
"invalid freelist". Roll back until it can be fixed correctly.
««« original CL description
runtime: restrict stack root scan to locals and arguments
R=rsc
CC=golang-dev
https://golang.org/cl/7301062
»»»
R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/7493044
If the constant CollectStats is non-zero and GOGCTRACE=1
the garbage collector will print basic statistics about executed
GC instructions.
R=golang-dev, dvyukov
CC=golang-dev, rsc
https://golang.org/cl/7413049
Fixes#4904.
The problem was that when the test runs the heap had grown to ~100MB,
so GC allows it to grow to 200MB, and so the test fails.
Moving the test to a separate process makes it much more isolated and stable.
R=golang-dev, minux.ma
CC=golang-dev
https://golang.org/cl/7441046
Also make the crossover point an architecture-dependent constant,
although it's the same everywhere for now.
BenchmarkAppendStr1Byte 416 145 -65.14%
BenchmarkAppendStr4Bytes 743 217 -70.79%
BenchmarkAppendStr8Bytes 421 270 -35.87%
BenchmarkAppendStr16Bytes 415 403 -2.89%
BenchmarkAppendStr32Bytes 415 391 -5.78%
R=golang-dev, iant
CC=golang-dev
https://golang.org/cl/7459044
Putting the M initialization in multiple places will not scale.
Various code assumes mstart is the start already. Make it so.
R=golang-dev, devon.odell
CC=golang-dev
https://golang.org/cl/7420048