Otherwise a poorly timed GC can collect the memory before it
is returned to the Go program.
R=golang-dev, dave, dvyukov, minux.ma
CC=golang-dev
https://golang.org/cl/6819119
The profiler collects goroutine blocking information similar to Google Perf Tools.
You may see an example of the profile (converted to svg) attached to
http://code.google.com/p/go/issues/detail?id=3946
The public API changes are:
+pkg runtime, func BlockProfile([]BlockProfileRecord) (int, bool)
+pkg runtime, func SetBlockProfileRate(int)
+pkg runtime, method (*BlockProfileRecord) Stack() []uintptr
+pkg runtime, type BlockProfileRecord struct
+pkg runtime, type BlockProfileRecord struct, Count int64
+pkg runtime, type BlockProfileRecord struct, Cycles int64
+pkg runtime, type BlockProfileRecord struct, embedded StackRecord
R=rsc, dave, minux.ma, r
CC=gobot, golang-dev, r, remyoudompheng
https://golang.org/cl/6443115
The assembly offsets were converted mechanically using
code.google.com/p/rsc/cmd/asmlint. The instruction
changes were done by hand.
Fixes#2188.
R=iant, r, bradfitz, remyoudompheng
CC=golang-dev
https://golang.org/cl/6550058
This CL makes the runtime understand that the type of
the len or cap of a map, slice, or string is 'int', not 'int32',
and it is also careful to distinguish between function arguments
and results of type 'int' vs type 'int32'.
In the runtime, the new typedefs 'intgo' and 'uintgo' refer
to Go int and uint. The C types int and uint continue to be
unavailable (cause intentional compile errors).
This CL does not change the meaning of int, but it should make
the eventual change of the meaning of int on amd64 a bit
smoother.
Update #2188.
R=iant, r, dave, remyoudompheng
CC=golang-dev
https://golang.org/cl/6551067
The change is a preparation for the new scheduler.
It introduces runtime.park() function,
that will atomically unlock the mutex and park the goroutine.
It will allow to remove the racy readyonstop flag
that is difficult to implement w/o the global scheduler mutex.
R=rsc, remyoudompheng, dave
CC=golang-dev
https://golang.org/cl/6501077
Depends on CL 6197045.
Result obtained on Core i7 620M, Darwin/amd64:
benchmark old ns/op new ns/op delta
BenchmarkComplex128DivNormal 57 28 -50.78%
BenchmarkComplex128DivNisNaN 49 15 -68.90%
BenchmarkComplex128DivDisNaN 49 15 -67.88%
BenchmarkComplex128DivNisInf 40 12 -68.50%
BenchmarkComplex128DivDisInf 33 13 -61.06%
Result obtained on Core i7 620M, Darwin/386:
benchmark old ns/op new ns/op delta
BenchmarkComplex128DivNormal 89 50 -44.05%
BenchmarkComplex128DivNisNaN 307 802 +161.24%
BenchmarkComplex128DivDisNaN 309 788 +155.02%
BenchmarkComplex128DivNisInf 278 237 -14.75%
BenchmarkComplex128DivDisInf 46 22 -52.46%
Result obtained on 700MHz OMAP4460, Linux/ARM:
benchmark old ns/op new ns/op delta
BenchmarkComplex128DivNormal 1557 465 -70.13%
BenchmarkComplex128DivNisNaN 1443 220 -84.75%
BenchmarkComplex128DivDisNaN 1481 218 -85.28%
BenchmarkComplex128DivNisInf 952 216 -77.31%
BenchmarkComplex128DivDisInf 861 231 -73.17%
The 386 version has a performance regression, but as we have
decided to use SSE2 instead of x87 FPU for 386 too (issue 3912),
I won't address this issue.
R=dsymonds, mchaten, iant, dave, mtj, rsc, r
CC=golang-dev
https://golang.org/cl/6024045
Move panic/defer/recover-related stuff from proc.c/runtime.c to a new file panic.c.
No semantic changes.
proc.c is 1800+ LOC and is a bit difficult to work with.
R=golang-dev, dave, r
CC=golang-dev
https://golang.org/cl/6343071
It's sad to introduce a new macro, but rnd shows up consistently
in profiles, and the function call overwhelms the two arithmetic
instructions it performs.
R=r
CC=golang-dev
https://golang.org/cl/6260051
Parallel GC needs to know in advance how many helper threads will be there.
Hopefully it's the last patch before I can tackle parallel sweep phase.
The benchmarks are unaffected.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6200064
Work around profiling kernel bug with signal masks.
Still broken on 64-bit Snow Leopard kernel,
but I think we can ignore that one and let people
upgrade to Lion.
Add new trivial tools addr2line and objdump to take
the place of the GNU tools of the same name, since
those are not installed on OS X.
Adapt pprof to invoke 'go tool addr2line' and
'go tool objdump' if the system tools do not exist.
Clean up disassembly of base register on amd64.
Fixes#2008.
R=golang-dev, bradfitz, mikioh.mikioh, r, iant
CC=golang-dev
https://golang.org/cl/5697066
It is possible that Linux and Windows copy the FP control word
from the parent thread when creating a new thread. Empirically,
Darwin does not. Reset the FP control world in all cases.
Enable the floating-point strconv test.
Fixes#2917 (again).
R=golang-dev, r, iant
CC=golang-dev
https://golang.org/cl/5660047
Restore package os/signal, with new API:
Notify replaces Incoming, allowing clients
to ask for certain signals only. Also, signals
go to everyone who asks, not just one client.
This could plausibly move into package os now
that there are no magic side effects as a result
of the import.
Update runtime for new API: move common Unix
signal handling code into signal_unix.c.
(It's so easy to do this now that we don't have
to edit Makefiles!)
Tested on darwin,linux 386,amd64.
Fixes#1266.
R=r, dsymonds, bradfitz, iant, borman
CC=golang-dev
https://golang.org/cl/3749041
unsafe: delete Typeof, Reflect, Unreflect, New, NewArray
Part of issue 2955 and issue 2968.
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/5650069
Same idea as heap profile: how did each thread get created?
Low memory (256 bytes per OS thread), high reward for
programs that suddenly have many threads running.
Fixes#1477.
R=golang-dev, r, dvyukov
CC=golang-dev
https://golang.org/cl/5639059
This patch adds a function to get the current cpu ticks. This is
deemed to be 'sufficiently random' to use to seed fastrand to mitigate
the algorithmic complexity attacks on the hash table implementation.
On AMD64 we use the RDTSC instruction. For 386, this instruction,
while valid, is not recognized by 8a so I've inserted the opcode by
hand. For ARM, this routine is currently stubbed to return a constant
0 value.
Future work: update 8a to recognize RDTSC.
Fixes#2630.
R=rsc
CC=golang-dev
https://golang.org/cl/5606048
Collapse the arch,os-specific directories into the main directory
by renaming xxx/foo.c to foo_xxx.c, and so on.
There are no substantial edits here, except to the Makefile.
The assumption is that the Go tool will #define GOOS_darwin
and GOARCH_amd64 and will make any file named something
like signals_darwin.h available as signals_GOOS.h during the
build. This replaces what used to be done with -I$(GOOS).
There is still work to be done to make runtime build with
standard tools, but this is a big step. After this we will have
to write a script to generate all the generated files so they
can be checked in (instead of generated during the build).
R=r, iant, r, lucio.dere
CC=golang-dev
https://golang.org/cl/5490053
To allow these types as map keys, we must fill in
equal and hash functions in their algorithm tables.
Structs or arrays that are "just memory", like [2]int,
can and do continue to use the AMEM algorithm.
Structs or arrays that contain special values like
strings or interface values use generated functions
for both equal and hash.
The runtime helper func runtime.equal(t, x, y) bool handles
the general equality case for x == y and calls out to
the equal implementation in the algorithm table.
For short values (<= 4 struct fields or array elements),
the sequence of elementwise comparisons is inlined
instead of calling runtime.equal.
R=ken, mpimenov
CC=golang-dev
https://golang.org/cl/5451105
Equality on structs will require arbitrary code for type equality,
so change algorithm in type data from uint8 to table pointer.
In the process, trim top-level map structure from
104/80 bytes (64-bit/32-bit) to 24/12.
Equality on structs will require being able to call code generated
by the Go compiler, and C code has no way to access Go return
values, so change the hash and equal algorithm functions to take
a pointer to a result instead of returning the result.
R=ken
CC=golang-dev
https://golang.org/cl/5453043
This looks like it is just moving some code from
time to runtime (and translating it to C), but the
runtime can do a better job managing the goroutines,
and it needs this functionality for its own maintenance
(for example, for the garbage collector to hand back
unused memory to the OS on a time delay).
Might as well have just one copy of the timer logic,
and runtime can't depend on time, so vice versa.
It also unifies Sleep, NewTicker, and NewTimer behind
one mechanism, so that there are no claims that one
is more efficient than another. (For example, today
people recommend using time.After instead of time.Sleep
to avoid blocking an OS thread.)
Fixes#1644.
Fixes#1731.
Fixes#2190.
R=golang-dev, r, hectorchu, iant, iant, jsing, alex.brainman, dvyukov
CC=golang-dev
https://golang.org/cl/5334051
runtime knows how to get the time of day
without allocating memory.
R=golang-dev, dsymonds, dave, hectorchu, r, cw
CC=golang-dev
https://golang.org/cl/5297078
We only guarantee that the main goroutine runs on the
main OS thread for initialization. Programs that wish to
preserve that property for main.main can call runtime.LockOSThread.
This is what programs used to do before we unleashed
goroutines during init, so it is both a simple fix and keeps
existing programs working.
R=iant, r, dave, dvyukov
CC=golang-dev
https://golang.org/cl/5309070
Running test/garbage/parser.out.
On a 4-core Lenovo X201s (Linux):
31.12u 0.60s 31.74r 1 cpu, no atomics
32.27u 0.58s 32.86r 1 cpu, atomic instructions
33.04u 0.83s 27.47r 2 cpu
On a 16-core Xeon (Linux):
33.08u 0.65s 33.80r 1 cpu, no atomics
34.87u 1.12s 29.60r 2 cpu
36.00u 1.87s 28.43r 3 cpu
36.46u 2.34s 27.10r 4 cpu
38.28u 3.85s 26.92r 5 cpu
37.72u 5.25s 26.73r 6 cpu
39.63u 7.11s 26.95r 7 cpu
39.67u 8.10s 26.68r 8 cpu
On a 2-core MacBook Pro Core 2 Duo 2.26 (circa 2009, MacBookPro5,5):
39.43u 1.45s 41.27r 1 cpu, no atomics
43.98u 2.95s 38.69r 2 cpu
On a 2-core Mac Mini Core 2 Duo 1.83 (circa 2008; Macmini2,1):
48.81u 2.12s 51.76r 1 cpu, no atomics
57.15u 4.72s 51.54r 2 cpu
The handoff algorithm is really only good for two cores.
Beyond that we will need to so something more sophisticated,
like have each core hand off to the next one, around a circle.
Even so, the code is a good checkpoint; for now we'll limit the
number of gc procs to at most 2.
R=dvyukov
CC=golang-dev
https://golang.org/cl/4641082