1
0
mirror of https://github.com/golang/go synced 2024-10-04 16:21:22 -06:00
Commit Graph

29 Commits

Author SHA1 Message Date
Dmitriy Vyukov
cd17a717f9 runtime: simpler and faster GC
Implement the design described in:
https://docs.google.com/document/d/1v4Oqa0WwHunqlb8C3ObL_uNQw3DfSY-ztoA-4wWbKcg/pub

Summary of the changes:
GC uses "2-bits per word" pointer type info embed directly into bitmap.
Scanning of stacks/data/heap is unified.
The old spans types go away.
Compiler generates "sparse" 4-bits type info for GC (directly for GC bitmap).
Linker generates "dense" 2-bits type info for data/bss (the same as stacks use).

Summary of results:
-1680 lines of code total (-1000+ in mgc0.c only)
-25% memory consumption
-3-7% binary size
-15% GC pause reduction
-7% run time reduction

LGTM=khr
R=golang-codereviews, rsc, christoph, khr
CC=golang-codereviews, rlh
https://golang.org/cl/106260045
2014-07-29 11:01:02 +04:00
Dmitriy Vyukov
0d72364616 runtime/race: update runtime to tip
This requires minimal changes to the runtime hooks. In particular,
synchronization events must be done only on valid addresses now,
so I've added the additional checks to race.c.

LGTM=iant
R=iant
CC=golang-codereviews
https://golang.org/cl/101000046
2014-06-20 16:36:21 -07:00
Dmitriy Vyukov
a1695d2ea3 runtime: use custom thunks for race calls instead of cgo
Implement custom assembly thunks for hot race calls (memory accesses and function entry/exit).
The thunks extract caller pc, verify that the address is in heap or global and switch to g0 stack.

Before:
ok  	regexp	3.692s
ok  	compress/bzip2	9.461s
ok  	encoding/json	6.380s
After:
ok  	regexp	2.229s (-40%)
ok  	compress/bzip2	4.703s (-50%)
ok  	encoding/json	3.629s (-43%)

For comparison, normal non-race build:
ok  	regexp	0.348s
ok  	compress/bzip2	0.304s
ok  	encoding/json	0.661s
Race build:
ok  	regexp	2.229s (+540%)
ok  	compress/bzip2	4.703s (+1447%)
ok  	encoding/json	3.629s (+449%)

Also removes some race-related special cases from cgocall and scheduler.
In long-term it will allow to remove cyclic runtime/race dependency on cmd/cgo.

Fixes #4249.
Fixes #7460.
Update #6508
Update #6688

R=iant, rsc, bradfitz
CC=golang-codereviews
https://golang.org/cl/55100044
2014-03-06 23:48:30 +04:00
Keith Randall
f59ea4e58b runtime: Fix race detector checks to ignore KindNoPointers bit
when comparing kinds.

R=dvyukov, dave, khr
CC=golang-codereviews
https://golang.org/cl/41660045
2014-01-04 08:43:17 -08:00
Dmitriy Vyukov
d017f578d0 runtime: do not preempt race calls
In the crash stack trace race cgocall() calls endcgo(),
this means that m->racecall is wrong.
Indeed this can happen is a goroutine is rescheduled to another M
during race call.
Disable preemption for race calls.
Fixes #6155.

R=golang-dev, rsc, cshapiro
CC=golang-dev
https://golang.org/cl/12866045
2013-08-19 23:06:46 +04:00
Keith Randall
e838334beb runtime: change textflags from numbers to symbols
R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/12798043
2013-08-12 13:47:18 -07:00
Dmitriy Vyukov
fa4628346b runtime: remove unused m->racepc
The original plan was to collect allocation stacks
for all memory blocks. But it was never implemented
and it's not in near plans and it's unclear how to do it at all.

R=golang-dev, dave, bradfitz
CC=golang-dev
https://golang.org/cl/12724044
2013-08-12 21:48:19 +04:00
Dmitriy Vyukov
23e15f7253 net: add special netFD mutex
The mutex, fdMutex, handles locking and lifetime of sysfd,
and serializes Read and Write methods.
This allows to strip 2 sync.Mutex.Lock calls,
2 sync.Mutex.Unlock calls, 1 defer and some amount
of misc overhead from every network operation.

On linux/amd64, Intel E5-2690:
benchmark                             old ns/op    new ns/op    delta
BenchmarkTCP4Persistent                    9595         9454   -1.47%
BenchmarkTCP4Persistent-2                  8978         8772   -2.29%
BenchmarkTCP4ConcurrentReadWrite           4900         4625   -5.61%
BenchmarkTCP4ConcurrentReadWrite-2         2603         2500   -3.96%

In general it strips 70-500 ns from every network operation depending
on processor model. On my relatively new E5-2690 it accounts to ~5%
of network op cost.

Fixes #6074.

R=golang-dev, bradfitz, alex.brainman, iant, mikioh.mikioh
CC=golang-dev
https://golang.org/cl/12418043
2013-08-09 21:43:00 +04:00
Rémy Oudompheng
3be794cdc2 cmd/gc: instrument arrays properly in race detector.
The previous implementation would only record access to
the address of the array but the memory access to the whole
memory range must be recorded instead.

R=golang-dev, dvyukov, r
CC=golang-dev
https://golang.org/cl/8053044
2013-06-14 11:14:45 +02:00
Dmitriy Vyukov
e2d95c1f24 runtime/race: remove now unused step parameter from range access functions
R=golang-dev, dave
CC=golang-dev
https://golang.org/cl/10259043
2013-06-13 16:38:44 +04:00
Dmitriy Vyukov
fc80764792 runtime/race: tell race detector what memory Read/Write syscalls touch
Fixes #5567.

R=golang-dev, dave, iant
CC=golang-dev
https://golang.org/cl/10085043
2013-06-10 22:40:35 +04:00
Dmitriy Vyukov
8bbb08533d runtime: make mheap statically allocated again
This depends on: 9791044: runtime: allocate page table lazily
Once page table is moved out of heap, the heap becomes small.
This removes unnecessary dereferences during heap access.
No logical changes.

R=golang-dev, khr
CC=golang-dev
https://golang.org/cl/9802043
2013-05-28 22:14:47 +04:00
Anthony Martin
2dc751ac21 runtime, cmd/gc: clean up function protoypes
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/8652043
2013-05-18 15:49:23 -07:00
Dmitriy Vyukov
12b7db3d57 runtime: fix data/bss shadow memory mapping for race detector
Fixes #5175.
Race detector runtime expects values passed to MapShadow() to be page-aligned,
because they are used in mmap() call. If they are not aligned mmap() trims
either beginning or end of the mapping.

R=golang-dev, r
CC=golang-dev
https://golang.org/cl/8325043
2013-04-04 09:11:34 +11:00
Rémy Oudompheng
e2f9e816b7 runtime: fix racefuncenter argument corruption.
Revision 6a88e1893941 corrupts the argument to
racefuncenter by pushing the data block pointer
to the stack.

Fixes #4885.

R=dvyukov, rsc
CC=golang-dev
https://golang.org/cl/7381053
2013-02-28 07:32:29 +01:00
Russ Cox
ec892be1af runtime: preserve DX during racefuncenter
R=golang-dev, dvyukov
CC=golang-dev
https://golang.org/cl/7382049
2013-02-22 13:06:43 -05:00
Dave Cheney
21327e1970 runtime: fix unused variable warning
R=rsc, dvyukov
CC=golang-dev
https://golang.org/cl/7312103
2013-02-16 14:32:04 +11:00
Russ Cox
8a6ff3ab34 runtime: allocate heap metadata at run time
Before, the mheap structure was in the bss,
but it's quite large (today, 256 MB, much of
which is never actually paged in), and it makes
Go binaries run afoul of exec-time bss size
limits on some BSD systems.

Fixes #4447.

R=golang-dev, dave, minux.ma, remyoudompheng, iant
CC=golang-dev
https://golang.org/cl/7307122
2013-02-15 14:27:03 -05:00
Dmitriy Vyukov
0a40cd2661 runtime/race: switch to explicit race context instead of goroutine id's
Removes limit on maximum number of goroutines ever existed.
code.google.com/p/goexecutor tests now pass successfully.
Also slightly improves performance.
Before: $ time ./flate.test -test.short
real	0m9.314s
After:  $ time ./flate.test -test.short
real	0m8.958s
Fixes #4286.
The runtime is built from llvm rev 174312.

R=rsc
CC=golang-dev
https://golang.org/cl/7218044
2013-02-06 11:40:54 +04:00
Rémy Oudompheng
ccc61eadd5 runtime: implement range access functions in race detector.
Range access functions are already available in TSan library
but were not yet used.

Time for go test -race -short:

Before:
compress/flate 24.244s
exp/norm       >200s
go/printer     78.268s

After:
compress/flate 17.760s
exp/norm        5.537s
go/printer      5.738s

Fixes #4250.

R=dvyukov, golang-dev, fullung
CC=golang-dev
https://golang.org/cl/7229044
2013-01-30 01:55:02 +01:00
Dmitriy Vyukov
0ce96f9ef4 runtime: better stack traces in race reports
When a race happens inside of runtime (chan, slice, etc),
currently reports contain only user file:line.
If the line contains a complex expression,
it's difficult to figure out where the race exactly.
This change adds one more top frame with exact
runtime function (e.g. runtime.chansend, runtime.mapaccess).

R=golang-dev
CC=golang-dev
https://golang.org/cl/6851125
2012-11-30 10:29:41 +04:00
Dmitriy Vyukov
8f82bff545 runtime: hide mheap from race detector
This significantly decreases amount of shadow memory
mapped by race detector.
I haven't tested on Windows, but on Linux it reduces
virtual memory size from 1351m to 330m for fmt.test.
Fixes #4379.

R=golang-dev, alex.brainman, iant
CC=golang-dev
https://golang.org/cl/6849057
2012-11-16 20:06:11 +04:00
Dmitriy Vyukov
51e89f59b2 runtime: add RaceRead/RaceWrite functions
It allows to catch e.g. a data race between atomic write and non-atomic write,
or Mutex.Lock() and mutex overwrite (e.g. mu = Mutex{}).

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6817103
2012-11-14 16:51:23 +04:00
Rémy Oudompheng
f59a605645 runtime: use runtime·callers when racefuncenter's pc is on the heap.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6821069
2012-11-07 21:35:21 +01:00
Dmitriy Vyukov
1a19f01a68 runtime/race: lazily allocate shadow memory
Currently race detector runtime maps shadow memory eagerly at process startup.
It works poorly on Windows, because Windows requires reservation in swap file
(especially problematic if several Go program runs at the same, each consuming GBs
of memory).
With this change race detector maps shadow memory lazily, so Go runtime must notify
about all new heap memory.
It will help with Windows port, but also eliminates scary 16TB virtual mememory
consumption in top output (which sometimes confuses some monitoring scripts).

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6811085
2012-11-07 12:48:58 +04:00
Dmitriy Vyukov
0ead18c59e runtime: mark race instrumentation callbacks as nosplitstack
It speedups the race detector somewhat, but also prevents
getcallerpc() from obtaining lessstack().

R=golang-dev, iant
CC=golang-dev
https://golang.org/cl/6812091
2012-11-06 20:54:22 +04:00
Rémy Oudompheng
ce287933d6 cmd/gc, runtime: pass PC directly to racefuncenter.
go test -race -run none -bench . encoding/json
benchmark                      old ns/op    new ns/op    delta
BenchmarkCodeEncoder          3207689000   1716149000  -46.50%
BenchmarkCodeMarshal          3206761000   1715677000  -46.50%
BenchmarkCodeDecoder          8647304000   4482709000  -48.16%
BenchmarkCodeUnmarshal        8032217000   3451248000  -57.03%
BenchmarkCodeUnmarshalReuse   8016722000   3480502000  -56.58%
BenchmarkSkipValue           10340453000   4560313000  -55.90%

benchmark                       old MB/s     new MB/s  speedup
BenchmarkCodeEncoder                0.60         1.13    1.88x
BenchmarkCodeMarshal                0.61         1.13    1.85x
BenchmarkCodeDecoder                0.22         0.43    1.95x
BenchmarkCodeUnmarshal              0.24         0.56    2.33x
BenchmarkCodeUnmarshalReuse         0.24         0.56    2.33x
BenchmarkSkipValue                  0.19         0.44    2.32x

Fixes #4248.

R=dvyukov, golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6815066
2012-11-01 19:43:29 +01:00
Dmitriy Vyukov
27e93fbd00 runtime: fix race detector handling of stackalloc()
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6632051
2012-10-10 18:06:29 +04:00
Dmitriy Vyukov
2f6cbc74f1 race: runtime changes
This is a part of a bigger change that adds data race detection feature:
https://golang.org/cl/6456044

R=rsc
CC=gobot, golang-dev
https://golang.org/cl/6535050
2012-10-07 22:05:32 +04:00