1
0
mirror of https://github.com/golang/go synced 2024-10-05 02:21:22 -06:00
Commit Graph

71 Commits

Author SHA1 Message Date
Dmitriy Vyukov
5d637b83a9 runtime: speedup malloc stats collection
Count only number of frees, everything else is derivable
and does not need to be counted on every malloc.
benchmark                    old ns/op    new ns/op    delta
BenchmarkMalloc8                    68           66   -3.07%
BenchmarkMalloc16                   75           70   -6.48%
BenchmarkMallocTypeInfo8           102           97   -4.80%
BenchmarkMallocTypeInfo16          108          105   -2.78%

R=golang-dev, dave, rsc
CC=golang-dev
https://golang.org/cl/9776043
2013-06-06 14:56:50 +04:00
Anthony Martin
cdfbe00d91 runtime: fix description of SysAlloc
R=golang-dev, iant
CC=golang-dev
https://golang.org/cl/10010046
2013-06-04 17:12:29 -07:00
Dmitriy Vyukov
86da989ee5 runtime: introduce helper persistentalloc() function
It is a caching wrapper around SysAlloc() that can allocate small chunks.
Use it for symtab allocations. Reduces number of symtab walks from 4 to 3
(reduces buildfuncs time from 10ms to 7.5ms on a large binary,
reduces initial heap size by 680K on the same binary).
Also can be used for type info allocation, itab allocation.
There are also several places in GC where we do the same thing,
they can be changed to use persistentalloc().
Also can be used in FixAlloc, because each instance of FixAlloc allocates
in 128K regions, which is too eager.
Reincarnation of committed and rolled back https://golang.org/cl/9805043
The latent bugs that it revealed are fixed:
https://golang.org/cl/9837049
https://golang.org/cl/9778048

R=golang-dev, khr
CC=golang-dev
https://golang.org/cl/9778049
2013-05-31 10:42:30 +04:00
Dmitriy Vyukov
e17281b397 runtime: rename mheap.maps to mheap.spans
as was dicussed in cl/9791044

R=golang-dev, r
CC=golang-dev
https://golang.org/cl/9853046
2013-05-30 17:09:58 +04:00
Dmitriy Vyukov
8bbb08533d runtime: make mheap statically allocated again
This depends on: 9791044: runtime: allocate page table lazily
Once page table is moved out of heap, the heap becomes small.
This removes unnecessary dereferences during heap access.
No logical changes.

R=golang-dev, khr
CC=golang-dev
https://golang.org/cl/9802043
2013-05-28 22:14:47 +04:00
Dmitriy Vyukov
671814b904 runtime: allocate page table lazily
This removes the 256MB memory allocation at startup,
which conflicts with ulimit.
Also will allow to eliminate an unnecessary memory dereference in GC,
because the page table is usually mapped at known address.
Update #5049.
Update #5236.

R=golang-dev, khr, r, khr, rsc
CC=golang-dev
https://golang.org/cl/9791044
2013-05-28 22:04:34 +04:00
Dmitriy Vyukov
828c68f8d8 undo CL 9805043 / 776aba85ece8
multiple failures on amd64

««« original CL description
runtime: introduce helper persistentalloc() function
It is a caching wrapper around SysAlloc() that can allocate small chunks.
Use it for symtab allocations. Reduces number of symtab walks from 4 to 3
(reduces buildfuncs time from 10ms to 7.5ms on a large binary,
reduces initial heap size by 680K on the same binary).
Also can be used for type info allocation, itab allocation.
There are also several places in GC where we do the same thing,
they can be changed to use persistentalloc().
Also can be used in FixAlloc, because each instance of FixAlloc allocates
in 128K regions, which is too eager.

R=golang-dev, daniel.morsing, khr
CC=golang-dev
https://golang.org/cl/9805043
»»»

R=golang-dev
CC=golang-dev
https://golang.org/cl/9822043
2013-05-28 11:14:39 +04:00
Dmitriy Vyukov
5166013f75 runtime: inline MCache_Alloc() into mallocgc()
benchmark                    old ns/op    new ns/op    delta
BenchmarkMalloc8                    68           62   -8.63%
BenchmarkMalloc16                   75           69   -7.94%
BenchmarkMallocTypeInfo8           102           98   -3.73%
BenchmarkMallocTypeInfo16          108          103   -4.63%

R=golang-dev, dave, khr
CC=golang-dev
https://golang.org/cl/9790043
2013-05-28 11:05:55 +04:00
Dmitriy Vyukov
47e0a3d7b1 runtime: introduce helper persistentalloc() function
It is a caching wrapper around SysAlloc() that can allocate small chunks.
Use it for symtab allocations. Reduces number of symtab walks from 4 to 3
(reduces buildfuncs time from 10ms to 7.5ms on a large binary,
reduces initial heap size by 680K on the same binary).
Also can be used for type info allocation, itab allocation.
There are also several places in GC where we do the same thing,
they can be changed to use persistentalloc().
Also can be used in FixAlloc, because each instance of FixAlloc allocates
in 128K regions, which is too eager.

R=golang-dev, daniel.morsing, khr
CC=golang-dev
https://golang.org/cl/9805043
2013-05-28 10:47:35 +04:00
Dmitriy Vyukov
5782f4117d runtime: introduce cnewarray() to simplify allocation of typed arrays
R=golang-dev, dsymonds
CC=golang-dev
https://golang.org/cl/9648044
2013-05-27 11:29:11 +04:00
Dmitriy Vyukov
c075d82cca runtime: fix and speedup malloc stats
Currently per-sizeclass stats are lost for destroyed MCache's. This patch fixes this.
Also, only update mstats.heap_alloc on heap operations, because that's the only
stat that needs to be promptly updated. Everything else needs to be up-to-date only in ReadMemStats().

R=golang-dev, remyoudompheng, dave, iant
CC=golang-dev
https://golang.org/cl/9207047
2013-05-22 22:22:57 +04:00
Dmitriy Vyukov
c4cfef075e runtime: simplify MCache
The nlistmin/size thresholds are copied from tcmalloc,
but are unnecesary for Go malloc. We do not do explicit
frees into MCache. For sparse cases when we do (mainly hashmap),
simpler logic will do.

R=rsc, dave, iant
CC=gobot, golang-dev, r, remyoudompheng
https://golang.org/cl/9373043
2013-05-22 13:29:17 +04:00
Dmitriy Vyukov
23ad563119 runtime: transfer whole span from MCentral to MCache
Finer-grained transfers were relevant with per-M caches,
with per-P caches they are not relevant and harmful for performance.
For few small size classes where it makes difference,
it's fine to grab the whole span (4K).

benchmark          old ns/op    new ns/op    delta
BenchmarkMalloc           42           40   -4.45%

R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/9374043
2013-05-15 18:35:05 +04:00
Dmitriy Vyukov
5a89b35bca runtime: inline size to class conversion in malloc()
Also change table type from int32[] to int8[] to save space in L1$.

benchmark          old ns/op    new ns/op    delta
BenchmarkMalloc           42           40   -4.68%

R=golang-dev, bradfitz, r
CC=golang-dev
https://golang.org/cl/9199044
2013-05-15 11:02:33 +04:00
Shenghou Ma
b3b1efd882 runtime: reduce max arena size on windows/amd64 to 32 GiB
Update #5236
Update #5402
This CL reduces gofmt's committed memory from 545864 KiB to 139568 KiB.
Note: Go 1.0.3 uses about 70MiB.

R=golang-dev, r, iant, nightlyone
CC=golang-dev
https://golang.org/cl/9245043
2013-05-07 06:53:02 +08:00
Dmitriy Vyukov
cfe336770b runtime: replace union in MHeap with a struct
Unions break precise GC.
Update #5193.

R=golang-dev, iant
CC=golang-dev
https://golang.org/cl/8368044
2013-04-06 20:02:03 -07:00
Jan Ziak
a656f82071 runtime: precise garbage collection of channels
This changeset adds a mostly-precise garbage collection of channels.
The garbage collection support code in the linker isn't recognizing
channel types yet.

Fixes issue http://stackoverflow.com/questions/14712586/memory-consumption-skyrocket

R=dvyukov, rsc, bradfitz
CC=dave, golang-dev, minux.ma, remyoudompheng
https://golang.org/cl/7307086
2013-02-25 15:58:23 -05:00
Russ Cox
1903ad7189 cmd/gc, reflect, runtime: switch to indirect func value representation
Step 1 of http://golang.org/s/go11func.

R=golang-dev, r, daniel.morsing, remyoudompheng
CC=golang-dev
https://golang.org/cl/7393045
2013-02-21 17:01:13 -05:00
Russ Cox
8a6ff3ab34 runtime: allocate heap metadata at run time
Before, the mheap structure was in the bss,
but it's quite large (today, 256 MB, much of
which is never actually paged in), and it makes
Go binaries run afoul of exec-time bss size
limits on some BSD systems.

Fixes #4447.

R=golang-dev, dave, minux.ma, remyoudompheng, iant
CC=golang-dev
https://golang.org/cl/7307122
2013-02-15 14:27:03 -05:00
Russ Cox
472354f81e runtime/debug: add controls for garbage collector
Fixes #4090.

R=golang-dev, iant, bradfitz, dsymonds
CC=golang-dev
https://golang.org/cl/7229070
2013-02-04 00:00:55 -05:00
Shenghou Ma
4019d0e424 runtime: avoid defining the same variable in more than one translation unit
For gccgo runtime and Darwin where -fno-common is the default.

R=iant, dave
CC=golang-dev
https://golang.org/cl/7094061
2013-01-26 09:57:06 +08:00
Jan Ziak
9204eb4d3c runtime: interpret type information during garbage collection
R=rsc, dvyukov, remyoudompheng, dave, minux.ma, bradfitz
CC=golang-dev
https://golang.org/cl/6945069
2013-01-10 15:45:46 -05:00
Russ Cox
9799a5a4fd runtime: allow up to 128 GB of allocated memory
Incorporates code from CL 6828055.

Fixes #2142.

R=golang-dev, iant, devon.odell
CC=golang-dev
https://golang.org/cl/6826088
2012-11-13 12:45:08 -05:00
Jan Ziak
e0c9d04aec runtime: add memorydump() debugging function
R=golang-dev
CC=golang-dev, remyoudompheng, rsc
https://golang.org/cl/6780059
2012-11-01 12:56:25 -04:00
Jingcheng Zhang
5d05c7800e runtime: sizeclass in MSpan should be int32.
R=golang-dev, minux.ma, dave, rsc
CC=golang-dev
https://golang.org/cl/6643046
2012-10-21 20:32:43 -04:00
Jan Ziak
4a191c2c1b runtime: store types of allocated objects
R=rsc
CC=golang-dev
https://golang.org/cl/6569057
2012-10-21 17:41:32 -04:00
Shenghou Ma
c1b7ddc6aa runtime: update docs for MemStats.PauseNs
PauseNs is a circular buffer of recent pause times, and the
most recent one is at [((NumGC-1)+256)%256].

   Also fix comments cross-linking the Go and C definition of
various structs.

R=golang-dev, rsc, bradfitz
CC=golang-dev
https://golang.org/cl/6657047
2012-10-22 01:08:13 +08:00
Jan Ziak
f8c58373e5 runtime: add types to MSpan
R=rsc
CC=golang-dev
https://golang.org/cl/6554060
2012-09-24 20:08:05 -04:00
Russ Cox
0b08c9483f runtime: prepare for 64-bit ints
This CL makes the runtime understand that the type of
the len or cap of a map, slice, or string is 'int', not 'int32',
and it is also careful to distinguish between function arguments
and results of type 'int' vs type 'int32'.

In the runtime, the new typedefs 'intgo' and 'uintgo' refer
to Go int and uint. The C types int and uint continue to be
unavailable (cause intentional compile errors).

This CL does not change the meaning of int, but it should make
the eventual change of the meaning of int on amd64 a bit
smoother.

Update #2188.

R=iant, r, dave, remyoudompheng
CC=golang-dev
https://golang.org/cl/6551067
2012-09-24 14:58:34 -04:00
Dmitriy Vyukov
ed516df4e4 runtime: add freemcache() function
It will be required for scheduler that maintains
GOMAXPROCS MCache's.

R=golang-dev, r
CC=golang-dev
https://golang.org/cl/6350062
2012-07-01 13:10:01 +04:00
Dave Cheney
0b09425b5c runtime: use uintptr where possible in malloc stats
linux/arm OMAP4 pandaboard

benchmark                 old ns/op    new ns/op    delta
BenchmarkBinaryTree17   68723297000  37026214000  -46.12%
BenchmarkFannkuch11     34962402000  35958435000   +2.85%
BenchmarkGobDecode        137298600    124182150   -9.55%
BenchmarkGobEncode         60717160     60006700   -1.17%
BenchmarkGzip            5647156000   5550873000   -1.70%
BenchmarkGunzip          1196350000   1198670000   +0.19%
BenchmarkJSONEncode       863012800    782898000   -9.28%
BenchmarkJSONDecode      3312989000   2781800000  -16.03%
BenchmarkMandelbrot200     45727540     45703120   -0.05%
BenchmarkParse             74781800     59990840  -19.78%
BenchmarkRevcomp          140043650    139462300   -0.42%
BenchmarkTemplate        6467682000   5832153000   -9.83%

benchmark                  old MB/s     new MB/s  speedup
BenchmarkGobDecode             5.59         6.18    1.11x
BenchmarkGobEncode            12.64        12.79    1.01x
BenchmarkGzip                  3.44         3.50    1.02x
BenchmarkGunzip               16.22        16.19    1.00x
BenchmarkJSONEncode            2.25         2.48    1.10x
BenchmarkJSONDecode            0.59         0.70    1.19x
BenchmarkParse                 0.77         0.97    1.26x
BenchmarkRevcomp              18.15        18.23    1.00x
BenchmarkTemplate              0.30         0.33    1.10x

darwin/386 core duo

benchmark                 old ns/op    new ns/op    delta
BenchmarkBinaryTree17   10591616577   9678245733   -8.62%
BenchmarkFannkuch11     10758473315  10749303846   -0.09%
BenchmarkGobDecode         34379785     34121250   -0.75%
BenchmarkGobEncode         23523721     23475750   -0.20%
BenchmarkGzip            2486191492   2446539568   -1.59%
BenchmarkGunzip           444179328    444250293   +0.02%
BenchmarkJSONEncode       221138507    219757826   -0.62%
BenchmarkJSONDecode      1056034428   1048975133   -0.67%
BenchmarkMandelbrot200     19862516     19868346   +0.03%
BenchmarkRevcomp         3742610872   3724821662   -0.48%
BenchmarkTemplate         960283112    944791517   -1.61%

benchmark                  old MB/s     new MB/s  speedup
BenchmarkGobDecode            22.33        22.49    1.01x
BenchmarkGobEncode            32.63        32.69    1.00x
BenchmarkGzip                  7.80         7.93    1.02x
BenchmarkGunzip               43.69        43.68    1.00x
BenchmarkJSONEncode            8.77         8.83    1.01x
BenchmarkJSONDecode            1.84         1.85    1.01x
BenchmarkRevcomp              67.91        68.24    1.00x
BenchmarkTemplate              2.02         2.05    1.01x

R=rsc, 0xe2.0x9a.0x9b, mirtchovski
CC=golang-dev, minux.ma
https://golang.org/cl/6297047
2012-06-08 17:35:14 -04:00
Dmitriy Vyukov
b0702bd0db runtime: faster GC mark phase
Also bump MaxGcproc to 8.

benchmark             old ns/op    new ns/op    delta
Parser               3796323000   3763880000   -0.85%
Parser-2             3591752500   3518560250   -2.04%
Parser-4             3423825250   3334955250   -2.60%
Parser-8             3304585500   3267014750   -1.14%
Parser-16            3313615750   3286160500   -0.83%

Tree                  984128500    942501166   -4.23%
Tree-2                932564444    883266222   -5.29%
Tree-4                835831000    799912777   -4.30%
Tree-8                819238500    789717333   -3.73%
Tree-16               880837833    837840055   -5.13%

Tree2                 604698100    579716900   -4.13%
Tree2-2               372414500    356765200   -4.20%
Tree2-4               187488100    177455900   -5.56%
Tree2-8               136315300    102086700  -25.11%
Tree2-16               93725900     76705800  -22.18%

ParserPause           157441210    166202783   +5.56%
ParserPause-2          93842650     85199900   -9.21%
ParserPause-4          56844404     53535684   -5.82%
ParserPause-8          35739446     30767613  -16.15%
ParserPause-16         32718255     27212441  -16.83%

TreePause              29610557     29787725   +0.60%
TreePause-2            24001659     20674421  -13.86%
TreePause-4            15114887     12842781  -15.03%
TreePause-8            13128725     10741747  -22.22%
TreePause-16           16131360     12506901  -22.47%

Tree2Pause           2673350920   2651045280   -0.83%
Tree2Pause-2         1796999200   1709350040   -4.88%
Tree2Pause-4         1163553320   1090706480   -6.67%
Tree2Pause-8          987032520    858916360  -25.11%
Tree2Pause-16         864758560    809567480   -6.81%

ParserLastPause       280537000    289047000   +3.03%
ParserLastPause-2     183030000    166748000   -8.90%
ParserLastPause-4     105817000     91552000  -13.48%
ParserLastPause-8      65127000     53288000  -18.18%
ParserLastPause-16     45258000     38334000  -15.30%

TreeLastPause          45072000     51449000  +12.39%
TreeLastPause-2        39269000     37866000   -3.57%
TreeLastPause-4        23564000     20649000  -12.37%
TreeLastPause-8        20881000     15807000  -24.30%
TreeLastPause-16       23297000     17309000  -25.70%

Tree2LastPause       6046912000   5797120000   -4.13%
Tree2LastPause-2     3724034000   3567592000   -4.20%
Tree2LastPause-4     1874831000   1774524000   -5.65%
Tree2LastPause-8     1363108000   1020809000  -12.79%
Tree2LastPause-16     937208000    767019000  -22.18%

R=rsc, 0xe2.0x9a.0x9b
CC=golang-dev
https://golang.org/cl/6223050
2012-05-24 10:55:50 +04:00
Dmitriy Vyukov
01826280eb runtime: refactor helpgc functionality in preparation for parallel GC
Parallel GC needs to know in advance how many helper threads will be there.
Hopefully it's the last patch before I can tackle parallel sweep phase.
The benchmarks are unaffected.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6200064
2012-05-15 19:10:16 +04:00
Dmitriy Vyukov
eb0bc8164a runtime: revert MaxGcproc from 16 to 4
The change accidentally come in with this revision:
https://code.google.com/p/go/source/detail?spec=svn345cbca96c5550f2e89bc727703301933802923c&r=14c38c23c819a17021b1808cf4a34ef3a1a17db5

R=golang-dev
CC=golang-dev
https://golang.org/cl/6195073
2012-05-11 13:30:34 +04:00
Dmitriy Vyukov
c1c851bbe8 runtime: avoid unnecessary zeroization of huge memory blocks
+move zeroization out of the heap mutex

R=golang-dev, iant, rsc
CC=golang-dev
https://golang.org/cl/6094050
2012-05-02 18:01:11 +04:00
Dmitriy Vyukov
4945fc8e40 runtime: speedup GC sweep phase (batch free)
benchmark                             old ns/op    new ns/op    delta
garbage.BenchmarkParser              4370050250   3779668750  -13.51%
garbage.BenchmarkParser-2            3713087000   3628771500   -2.27%
garbage.BenchmarkParser-4            3519755250   3406349750   -3.22%
garbage.BenchmarkParser-8            3386627750   3319144000   -1.99%

garbage.BenchmarkTree                 493585529    408102411  -17.32%
garbage.BenchmarkTree-2               500487176    402285176  -19.62%
garbage.BenchmarkTree-4               473238882    361484058  -23.61%
garbage.BenchmarkTree-8               486977823    368334823  -24.36%

garbage.BenchmarkTree2                 31446600     31203200   -0.77%
garbage.BenchmarkTree2-2               21469000     21077900   -1.82%
garbage.BenchmarkTree2-4               11007600     10899100   -0.99%
garbage.BenchmarkTree2-8                7692400      7032600   -8.58%

garbage.BenchmarkParserPause          241863263    163249450  -32.50%
garbage.BenchmarkParserPause-2        120135418    112981575   -5.95%
garbage.BenchmarkParserPause-4         83411552     64580700  -22.58%
garbage.BenchmarkParserPause-8         51870697     42207244  -18.63%

garbage.BenchmarkTreePause             20940474     13147011  -37.22%
garbage.BenchmarkTreePause-2           20115124     11146715  -44.59%
garbage.BenchmarkTreePause-4           17217584      7486327  -56.52%
garbage.BenchmarkTreePause-8           18258845      7400871  -59.47%

garbage.BenchmarkTree2Pause           174067190    172674190   -0.80%
garbage.BenchmarkTree2Pause-2         131175809    130615761   -0.43%
garbage.BenchmarkTree2Pause-4          95406666     93972047   -1.50%
garbage.BenchmarkTree2Pause-8          86056095     85334952   -0.84%

garbage.BenchmarkParserLastPause      329932000    324790000   -1.56%
garbage.BenchmarkParserLastPause-2    209383000    210456000   +0.51%
garbage.BenchmarkParserLastPause-4    113981000    112921000   -0.93%
garbage.BenchmarkParserLastPause-8     77967000     76625000   -1.72%

garbage.BenchmarkTreeLastPause         29752000     18444000  -38.01%
garbage.BenchmarkTreeLastPause-2       24274000     14766000  -39.17%
garbage.BenchmarkTreeLastPause-4       19565000      8726000  -55.40%
garbage.BenchmarkTreeLastPause-8       21956000     10530000  -52.04%

garbage.BenchmarkTree2LastPause       314411000    311945000   -0.78%
garbage.BenchmarkTree2LastPause-2     214641000    210836000   -1.77%
garbage.BenchmarkTree2LastPause-4     110024000    108943000   -0.98%
garbage.BenchmarkTree2LastPause-8      76873000     70263000   -8.60%

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/5991049
2012-04-12 12:01:24 +04:00
Dmitriy Vyukov
342658bbb6 runtime: preparation for parallel GC
make MHeap.allspans an array instead on a linked-list,
it's required for parallel for

benchmark                              old ns/op    new ns/op    delta

garbage.BenchmarkTree                  494435529    487962705   -1.31%
garbage.BenchmarkTree-2                499652705    485358000   -2.86%
garbage.BenchmarkTree-4                468482117    454093117   -3.07%
garbage.BenchmarkTree-8                488533235    471872470   -3.41%
garbage.BenchmarkTree-16               507835176    492558470   -3.01%

garbage.BenchmarkTree2                  31453900     31404300   -0.16%
garbage.BenchmarkTree2-2                21440600     21477000   +0.17%
garbage.BenchmarkTree2-4                10982000     11117400   +1.23%
garbage.BenchmarkTree2-8                 7544700      7456700   -1.17%
garbage.BenchmarkTree2-16                7049500      6805700   -3.46%

garbage.BenchmarkParser               4448988000   4453264000   +0.10%
garbage.BenchmarkParser-2             4086045000   4057948000   -0.69%
garbage.BenchmarkParser-4             3677365000   3661246000   -0.44%
garbage.BenchmarkParser-8             3517253000   3540190000   +0.65%
garbage.BenchmarkParser-16            3506562000   3463478000   -1.23%

garbage.BenchmarkTreePause              20969784     21100238   +0.62%
garbage.BenchmarkTreePause-2            20215875     20139572   -0.38%
garbage.BenchmarkTreePause-4            17240709     16683624   -3.23%
garbage.BenchmarkTreePause-8            18196386     17639306   -3.06%
garbage.BenchmarkTreePause-16           20621158     20215056   -1.97%

garbage.BenchmarkTree2Pause            173992142    173872380   -0.07%
garbage.BenchmarkTree2Pause-2          131281904    131366666   +0.06%
garbage.BenchmarkTree2Pause-4           93484952     95109619   +1.74%
garbage.BenchmarkTree2Pause-8           88950523     86533333   -2.72%
garbage.BenchmarkTree2Pause-16          86071238     84089190   -2.30%

garbage.BenchmarkParserPause           135815000    135255952   -0.41%
garbage.BenchmarkParserPause-2          92691523     91451428   -1.34%
garbage.BenchmarkParserPause-4          53392190     51611904   -3.33%
garbage.BenchmarkParserPause-8          36059523     35116666   -2.61%
garbage.BenchmarkParserPause-16         30174300     27340600   -9.39%

garbage.BenchmarkTreeLastPause          28420000     29142000   +2.54%
garbage.BenchmarkTreeLastPause-2        23514000     26779000  +13.89%
garbage.BenchmarkTreeLastPause-4        21773000     18660000  -14.30%
garbage.BenchmarkTreeLastPause-8        24072000     21276000  -11.62%
garbage.BenchmarkTreeLastPause-16       25149000     28541000  +13.49%

garbage.BenchmarkTree2LastPause        314491000    313982000   -0.16%
garbage.BenchmarkTree2LastPause-2      214363000    214715000   +0.16%
garbage.BenchmarkTree2LastPause-4      109778000    111115000   +1.22%
garbage.BenchmarkTree2LastPause-8       75390000     74522000   -1.15%
garbage.BenchmarkTree2LastPause-16      70333000     67880000   -3.49%

garbage.BenchmarkParserLastPause       327247000    326815000   -0.13%
garbage.BenchmarkParserLastPause-2     217039000    212529000   -2.08%
garbage.BenchmarkParserLastPause-4     119722000    111535000   -6.84%
garbage.BenchmarkParserLastPause-8      70806000     69613000   -1.68%
garbage.BenchmarkParserLastPause-16     62813000     48009000  -23.57%

R=rsc, r
CC=golang-dev
https://golang.org/cl/5992055
2012-04-09 13:05:43 +04:00
Russ Cox
e4b02bfdc0 runtime: goroutine profile, stack dumps
R=golang-dev, r, r
CC=golang-dev
https://golang.org/cl/5687076
2012-02-22 21:45:01 -05:00
Sébastien Paolacci
5c598d3c9f runtime: release unused memory to the OS.
Periodically browse MHeap's freelists for long unused spans and release them if any.

Current hardcoded settings:
        - GC is forced if none occured over the last 2 minutes.
        - spans are handed back after 5 minutes of uselessness.

SysUnused (for Unix) is a wrapper on madvise MADV_DONTNEED on Linux and MADV_FREE on BSDs.

R=rsc, dvyukov, remyoudompheng
CC=golang-dev
https://golang.org/cl/5451057
2012-02-16 13:30:04 -05:00
Rémy Oudompheng
842c906e2e runtime: delete UpdateMemStats, replace with ReadMemStats(&stats).
Unexports runtime.MemStats and rename MemStatsType to MemStats.
The new accessor requires passing a pointer to a user-allocated
MemStats structure.

Fixes #2572.

R=bradfitz, rsc, bradfitz, gustavo
CC=golang-dev, remy
https://golang.org/cl/5616072
2012-02-06 19:16:26 +01:00
Russ Cox
a6d8b483b6 runtime: make garbage collector faster by deleting code
Suggested by Sanjay Ghemawat.  5-20% faster depending
on the benchmark.

Add tree2 garbage benchmark.
Update other garbage benchmarks to build again.

R=golang-dev, r, adg
CC=golang-dev
https://golang.org/cl/5530074
2012-01-10 19:49:11 -08:00
Dmitriy Vyukov
c14b2689f0 runtime: faster finalizers
Linux/amd64, 2 x Intel Xeon E5620, 8 HT cores, 2.40GHz
benchmark                    old ns/op    new ns/op    delta
BenchmarkFinalizer              420.00       261.00  -37.86%
BenchmarkFinalizer-2            985.00       201.00  -79.59%
BenchmarkFinalizer-4           1077.00       244.00  -77.34%
BenchmarkFinalizer-8           1155.00       180.00  -84.42%
BenchmarkFinalizer-16          1182.00       184.00  -84.43%

BenchmarkFinalizerRun          2128.00      1378.00  -35.24%
BenchmarkFinalizerRun-2        1655.00      1418.00  -14.32%
BenchmarkFinalizerRun-4        1634.00      1522.00   -6.85%
BenchmarkFinalizerRun-8        2213.00      1581.00  -28.56%
BenchmarkFinalizerRun-16       2424.00      1599.00  -34.03%

Darwin/amd64, Intel L9600, 2 cores, 2.13GHz
benchmark                    old ns/op    new ns/op    delta
BenchmarkChanCreation          1451.00       926.00  -36.18%
BenchmarkChanCreation-2        3124.00      1412.00  -54.80%
BenchmarkChanCreation-4        6121.00      2628.00  -57.07%

BenchmarkFinalizer              684.00       420.00  -38.60%
BenchmarkFinalizer-2          11195.00       398.00  -96.44%
BenchmarkFinalizer-4          15862.00       654.00  -95.88%

BenchmarkFinalizerRun          2025.00      1397.00  -31.01%
BenchmarkFinalizerRun-2        3920.00      1447.00  -63.09%
BenchmarkFinalizerRun-4        9471.00      1545.00  -83.69%

R=golang-dev, cw, rsc
CC=golang-dev
https://golang.org/cl/4963057
2011-10-06 18:42:51 +03:00
Dmitriy Vyukov
5695915833 runtime: fix spurious deadlock reporting
Fixes #2337.
Unfortunate sequence of events is:
1. maxcpu=2, mcpu=1, grunning=1
2. starttheworld creates an extra M:
   maxcpu=2, mcpu=2, grunning=1
4. the goroutine calls runtime.GOMAXPROCS(1)
   maxcpu=1, mcpu=2, grunning=1
5. since it sees mcpu>maxcpu, it calls gosched()
6. schedule() deschedules the goroutine:
   maxcpu=1, mcpu=1, grunning=0
7. schedule() call getnextandunlock() which
   fails to pick up the goroutine again,
   because canaddcpu() fails, because mcpu==maxcpu
8. then it sees that grunning==0,
   reports deadlock and terminates

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/5191044
2011-10-06 18:10:14 +03:00
Russ Cox
d324f2143b runtime: parallelize garbage collector mark + sweep
Running test/garbage/parser.out.

On a 4-core Lenovo X201s (Linux):
31.12u 0.60s 31.74r 	 1 cpu, no atomics
32.27u 0.58s 32.86r 	 1 cpu, atomic instructions
33.04u 0.83s 27.47r 	 2 cpu

On a 16-core Xeon (Linux):
33.08u 0.65s 33.80r 	 1 cpu, no atomics
34.87u 1.12s 29.60r 	 2 cpu
36.00u 1.87s 28.43r 	 3 cpu
36.46u 2.34s 27.10r 	 4 cpu
38.28u 3.85s 26.92r 	 5 cpu
37.72u 5.25s 26.73r	 6 cpu
39.63u 7.11s 26.95r	 7 cpu
39.67u 8.10s 26.68r	 8 cpu

On a 2-core MacBook Pro Core 2 Duo 2.26 (circa 2009, MacBookPro5,5):
39.43u 1.45s 41.27r 	 1 cpu, no atomics
43.98u 2.95s 38.69r 	 2 cpu

On a 2-core Mac Mini Core 2 Duo 1.83 (circa 2008; Macmini2,1):
48.81u 2.12s 51.76r 	 1 cpu, no atomics
57.15u 4.72s 51.54r 	 2 cpu

The handoff algorithm is really only good for two cores.
Beyond that we will need to so something more sophisticated,
like have each core hand off to the next one, around a circle.
Even so, the code is a good checkpoint; for now we'll limit the
number of gc procs to at most 2.

R=dvyukov
CC=golang-dev
https://golang.org/cl/4641082
2011-09-30 09:40:01 -04:00
Dmitriy Vyukov
27753ff108 runtime: add per-M caches for MemStats
Avoid touching centralized state during
memory manager operations.

R=mirtchovski
CC=golang-dev, rsc
https://golang.org/cl/4766042
2011-07-18 14:56:22 -04:00
Dmitriy Vyukov
66d5c9b1e9 runtime: add per-M caches for MemStats
Avoid touching centralized state during
memory manager opreations.

R=rsc
CC=golang-dev
https://golang.org/cl/4766042
2011-07-18 14:52:57 -04:00
Dmitriy Vyukov
c9152a8568 runtime: eliminate contention during stack allocation
Standard-sized stack frames use plain malloc/free
instead of centralized lock-protected FixAlloc.
Benchmark results on HP Z600 (2 x Xeon E5620, 8 HT cores, 2.40GHz)
are as follows:
benchmark                                        old ns/op    new ns/op    delta
BenchmarkStackGrowth                               1045.00       949.00   -9.19%
BenchmarkStackGrowth-2                             3450.00       800.00  -76.81%
BenchmarkStackGrowth-4                             5076.00       513.00  -89.89%
BenchmarkStackGrowth-8                             7805.00       471.00  -93.97%
BenchmarkStackGrowth-16                           11751.00       321.00  -97.27%

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/4657091
2011-07-12 09:24:32 -07:00
Russ Cox
1e063b32cd runtime: faster allocator, garbage collector
GC is still single-threaded.
Multiple threads will happen in another CL.

Garbage collection pauses are typically
about half as long as they were before this CL.

R=brainman, iant, r
CC=golang-dev
https://golang.org/cl/3975046
2011-02-02 23:03:47 -05:00
Russ Cox
4608feb18b runtime: simpler heap map, memory allocation
The old heap maps used a multilevel table, but that
was overkill: there are only 1M entries on a 32-bit
machine and we can arrange to use a dense address
range on a 64-bit machine.

The heap map is in bss.  The assumption is that if
we don't touch the pages they won't be mapped in.

Also moved some duplicated memory allocation
code out of the OS-specific files.

R=r
CC=golang-dev
https://golang.org/cl/4118042
2011-01-28 15:03:26 -05:00
Russ Cox
bcd910cfe2 runtime: add per-pause gc stats
R=r, r2
CC=golang-dev
https://golang.org/cl/3980042
2011-01-19 13:41:42 -05:00