1
0
mirror of https://github.com/golang/go synced 2024-11-25 03:37:58 -07:00
Commit Graph

4351 Commits

Author SHA1 Message Date
Russ Cox
132c42ff73 bufio: use copy - significant speedup for writers
R=r
https://golang.org/cl/167047
2009-12-08 18:19:48 -08:00
Devon H. O'Dell
5a4a08fab8 Fix stack on FreeBSD / add stack check across the board
FreeBSD was passing stk as the new thread's stack base, while
stk is the top of the stack in go. The added check should cause
a trap if this ever comes up in any new ports, or regresses
in current ones.

R=rsc
CC=golang-dev
https://golang.org/cl/167055
2009-12-08 18:19:30 -08:00
Devon H. O'Dell
cdce7325c8 When SA_SIGINFO is set, we should use __sa_sigaction on FreeBSD
R=rsc
CC=golang-dev
https://golang.org/cl/165097
2009-12-08 18:18:04 -08:00
Russ Cox
b73b43ea31 6l, 8l: make string buffer big enough for 8 chars (and then some)
Fixes #221.

R=ken2
https://golang.org/cl/165086
2009-12-07 22:01:59 -08:00
Russ Cox
86c0c54d27 test/bench: faster fasta (mostly due to bufio fix)
R=r
https://golang.org/cl/165083
2009-12-07 19:39:09 -08:00
Russ Cox
0d3301a557 runtime: don't touch pages of memory unnecessarily.
cuts working size for hello world from 6 MB to 1.2 MB.
still some work to be done, but diminishing returns.

R=r
https://golang.org/cl/165080
2009-12-07 15:52:14 -08:00
Russ Cox
33649bd278 runtime: introduce unsafe.New and unsafe.NewArray
to provide functionality previously hacked in to
    reflect and gob.

R=r
https://golang.org/cl/165076
2009-12-07 15:51:58 -08:00
Robert Griesemer
a4a8224152 use a bootstrap array to avoid allocation for short vectors
R=r
https://golang.org/cl/165078
2009-12-07 12:46:20 -08:00
Christopher Wedgwood
8c22dd24e0 Remove copyBytes completely in favor of copy.
R=r, rsc
https://golang.org/cl/165068
2009-12-07 11:31:56 -08:00
Rob Pike
20c1ec263a pick off special one-byte case in copy. worth 2x in benchmarks (38ns->16ns).
the one-item case could be generalized easily with no cost. worth considering.

R=rsc
CC=golang-dev, cw
https://golang.org/cl/167044
2009-12-07 11:28:02 -08:00
Roger Peppe
80e17d6797 the AST walker currently provides no way to find out how the
nodes in the tree are nested with respect to one another.
a simple change to the Visitor interface makes it possible
to do this (for example to maintain a current node-depth, or a
knowledge of the name of the current function).

Visit(nil) is called at the end of a node's children;
this make possible the channel-based interface below,
amongst other possibilities.

It is still just as simple to get the original behaviour - just
return the same Visitor from Visit.

Here are a couple of possible Visitor types.

// closure-based
type FVisitor func(n interface{}) FVisitor
func (f FVisitor) Visit(n interface{}) Visitor {
	return f(n);
}

// channel-based
type CVisitor chan Visit;
type Visit struct {
	node interface{};
	reply chan CVisitor;
};
func (v CVisitor) Visit(n interface{}) Visitor
{
	if n == nil {
		close(v);
	} else {
		reply := make(chan CVisitor);
		v <- Visit{n, reply};
		r := <-reply;
		if r == nil {
			return nil;
		}
		return r;
	}
	return nil;
}

R=gri
CC=rsc
https://golang.org/cl/166047
2009-12-07 10:33:45 -08:00
Roger Peppe
ea98e4b5e9 changes necessary to get the new chameneosredux onto shootout.alioth.debian.org .
it's now there: http://shootout.alioth.debian.org/u32q/benchmark.php?test=chameneosredux&lang=all&box=1!

R=r, rsc
CC=golang-dev
https://golang.org/cl/167043
2009-12-07 10:06:51 -08:00
Rob Pike
f91cd44736 save a few ns by inlining (which mostly simplifies things anyway).
a couple of cleanups.
don't keep big buffers in the free list.

R=rsc
CC=golang-dev
https://golang.org/cl/166078
2009-12-06 15:01:07 -08:00
Rob Pike
353ef80f65 unexport Fmt. it's not needed outside this package any more
cleans up godoc's output for package fmt substantially.

R=rsc
CC=golang-dev
https://golang.org/cl/165070
2009-12-06 12:58:16 -08:00
Rob Pike
4c0e51cd43 Make printing faster by avoiding mallocs and some other advances.
Roughly 33% faster for simple cases, probably more for complex ones.

Before:

mallocs per Sprintf(""): 4
mallocs per Sprintf("xxx"): 6
mallocs per Sprintf("%x"): 10
mallocs per Sprintf("%x %x"): 12

Now:

mallocs per Sprintf(""): 2
mallocs per Sprintf("xxx"): 3
mallocs per Sprintf("%x"): 5
mallocs per Sprintf("%x %x"): 7

Speed improves because of avoiding mallocs and also by sharing a bytes.Buffer
between print.go and format.go rather than copying the data back after each
printed item.

Before:

fmt_test.BenchmarkSprintfEmpty	1000000	      1346 ns/op
fmt_test.BenchmarkSprintfString	500000	      3461 ns/op
fmt_test.BenchmarkSprintfInt	500000	      3671 ns/op

Now:

fmt_test.BenchmarkSprintfEmpty	 2000000	       995 ns/op
fmt_test.BenchmarkSprintfString	 1000000	      2745 ns/op
fmt_test.BenchmarkSprintfInt	 1000000	      2391 ns/op
fmt_test.BenchmarkSprintfIntInt	  500000	      3751 ns/op

I believe there is more to get but this is a good milestone.

R=rsc
CC=golang-dev, hong
https://golang.org/cl/166076
2009-12-06 12:03:52 -08:00
Russ Cox
ed6fd1bcbe runtime: disable pointer scan optimization
* broken by reflect, gob

TBR=r
https://golang.org/cl/166077
2009-12-06 08:18:58 -08:00
Ian Lance Taylor
44c1eb6bed Fix syscall.Statfs and syscall.Fstatfs for 386 GNU/Linux.
For 386 we use the [f]statfs64 system call, which takes three
parameters: the filename, the size of the statfs64 structure,
and a pointer to the structure itself.

R=rsc
https://golang.org/cl/166073
2009-12-04 21:58:32 -08:00
Russ Cox
864c6bcbc7 test/bench: use range in reverse-complement
1.9s	gcc reverse-complement.c

reverse-complement.go
4.5s / 3.5s	original, with/without bounds checks
3.5s / 3.3s	bounds check reduction
3.3s / 2.8s	smarter garbage collector
2.6s / 2.3s	assembler bytes.IndexByte
2.5s / 2.1s	even smarter garbage collector
2.3s / 2.1s	fix optimizer unnecessary spill bug
2.0s / 1.9s	change loop to range (this CL)

R=r
https://golang.org/cl/166072
2009-12-04 21:44:29 -08:00
Russ Cox
864c757a1c gc/runtime: pass type structure to makeslice.
* inform garbage collector about memory with no pointers in it

1.9s	gcc reverse-complement.c

reverse-complement.go
4.5s / 3.5s	original, with/without bounds checks
3.5s / 3.3s	bounds check reduction
3.3s / 2.8s	smarter garbage collector
2.6s / 2.3s		assembler bytes.IndexByte
2.5s / 2.1s	even smarter garbage collector (this CL)

R=r
https://golang.org/cl/165064
2009-12-04 21:44:05 -08:00
Russ Cox
6f14cada11 gc: walk pointer in range on slice/array
R=ken2
https://golang.org/cl/166071
2009-12-04 20:40:21 -08:00
Russ Cox
7c4aeec868 6g/8g optimizer fix: throw functions now in runtime
R=ken2
https://golang.org/cl/166070
2009-12-04 20:37:32 -08:00
Russ Cox
e2b23e42a8 test/bench: dead code in reverse-complement
R=r
https://golang.org/cl/165065
2009-12-04 19:25:25 -08:00
Russ Cox
d7402cea8c gotest: stop if the // gotest commands fail
R=r
https://golang.org/cl/166067
2009-12-04 18:34:59 -08:00
Russ Cox
2807621d01 net: more fiddling with the udp test.
i don't know why the timeout needs
  to be so big.

R=r
https://golang.org/cl/165063
2009-12-04 18:34:45 -08:00
Russ Cox
d539d079ad libmach: fix disassembly of MOVLQSX
R=r
https://golang.org/cl/166068
2009-12-04 18:34:35 -08:00
Russ Cox
01f0f16ebc gotest: ignore *_test.pb.go
R=r
https://golang.org/cl/166064
2009-12-04 17:08:54 -08:00
Ian Lance Taylor
9e0b68d1ee Add syscall.Rename for NaCl. Fixes NaCl build.
R=rsc
https://golang.org/cl/165062
2009-12-04 13:49:58 -08:00
Adam Langley
e79bcf8bfd runtime: shift the index for the sort by one.
Makes the code look cleaner, even if it's a little harder to figure
out from the sort invariants.

R=rsc
CC=golang-dev
https://golang.org/cl/165061
2009-12-04 13:31:18 -08:00
Ian Lance Taylor
0b5cc31693 Add os.Rename.
R=rsc
https://golang.org/cl/166058
2009-12-04 11:46:56 -08:00
Adam Langley
d1740bb3a6 Remove global chanlock.
On a microbenchmark that ping-pongs on lots of channels, this makes
the multithreaded case about 20% faster and the uniprocessor case
about 1% slower. (Due to cache effects, I expect.)

R=rsc, agl
CC=golang-dev
https://golang.org/cl/166043
2009-12-04 10:57:01 -08:00
Russ Cox
d6b3f37e1e bytes: asm for bytes.IndexByte
PERFORMANCE DIFFERENCE

SUMMARY

                                                   amd64           386
2.2 GHz AMD Opteron 8214 HE (Linux)             3.0x faster    8.2x faster
3.60 GHz Intel Xeon (Linux)                     2.2x faster    6.2x faster
2.53 GHz Intel Core2 Duo E7200 (Linux)          1.5x faster    4.4x faster
2.66 Ghz Intel Xeon 5150 (Mac Pro, OS X)        1.5x SLOWER    3.0x faster
2.33 GHz Intel Xeon E5435 (Linux)               1.5x SLOWER    3.0x faster
2.33 GHz Intel Core2 T7600 (MacBook Pro, OS X)  1.4x SLOWER    3.0x faster
1.83 GHz Intel Core2 T5600 (Mac Mini, OS X)        none*       3.0x faster

* but yesterday I consistently saw 1.4x SLOWER.

DETAILS

2.2 GHz AMD Opteron 8214 HE (Linux)

amd64 (3x faster)

IndexByte4K            500000           3733 ns/op     1097.24 MB/s
IndexByte4M               500        4328042 ns/op      969.10 MB/s
IndexByte64M               50       67866160 ns/op      988.84 MB/s

IndexBytePortable4K    200000          11161 ns/op      366.99 MB/s
IndexBytePortable4M       100       11795880 ns/op      355.57 MB/s
IndexBytePortable64M       10      188675000 ns/op      355.68 MB/s

386 (8.2x faster)

IndexByte4K            500000           3734 ns/op     1096.95 MB/s
IndexByte4M               500        4209954 ns/op      996.28 MB/s
IndexByte64M               50       68031980 ns/op      986.43 MB/s

IndexBytePortable4K     50000          30670 ns/op      133.55 MB/s
IndexBytePortable4M        50       31868220 ns/op      131.61 MB/s
IndexBytePortable64M        2      508851500 ns/op      131.88 MB/s

3.60 GHz Intel Xeon (Linux)

amd64 (2.2x faster)

IndexByte4K            500000           4612 ns/op      888.12 MB/s
IndexByte4M               500        4835250 ns/op      867.44 MB/s
IndexByte64M               20       77388450 ns/op      867.17 MB/s

IndexBytePortable4K    200000          10306 ns/op      397.44 MB/s
IndexBytePortable4M       100       11201460 ns/op      374.44 MB/s
IndexBytePortable64M       10      179456800 ns/op      373.96 MB/s

386 (6.3x faster)

IndexByte4K            500000           4631 ns/op      884.47 MB/s
IndexByte4M               500        4846388 ns/op      865.45 MB/s
IndexByte64M               20       78691200 ns/op      852.81 MB/s

IndexBytePortable4K    100000          28989 ns/op      141.29 MB/s
IndexBytePortable4M        50       31183180 ns/op      134.51 MB/s
IndexBytePortable64M        5      498347200 ns/op      134.66 MB/s

2.53 GHz Intel Core2 Duo E7200  (Linux)

amd64 (1.5x faster)

IndexByte4K            500000           6502 ns/op      629.96 MB/s
IndexByte4M               500        6692208 ns/op      626.74 MB/s
IndexByte64M               10      107410400 ns/op      624.79 MB/s

IndexBytePortable4K    200000           9721 ns/op      421.36 MB/s
IndexBytePortable4M       100       10013680 ns/op      418.86 MB/s
IndexBytePortable64M       10      160460800 ns/op      418.23 MB/s

386 (4.4x faster)

IndexByte4K            500000           6505 ns/op      629.67 MB/s
IndexByte4M               500        6694078 ns/op      626.57 MB/s
IndexByte64M               10      107397600 ns/op      624.86 MB/s

IndexBytePortable4K    100000          28835 ns/op      142.05 MB/s
IndexBytePortable4M        50       29562680 ns/op      141.88 MB/s
IndexBytePortable64M        5      473221400 ns/op      141.81 MB/s

2.66 Ghz Intel Xeon 5150  (Mac Pro, OS X)

amd64 (1.5x SLOWER)

IndexByte4K            200000           9290 ns/op      440.90 MB/s
IndexByte4M               200        9568925 ns/op      438.33 MB/s
IndexByte64M               10      154473600 ns/op      434.44 MB/s

IndexBytePortable4K    500000           6202 ns/op      660.43 MB/s
IndexBytePortable4M       500        6583614 ns/op      637.08 MB/s
IndexBytePortable64M       20      107166250 ns/op      626.21 MB/s

386 (3x faster)

IndexByte4K            200000           9301 ns/op      440.38 MB/s
IndexByte4M               200        9568025 ns/op      438.37 MB/s
IndexByte64M               10      154391000 ns/op      434.67 MB/s

IndexBytePortable4K    100000          27526 ns/op      148.80 MB/s
IndexBytePortable4M       100       28302490 ns/op      148.20 MB/s
IndexBytePortable64M        5      454170200 ns/op      147.76 MB/s

2.33 GHz Intel Xeon E5435  (Linux)

amd64 (1.5x SLOWER)

IndexByte4K            200000          10601 ns/op      386.38 MB/s
IndexByte4M               100       10827240 ns/op      387.38 MB/s
IndexByte64M               10      173175500 ns/op      387.52 MB/s

IndexBytePortable4K    500000           7082 ns/op      578.37 MB/s
IndexBytePortable4M       500        7391792 ns/op      567.43 MB/s
IndexBytePortable64M       20      122618550 ns/op      547.30 MB/s

386 (3x faster)

IndexByte4K            200000          11074 ns/op      369.88 MB/s
IndexByte4M               100       10902620 ns/op      384.71 MB/s
IndexByte64M               10      181292800 ns/op      370.17 MB/s

IndexBytePortable4K     50000          31725 ns/op      129.11 MB/s
IndexBytePortable4M        50       32564880 ns/op      128.80 MB/s
IndexBytePortable64M        2      545926000 ns/op      122.93 MB/s

2.33 GHz Intel Core2 T7600 (MacBook Pro, OS X)

amd64 (1.4x SLOWER)

IndexByte4K            200000          11120 ns/op      368.35 MB/s
IndexByte4M               100       11531950 ns/op      363.71 MB/s
IndexByte64M               10      184819000 ns/op      363.11 MB/s

IndexBytePortable4K    500000           7419 ns/op      552.10 MB/s
IndexBytePortable4M       200        8018710 ns/op      523.06 MB/s
IndexBytePortable64M       10      127614900 ns/op      525.87 MB/s

386 (3x faster)

IndexByte4K            200000          11114 ns/op      368.54 MB/s
IndexByte4M               100       11443530 ns/op      366.52 MB/s
IndexByte64M               10      185212000 ns/op      362.34 MB/s

IndexBytePortable4K     50000          32891 ns/op      124.53 MB/s
IndexBytePortable4M        50       33930580 ns/op      123.61 MB/s
IndexBytePortable64M        2      545400500 ns/op      123.05 MB/s

1.83 GHz Intel Core2 T5600  (Mac Mini, OS X)

amd64 (no difference)

IndexByte4K            200000          13497 ns/op      303.47 MB/s
IndexByte4M               100       13890650 ns/op      301.95 MB/s
IndexByte64M                5      222358000 ns/op      301.81 MB/s

IndexBytePortable4K    200000          13584 ns/op      301.53 MB/s
IndexBytePortable4M       100       13913280 ns/op      301.46 MB/s
IndexBytePortable64M       10      222572600 ns/op      301.51 MB/s

386 (3x faster)

IndexByte4K            200000          13565 ns/op      301.95 MB/s
IndexByte4M               100       13882640 ns/op      302.13 MB/s
IndexByte64M                5      221411600 ns/op      303.10 MB/s

IndexBytePortable4K     50000          39978 ns/op      102.46 MB/s
IndexBytePortable4M        50       41038160 ns/op      102.20 MB/s
IndexBytePortable64M        2      656362500 ns/op      102.24 MB/s

R=r
CC=golang-dev
https://golang.org/cl/166055
2009-12-04 10:23:43 -08:00
Russ Cox
2a5f0c67ca spec: document that built-ins cannot be used as func values
R=gri
CC=golang-dev
https://golang.org/cl/164088
2009-12-04 10:23:12 -08:00
Russ Cox
609eeee817 make Native Client support build again,
add README explaining how to try the
web demos.

Fixes #339.

R=r
CC=barry.d.silverman, bss, vadim
https://golang.org/cl/165057
2009-12-04 10:11:32 -08:00
Russ Cox
11384eecf8 testing: compute MB/s in benchmarks
R=r
https://golang.org/cl/166060
2009-12-04 09:56:31 -08:00
Rob Pike
4ed57173b4 avoid an allocation inside bytes.Buffer by providing a static array.
R=rsc
https://golang.org/cl/165058
2009-12-04 00:26:08 -08:00
Russ Cox
f2c7a20142 8l: fix print line number format, buffer overflow
R=ken2
https://golang.org/cl/165059
2009-12-03 23:29:48 -08:00
Russ Cox
3b858fb808 net: turn off empty packet test by default
Fixes #374.

R=r
https://golang.org/cl/166053
2009-12-03 22:19:55 -08:00
Russ Cox
9da6666a8a gc: check for assignment to private fields during initialization
R=ken2
https://golang.org/cl/165055
2009-12-03 22:09:58 -08:00
Ken Thompson
62be24d949 6g code gen bug
R=rsc
https://golang.org/cl/166052
2009-12-03 20:28:24 -08:00
Michael Elkins
f3d63bea42 Add Count, Cycle, ZipWith, GroupBy, Repeat, RepeatTimes, Unique to exp/iterable.
Modify iterFunc to take chan<- instead of just chan.

R=rsc, dsymonds1
CC=golang-dev, r
https://golang.org/cl/160064
2009-12-03 20:03:07 -08:00
Adam Langley
e93132c982 crypto/rsa: fix shadowing error.
Fixes bug 375.

R=rsc
https://golang.org/cl/165045
2009-12-03 19:33:23 -08:00
Russ Cox
cf37254b1c runtime: fix Caller crash on 386.
Fixes #176.

R=r
https://golang.org/cl/166044
2009-12-03 17:24:14 -08:00
Russ Cox
6301fb4134 faq: add question about translation
R=jini, r
https://golang.org/cl/163092
2009-12-03 17:23:33 -08:00
Russ Cox
9a86cc679a codereview: do not gofmt deleted files
R=r
https://golang.org/cl/164083
2009-12-03 17:23:11 -08:00
Russ Cox
aaa2374b74 Make.conf: fix if $HOME has spaces
R=r
https://golang.org/cl/164086
2009-12-03 17:22:43 -08:00
Russ Cox
7e5055ceea runtime: malloc fixes
* throw away dead code
  * add mlookup counter
  * add malloc counter
  * set up for blocks with no pointers

Fixes #367.

R=r
https://golang.org/cl/165050
2009-12-03 17:22:23 -08:00
Rob Pike
10a349a7c1 The String() method requires global state that makes it not work outside of this package,
so make it a local method (_String()).

R=rsc
CC=golang-dev
https://golang.org/cl/165049
2009-12-03 17:14:32 -08:00
Rob Pike
fcc4dd6d64 error propagation in gob/encoder.
R=rsc
CC=golang-dev
https://golang.org/cl/165048
2009-12-03 17:12:57 -08:00
Rob Pike
bc3e34759c Add ReadFrom and WriteTo methods to bytes.Buffer, to enable i/o without buffer allocation.
Use them in Copy and Copyn.
Speed up ReadFile by using ReadFrom and avoiding Copy altogether (a minor win).

R=rsc, gri
CC=golang-dev
https://golang.org/cl/166041
2009-12-03 12:56:16 -08:00
Christopher Wedgwood
7e7008fa5e gc: Allow allow data types up to 1GB
R=rsc
https://golang.org/cl/164095
2009-12-03 12:46:34 -08:00