New make target "testshort" runs "gotest -test.short" and is invoked
by run.bash, which is invoked by all.bash.
Use -test.short to make one package (crypto ecdsa) run much faster.
More changes to come.
Once this is in, I will update the long-running tests to use the new flag.
R=rsc
CC=golang-dev
https://golang.org/cl/4317043
Fixes#1641.
Actually it side steps the real issue, which is that the
setitimer(2) implementation on OS X is not useful for
profiling of multi-threaded programs. I filed the below
using the Apple Bug Reporter.
/*
Filed as Apple Bug Report #9177434.
This program creates a new pthread that loops, wasting cpu time.
In the main pthread, it sleeps on a condition that will never come true.
Before doing so it sets up an interval timer using ITIMER_PROF.
The handler prints a message saying which thread it is running on.
POSIX does not specify which thread should receive the signal, but
in order to be useful in a user-mode self-profiler like pprof or gprof
http://code.google.com/p/google-perftoolshttp://www.delorie.com/gnu/docs/binutils/gprof_25.html
it is important that the thread that receives the signal is the one
whose execution caused the timer to expire.
Linux and FreeBSD handle this by sending the signal to the process's
queue but delivering it to the current thread if possible:
http://lxr.linux.no/linux+v2.6.38/kernel/signal.c#L802
807 /*
808 * Now find a thread we can wake up to take the signal off the queue.
809 *
810 * If the main thread wants the signal, it gets first crack.
811 * Probably the least surprising to the average bear.
812 * /
http://fxr.watson.org/fxr/source/kern/kern_sig.c?v=FREEBSD8;im=bigexcerpts#L1907
1914 /*
1915 * Check if current thread can handle the signal without
1916 * switching context to another thread.
1917 * /
On those operating systems, this program prints:
$ ./a.out
signal on cpu-chewing looper thread
signal on cpu-chewing looper thread
signal on cpu-chewing looper thread
signal on cpu-chewing looper thread
signal on cpu-chewing looper thread
signal on cpu-chewing looper thread
signal on cpu-chewing looper thread
signal on cpu-chewing looper thread
signal on cpu-chewing looper thread
signal on cpu-chewing looper thread
$
The OS X kernel does not have any such preference. Its get_signalthread
does not prefer current_thread(), in contrast to the other two systems,
so the signal gets delivered to the first thread in the list that is able to
handle it, which ends up being the main thread in this experiment.
http://fxr.watson.org/fxr/source/bsd/kern/kern_sig.c?v=xnu-1456.1.26;im=excerpts#L1666
$ ./a.out
signal on sleeping main thread
signal on sleeping main thread
signal on sleeping main thread
signal on sleeping main thread
signal on sleeping main thread
signal on sleeping main thread
signal on sleeping main thread
signal on sleeping main thread
signal on sleeping main thread
signal on sleeping main thread
$
The fix is to make get_signalthread use the same heuristic as
Linux and FreeBSD, namely to use current_thread() if possible
before scanning the process thread list.
*/
#include <sys/time.h>
#include <sys/signal.h>
#include <pthread.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
static void handler(int);
static void* looper(void*);
static pthread_t pmain, ploop;
int
main(void)
{
struct itimerval it;
struct sigaction sa;
pthread_cond_t cond;
pthread_mutex_t mu;
memset(&sa, 0, sizeof sa);
sa.sa_handler = handler;
sa.sa_flags = SA_RESTART;
memset(&sa.sa_mask, 0xff, sizeof sa.sa_mask);
sigaction(SIGPROF, &sa, 0);
pmain = pthread_self();
pthread_create(&ploop, 0, looper, 0);
memset(&it, 0, sizeof it);
it.it_interval.tv_usec = 10000;
it.it_value = it.it_interval;
setitimer(ITIMER_PROF, &it, 0);
pthread_mutex_init(&mu, 0);
pthread_mutex_lock(&mu);
pthread_cond_init(&cond, 0);
for(;;)
pthread_cond_wait(&cond, &mu);
return 0;
}
static void
handler(int sig)
{
static int nsig;
pthread_t p;
p = pthread_self();
if(p == pmain)
printf("signal on sleeping main thread\n");
else if(p == ploop)
printf("signal on cpu-chewing looper thread\n");
else
printf("signal on %p\n", (void*)p);
if(++nsig >= 10)
exit(0);
}
static void*
looper(void *v)
{
for(;;);
}
R=r
CC=golang-dev
https://golang.org/cl/4273113
Also fix comment.
The only caller of chanrecv initializes the value to false, so
this patch makes no difference at present. But it seems like
the right thing to do.
R=rsc
CC=golang-dev
https://golang.org/cl/4312053
Correctly distinguish between lhs and rhs identifiers
and resolve/declare them accordingly.
Collect field and method names in respective scopes
(will be available after some minor AST API changes).
Also collect imports since it's useful to have that
list directly w/o having to re-traverse the AST
(will also be available after some minor AST API changes).
No external API changes in this CL.
R=rsc, rog
CC=golang-dev
https://golang.org/cl/4271061
The top level bytes.Buffer is always there and can be re-used.
Rpc goes from 83 to 79 mallocs per round trip.
R=rsc
CC=golang-dev
https://golang.org/cl/4271062
- StartProcess will work with relative (to attr.Dir, not
current directory) executable filenames
- StartProcess will only work if executable filename points
to the real file, it will not search for executable in the
$PATH list and others (see CreateProcess manual for details)
- StartProcess argv strings can contain any characters
R=golang-dev, r
CC=golang-dev
https://golang.org/cl/4306041
This patch adds a connection cache and keep-alive
support to Transport, which is used by the
HTTP client.
It's also structured such that it's easy to add
HTTP pipelining in the future.
R=rsc, petar-m, bradfitzwork, r
CC=golang-dev
https://golang.org/cl/4272045
Revert changes to printer.Config. Pass in the
nodeSizes map trough an internal helper function.
R=golang-dev, rsc1
CC=golang-dev
https://golang.org/cl/4309042
Use memoization to avoid repeated recomputation of nested
node sizes. Speeds up testdata/slow.input by several orders
of magnitude.
- added respective test case
- added timeout to test code
- deleted some unrelated unused code
Fixes#1628.
R=rsc, r
CC=golang-dev
https://golang.org/cl/4274075
These timeouts are breaking tests in very slow
systems every once in a while. I've noticed
problems when compiling the Ubuntu packages for
arm, specifically.
R=golang-dev, adg
CC=golang-dev
https://golang.org/cl/4291058
Also in the common case avoid unnecessary buffering in
the channel.
Removes 13 allocations per round trip. Now at 86, down from
144 a week ago.
R=rsc, bradfitzgo, r2, rsc1
CC=golang-dev
https://golang.org/cl/4277060
The scanner returns slices into the original source
for token values. If those slices are making it into
the AST and from there into other long-living data
structures (e.g. godoc search), references to the
original source are kept around involuntarily.
For the current godoc and source tree, this change reduces
memory consumption after indexing and before GC by ~92MB
or almost 30%, and by ~10MB after GC (or about 6%).
R=rsc
CC=golang-dev
https://golang.org/cl/4273072
In conjunction with the non-blocking system call CL, this
gives about an 8% performance improvement on a client/server
test running on my local machine.
R=rsc, iant2
CC=golang-dev
https://golang.org/cl/4272057
- just an oversight; we were reallocating a buffer.
- use unsafe to avoid allocating storage for a string twice.
R=rsc
CC=golang-dev
https://golang.org/cl/4290056
Permit system calls to be designated as non-blocking, meaning
that we simply call them without involving the scheduler.
This change by itself is mostly performance neutral. In
combination with a following change to the net package there
is a performance advantage.
R=rsc, dfc, r2, iant2, rsc1
CC=golang-dev
https://golang.org/cl/4278055
- use enc.err and dec.err instead of return values in deferred error catcher
- replace io.WriteString with buffer.WriteString
now at:
mallocs per encode of type Bench: 7
mallocs per decode of type Bench: 8
R=rsc
CC=golang-dev
https://golang.org/cl/4277057
This just returns a ClientConn suitable for writing
proxy requests. To be used in Transport.
R=rsc, petar-m
CC=golang-dev
https://golang.org/cl/4290052
Dependency on bufio crept in during last CL; this breaks the cycle.
Also add a missing '-' to the documentation.
R=rsc
CC=golang-dev
https://golang.org/cl/4274061
There is some disagreement about how to deal with hash values larger
than the curve order size. We choose to follow OpenSSL's lead here.
R=bradfitzgo, r
CC=golang-dev
https://golang.org/cl/4273059
If braces don't have position information for a composite
literal, don't assume alignment of key:value pairs under
the (wrong) assumption that there may be multiple lines.
R=rsc
CC=golang-dev
https://golang.org/cl/4297043
On my mac:
mallocs per rpc round trip: 144
rpc.BenchmarkEndToEnd 10000 228244 ns/op
Room for improvement.
R=rsc
CC=golang-dev
https://golang.org/cl/4274058
This reduces the number of writes by 2 (1 client, 1 server) on each round trip.
A simple test shows 24% higher throughput.
R=rsc
CC=golang-dev
https://golang.org/cl/4279057
The loop always makes an extra system call. It only makes a
difference if more than 100 goroutines started waiting for
something to happen on a network file descriptor since the
last time the pipe was drained, which is unlikely since we
will be woken up the first time a goroutine starts waiting.
If we don't drain the pipe this time, we'll be woken up again
right away and can drain again.
R=rsc
CC=golang-dev
https://golang.org/cl/4275042
Add a benchmark.
BenchmarkEndToEndPipe gives 14.3microseconds/op before,
13.1microseconds/op after, or about 76e3 round trips per second
through the kernel.
With a bytes buffer, and therefore no system calls for I/O, the
numbers go to 7.3microseconds/op, or about 137e3 round trips
per second.
R=rsc
CC=golang-dev
https://golang.org/cl/4279045
Transport.Do -> RoundTripper.RoundTrip
This makes way for a subsequent CL to export the
currently private RoundTripper implementation
as struct Transport.
R=rsc, r
CC=golang-dev
https://golang.org/cl/4286043