better mach binaries.
cgo working on darwin+linux amd64+386.
eliminated context switches - pi is 30x faster.
add libcgo to build.
on snow leopard:
- non-cgo binaries work; all tests pass.
- cgo binaries work on amd64 but not 386.
R=r
DELTA=2031 (1316 added, 626 deleted, 89 changed)
OCL=35264
CL=35304