mirror of https://github.com/golang/go synced 2024-10-04 16:31:22 -06:00
go/src/pkg/crypto
Russ Cox 1af960802a crypto/rc4: faster amd64, 386 implementations
-- amd64 --

On a MacBookPro10,2 (Core i5):

benchmark           old ns/op    new ns/op    delta
BenchmarkRC4_128          470          421  -10.43%
BenchmarkRC4_1K          3123         3275   +4.87%
BenchmarkRC4_8K         26351        25866   -1.84%

benchmark            old MB/s     new MB/s  speedup
BenchmarkRC4_128       272.22       303.40    1.11x
BenchmarkRC4_1K        327.80       312.58    0.95x
BenchmarkRC4_8K        307.24       313.00    1.02x

For comparison, on the same machine, openssl 0.9.8r reports
its rc4 speed as somewhat under 350 MB/s for both 1K and 8K.
The Core i5 performance can be boosted another 20%, but only
by making the Xeon performance significantly slower.
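
The delta, MB/s, and speedup columns above follow from the ns/op figures. A minimal sketch of that arithmetic, assuming the benchmark names give the buffer size in bytes (128, 1024, 8192) and that 1 MB means 10^6 bytes; the helper names are hypothetical and not part of the commit. Because the printed ns/op values are rounded, recomputed throughput agrees with the tables only approximately.

```go
package main

import "fmt"

// deltaPercent returns the relative change in ns/op, as in the "delta" column.
// Negative values mean the new code is faster.
func deltaPercent(oldNs, newNs float64) float64 {
	return (newNs - oldNs) / oldNs * 100
}

// throughputMBps converts bytes processed per operation and ns/op into MB/s.
// One byte per nanosecond is 1e9 bytes/s, i.e. 1e3 MB/s (assuming 1 MB = 1e6 bytes).
func throughputMBps(bytesPerOp int, nsPerOp float64) float64 {
	return float64(bytesPerOp) / nsPerOp * 1e3
}

func main() {
	// BenchmarkRC4_128 on the Core i5 (amd64): 470 ns/op before, 421 ns/op after.
	oldNs, newNs := 470.0, 421.0
	oldMB := throughputMBps(128, oldNs) // assumed buffer size: 128 bytes
	newMB := throughputMBps(128, newNs)

	fmt.Printf("delta   %+.2f%%\n", deltaPercent(oldNs, newNs))
	fmt.Printf("MB/s    %.2f -> %.2f\n", oldMB, newMB)
	fmt.Printf("speedup %.2fx\n", newMB/oldMB)
}
```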

On an Intel Xeon E5520:

benchmark           old ns/op    new ns/op    delta
BenchmarkRC4_128          774          417  -46.12%
BenchmarkRC4_1K          6121         3200  -47.72%
BenchmarkRC4_8K         48394        25151  -48.03%

benchmark            old MB/s     new MB/s  speedup
BenchmarkRC4_128       165.18       306.84    1.86x
BenchmarkRC4_1K        167.28       319.92    1.91x
BenchmarkRC4_8K        167.29       321.89    1.92x

For comparison, on the same machine, openssl 1.0.1
(which uses a different implementation than 0.9.8r)
reports its rc4 speed as 587 MB/s for 1K and 601 MB/s for 8K.
It is using SIMD instructions to do more in parallel.

So there's still some improvement to be had, but even so,
this is almost 2x faster than what it replaced.
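
For readers who want to take comparable measurements, the following is a rough sketch of a throughput benchmark over crypto/rc4. It is not the benchmark from this commit: the key, the helper name benchmarkRC4, and the use of testing.Benchmark from a standalone program are all assumptions made for illustration. Only rc4.NewCipher, XORKeyStream, and the standard testing API are relied on.

```go
package main

import (
	"crypto/rc4"
	"fmt"
	"testing"
)

// benchmarkRC4 times XORKeyStream over a buffer of the given size.
// Hypothetical helper; the commit's own benchmark code is not shown here.
func benchmarkRC4(size int) testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		key := []byte("hypothetical benchmark key") // illustrative key, 1-256 bytes
		c, err := rc4.NewCipher(key)
		if err != nil {
			b.Fatal(err)
		}
		buf := make([]byte, size)
		b.SetBytes(int64(size)) // lets the testing package report MB/s
		b.ResetTimer()
		for i := 0; i < b.N; i++ {
			c.XORKeyStream(buf, buf) // encrypt in place
		}
	})
}

func main() {
	for _, size := range []int{128, 1024, 8192} {
		r := benchmarkRC4(size)
		// Total bytes processed divided by elapsed time, in units of 1e6 bytes/s.
		mbps := float64(r.Bytes) * float64(r.N) / 1e6 / r.T.Seconds()
		fmt.Printf("RC4_%d\t%d ns/op\t%.2f MB/s\n", size, r.NsPerOp(), mbps)
	}
}
```

Numbers from a sketch like this depend on the machine and Go version, so they are not directly comparable to the tables in this message.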

-- 386 --

On a MacBookPro10,2 (Core i5):

benchmark           old ns/op    new ns/op    delta
BenchmarkRC4_128         3491          421  -87.94%
BenchmarkRC4_1K         28063         3205  -88.58%
BenchmarkRC4_8K        220392        25228  -88.55%

benchmark            old MB/s     new MB/s  speedup
BenchmarkRC4_128        36.66       303.81    8.29x
BenchmarkRC4_1K         36.49       319.42    8.75x
BenchmarkRC4_8K         36.73       320.90    8.74x

On an Intel Xeon E5520:

benchmark           old ns/op    new ns/op    delta
BenchmarkRC4_128         2268          524  -76.90%
BenchmarkRC4_1K         18161         4137  -77.22%
BenchmarkRC4_8K        142396        32350  -77.28%

benchmark            old MB/s     new MB/s  speedup
BenchmarkRC4_128        56.42       244.13    4.33x
BenchmarkRC4_1K         56.38       247.46    4.39x
BenchmarkRC4_8K         56.86       250.26    4.40x

R=agl
CC=golang-dev
https://golang.org/cl/7547050
2013-03-21 11:25:09 -04:00
aes        2012-09-27 01:54:10 +08:00  crypto/aes: speed up using AES-NI on amd64
cipher     2013-01-30 12:45:13 -08:00  crypto/cipher: avoid out of bounds error in CryptBlocks
des        2012-12-22 10:50:11 -05:00  crypto/des: add an example to demonstrate EDE2 operation.
dsa        2013-03-11 14:16:55 -07:00  all: remove now-unnecessary unreachable panics
ecdsa      2012-08-03 15:42:14 -04:00  crypto/elliptic: explicitly handle P+P, ∞+P and P+∞
elliptic   2012-08-03 15:42:14 -04:00  crypto/elliptic: explicitly handle P+P, ∞+P and P+∞
hmac       2012-10-11 15:28:02 -04:00  crypto/hmac: add Equal function.
md5        2013-02-19 10:02:01 -05:00  src: use internal tests if possible
rand       2012-12-28 19:11:37 -05:00  crypto/rsa: ensure that RSA keys use the full number of bits.
rc4        2013-03-21 11:25:09 -04:00  crypto/rc4: faster amd64, 386 implementations
rsa        2013-02-24 17:19:09 +01:00  crypto/rsa: fix infinite loop in GenerateMultiPrimeKey for large nprimes
sha1       2013-02-19 10:02:01 -05:00  src: use internal tests if possible
sha256     2012-11-01 16:21:18 -04:00  crypto: use better hash benchmarks
sha512     2012-11-01 16:21:18 -04:00  crypto: use better hash benchmarks
subtle     2012-01-30 23:43:46 -05:00  build: remove Make.pkg, Make.tool
tls        2013-03-20 23:53:38 -04:00  crypto/tls: use method values
x509       2013-03-11 14:16:55 -07:00  all: remove now-unnecessary unreachable panics
crypto.go  2012-02-03 15:08:53 -05:00  crypto/...: changes to address some of bug 2841.