ParseProfiles currently uses a regex to parse each line. This is not
very fast, and can lead to ParseProfiles being excessively slow on
certain pathological inputs.
This change substantially improves the performance by parsing manually
instead. On an input of about 3 GB of data containing about 36 million
lines, the time spent in ParseProfiles drops from 72 seconds to 11
seconds, with actual string parsing time dropping from 61 seconds to 2
seconds.
Since this change completely changes the parsing, it also adds some
tests for ParseProfiles to help ensure the new parsing is correct.
A benchmark for parseLine is also included. Here is a comparison of the old
regex implementation versus the new manual one:
name old time/op new time/op delta
ParseLine-12 2.43µs ± 2% 0.05µs ± 8% -97.98% (p=0.000 n=10+9)
name old speed new speed delta
ParseLine-12 42.5MB/s ± 2% 2103.2MB/s ± 7% +4853.14% (p=0.000 n=10+9)
Fixesgolang/go#32211
Change-Id: If8f91ecbda776c08243de4e423de4eea55f0082b
Reviewed-on: https://go-review.googlesource.com/c/tools/+/179377
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>