This borrows a trick from the bzip2 source and yields a decent speedup
when decompressing highly compressed sources. Rather than unshuffling
the BWT block when performing the inverse BWT (IBWT), a linked list is
threaded through the array, in place. This improves cache hit rates.
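
A minimal sketch of the idea, not the code in this change: the function
name inverseBWT, its signature, and the packing of a byte plus a 24-bit
"next" index into one uint32 are assumptions made for illustration. It
also assumes the block has fewer than 1<<24 bytes (true of bzip2's
largest block size) and that origPtr is the row of the original string
in the sorted rotation matrix.

```go
package main

// inverseBWT undoes a Burrows-Wheeler transform. last is the final
// column of the sorted rotation matrix and origPtr is the row holding
// the original string. (Hypothetical helper for illustration only.)
func inverseBWT(last []byte, origPtr int) []byte {
	n := len(last)
	if n == 0 {
		return nil
	}

	// tt[i] keeps the byte last[i] in its low 8 bits. The upper 24 bits
	// will hold the index of the next entry to visit, which is the
	// linked list threaded through the array.
	tt := make([]uint32, n)
	var c [256]int
	for i, b := range last {
		tt[i] = uint32(b)
		c[b]++
	}

	// Turn the byte counts into starting offsets within the sorted
	// first column.
	sum := 0
	for i := range c {
		sum, c[i] = sum+c[i], sum
	}

	// Thread the list in place: the k-th occurrence of byte b in the
	// last column corresponds to the k-th occurrence of b in the first
	// column, so store i as the "next" index at that first-column slot.
	for i, b := range last {
		tt[c[b]] |= uint32(i) << 8
		c[b]++
	}

	// Follow the list from origPtr, emitting one byte per hop instead
	// of building a separately unshuffled copy of the block.
	out := make([]byte, n)
	pos := tt[origPtr] >> 8
	for i := range out {
		entry := tt[pos]
		out[i] = byte(entry)
		pos = entry >> 8
	}
	return out
}
```

For example, the sorted rotations of "banana" have last column "nnbaaa"
with the original string in row 3, so inverseBWT([]byte("nnbaaa"), 3)
returns "banana". Each hop reads a single word that holds both the
output byte and the next index.
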
R=bradfitzgo, bradfitzwork, cw
CC=golang-dev
https://golang.org/cl/4247047