A temporary 512 bytes buffer is allocated for every call to
readHeader. This buffer isn't returned to the caller and it could
be reused to lower the number of memory allocations.
This CL improves it by using a pool and zeroing out the buffer before
putting it back into the pool.
benchmark old ns/op new ns/op delta
BenchmarkListFiles100k 545249903 538832687 -1.18%
benchmark old allocs new allocs delta
BenchmarkListFiles100k 2105167 2005692 -4.73%
benchmark old bytes new bytes delta
BenchmarkListFiles100k 105903472 54831527 -48.22%
This improvement is very important if your code has to deal with a lot
of tarballs which contain a lot of files.
LGTM=dsymonds
R=golang-codereviews, dave, dsymonds, bradfitz
CC=golang-codereviews
https://golang.org/cl/108240044
A temporary 512 bytes buffer is allocated for every call to
writeHeader. This buffer could be reused the lower the number
of memory allocations.
benchmark old ns/op new ns/op delta
BenchmarkWriteFiles100k 634622051 583810847 -8.01%
benchmark old allocs new allocs delta
BenchmarkWriteFiles100k 2701920 2602621 -3.68%
benchmark old bytes new bytes delta
BenchmarkWriteFiles100k 115383884 64349922 -44.23%
This change is very important if your code has to write a lot of
tarballs with a lot of files.
LGTM=dsymonds
R=golang-codereviews, dave, dsymonds
CC=golang-codereviews
https://golang.org/cl/107440043
Calling tar.Reader.Read() used to work fine, but without this patch it panics.
Simply return EOF to indicate the tar.Reader.Next() needs to be called.
LGTM=iant, bradfitz
R=golang-codereviews, bradfitz, iant, mikioh.mikioh, dominik.honnef
CC=golang-codereviews
https://golang.org/cl/94530043
Do not use ustar format if we need the GNU one.
Change \000 to \x00 for consistency
Check for "ustar\x00" instead of "ustar\x00\x00" for conistency with tar
and compatiblity with archive generated with older code (which was ustar\x00\x20\x00)
Add test for long name + big file.
LGTM=iant
R=golang-codereviews, iant
CC=golang-codereviews
https://golang.org/cl/99050043
Supports all the current GNU tar sparse formats, including the
old GNU format and the GNU PAX format versions 0.0, 0.1, and 1.0.
Fixes#3864.
LGTM=rsc
R=golang-codereviews, dave, gobot, dsymonds, rsc
CC=golang-codereviews
https://golang.org/cl/64740043
This adds support for archives with the SCHILY.xattr field in the
pax header. This is what gnu tar and star generate.
Fixes#7154.
LGTM=dsymonds
R=golang-codereviews, gobot, dsymonds
CC=golang-codereviews
https://golang.org/cl/54570043
Prevents a ton of garbage. (Noticed this when writing large
Camlistore zip archives to Amazon Glacier)
Note that the Closer part of the io.WriteCloser is never given
to users. It's an internal detail of the package.
benchmark old ns/op new ns/op delta
BenchmarkCompressedZipGarbage 42884123 40732373 -5.02%
benchmark old allocs new allocs delta
BenchmarkCompressedZipGarbage 204 149 -26.96%
benchmark old bytes new bytes delta
BenchmarkCompressedZipGarbage 4397576 66744 -98.48%
LGTM=adg, rsc
R=adg, rsc
CC=golang-codereviews
https://golang.org/cl/54300053
Use the smaller read-only bytes.NewReader/strings.NewReader instead
of a bytes.Buffer when possible.
LGTM=r
R=golang-codereviews, r
CC=golang-codereviews
https://golang.org/cl/54660045
ZIP64 Extra records are variably sized, but we weren't capping
our reading of the extra fields at its previously-declared
size.
No test because I don't know how to easily create such files
and don't feel like manually construction one. But all
existing tests pass, and this is "obviously correct" (queue
laughter).
Fixes#7069
R=golang-codereviews, iant
CC=golang-codereviews
https://golang.org/cl/48150043
For some long filenames the USTAR-split code does not work
correctly. It is wrongly assumed that the path would not be too long,
but it is.
The user visible result was that a filename was split, but it still
caused an error.
The cause was a wrongly calculated nlen. In addition I noticed that
at this place it is also seems necessary to check if the prefix will
fit in the 155 chars available for the prefix.
R=dsymonds, rsc
CC=golang-dev
https://golang.org/cl/13300046
This is potentially an API-breaking change, but it is an important bug fix.
The CL https://golang.org/cl/7305072/ added stuff to make
the tar file look more like a file system internally, including providing an
implementation of os.FileInfo for the file headers within the archive.
But the code is incorrect because FileInfo.Name is supposed to return
the base name only; this implementation returns the full path. A round
trip test added in the same shows this in action, as the slashes are
preserved as we create a header using the local implementation of
FileInfo.
The CL here changes the behavior of the tar (and zip) FileInfo to honor
the Go spec for that interface. It also clarifies that the FileInfoHeader
function, which takes a FileInfo as an argument, will therefore create
a header with only the base name of the file recorded, and that
subsequent adjustment may be necessary.
There may be code out there that depends on the broken behavior.
We can call out the risk in the release notes.
Fixes#6180.
R=golang-dev, dsymonds, adg, bradfitz
CC=golang-dev
https://golang.org/cl/13118043
The tar/archive code from golang has a problem with linknames with length >
100. A pax header is added but the original header still written with a too
long field length.
As it is clear that pax support is incomplete I have added missing
implementation parts.
This commit contains code from the golang project in the folder tar/archiv.
The following pax header records are now automatically written:
- gname)
- linkpath
- path
- uname
The following fields can be written with PAX, but the default is to use the
star binary extension.
- gid (value > 2097151)
- size (value > 8589934591)
- uid (value > 2097151)
The string fields are written when the value is longer as the field or if the
string contains a char that is not encodable as 7-bit ASCII value.
The change was tested against a current ubuntu-cloud image tarball comparing
the compressed result.
+ added some automated tests for the new functionality.
Fixes#6056.
R=dsymonds
CC=golang-dev
https://golang.org/cl/12561043
Took 76 seconds or so before. By avoiding flate and crc32 on
4GB of data, it's now only 12 seconds. Still a slow test, but
not painful to run anymore when you forget -short.
R=golang-dev, adg
CC=golang-dev
https://golang.org/cl/12950043
Update #6138
TestOver65kFiles spends all its time garbage collecting.
Removing the 1.4 MB of allocations per each of the 65k
files brings this from 34 seconds to 0.23 seconds.
R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/12894043
This change replaces the hard-coded switch on compression method
in zipfile reader and writer with a map into which users can
register compressors and decompressors in their init()s.
R=golang-dev, bradfitz, rsc
CC=golang-dev
https://golang.org/cl/12421043
trivial: it is not a serious problem to leak a fd in a short lived process, but it was obscuring my investigation of issue 5593.
R=golang-dev, iant, bradfitz
CC=golang-dev
https://golang.org/cl/10391043
The spec doesn't explicitly say that trailing data is okay, but a lot
of people do this and most unzippers will handle it just fine. In any
case, this makes the package more useful, and led me to make the
directory parsing code marginally more robust.
Fixes#5228.
R=golang-dev, dsymonds
CC=golang-dev
https://golang.org/cl/8504044
Support reading pax archives and applying extended attributes
like long names, subsecond mtime resolutions, etc. Default to
writing pax archives for long file and link names.
Support reading gnu archives using the ././@LongLink extended
header for file name and link target.
Fixes#3300
R=dsymonds, dave, rogpeppe, remyoudompheng, chressie, rsc
CC=golang-dev
https://golang.org/cl/6700047
This fixes some example code in the tar package documentation, which
first refers to tar.NewWriter and then to Header, which is inconsistent
because NewWriter and Header are both in the tar namespace.
R=golang-dev, dsymonds
CC=golang-dev
https://golang.org/cl/6595050