Commit graph

16 commits

Author SHA1 Message Date
Russ Cox
9cc9f10855 archive/zip: add test for Modified vs ModTime behavior
Lock in fix for #22738, submitted in CL 78031.

Fixes #22738.

Change-Id: I6896feb158569e3f12fa7055387cbd7caad29ef4
Reviewed-on: https://go-review.googlesource.com/80635
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2017-12-01 00:22:21 +00:00
Russ Cox
be67e269b4 archive/zip: replace Writer.Comment field with SetComment method
A method is more in keeping with the rest of the Writer API and
incidentally allows the comment error to be reported earlier.

Fixes #22737.

Change-Id: I1eee2103a0720c76d0c394ccd6541e6219996dc0
Reviewed-on: https://go-review.googlesource.com/79415
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2017-11-29 16:27:42 +00:00
Russ Cox
3a181dc7bc archive/zip: fix handling of replacement rune in UTF8 check
The replacement rune is a valid rune and can appear as itself in valid UTF8
(it encodes as three bytes). To check for invalid UTF8 it is necessary to
look for utf8.DecodeRune returning the replacement rune and size==1.

Change-Id: I169be8d1fe61605c921ac13cc2fde94f80f3463c
Reviewed-on: https://go-review.googlesource.com/78126
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
2017-11-15 21:30:30 +00:00
Joe Tsai
4fcc835971 archive/zip: add FileHeader.NonUTF8 field
The NonUTF8 field provides users with a way to explictly tell the
ZIP writer to avoid setting the UTF-8 flag.
This is necessary because many readers:
	1) (Still) do not support UTF-8
	2) And use the local system encoding instead

Thus, even though character encodings other than CP-437 and UTF-8
are not officially supported by the ZIP specification, pragmatically
the world has permitted use of them.

When a non-standard encoding is used, it is the user's responsibility
to ensure that the target system is expecting the encoding used
(e.g., producing a ZIP file you know is used on a Chinese version of Windows).

We adjust the detectUTF8 function to account for Shift-JIS and EUC-KR
not being identical to ASCII for two characters.

We don't need an API for users to explicitly specify that they are encoding
with UTF-8 since all single byte characters are compatible with all other
common encodings (Windows-1256, Windows-1252, Windows-1251, Windows-1250,
IEC-8859, EUC-KR, KOI8-R, Latin-1, Shift-JIS, GB-2312, GBK) except for
the non-printable characters and the backslash character (all of which
are invalid characters in a path name anyways).

Fixes #10741

Change-Id: I9004542d1d522c9137973f1b6e2b623fa54dfd66
Reviewed-on: https://go-review.googlesource.com/75592
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-11-06 21:35:59 +00:00
Joe Tsai
6e8894d5ff archive/zip: add FileHeader.Modified field
The ModifiedTime and ModifiedDate fields are not expressive enough
for many of the time extensions that have since been added to ZIP,
nor are they easy to access since they in a legacy MS-DOS format,
and must be set and retrieved via the SetModTime and ModTime methods.

Instead, we add new field Modified of time.Time type that contains
all of the previous information and more.

Support for extended timestamps have been attempted before, but the
change was reverted because it provided no ability for the user to
specify the timezone of the legacy MS-DOS fields.
Technically the old API did not either, but users were manually offsetting
the timestamp to achieve the same effect.

The Writer now writes the legacy timestamps according to the timezone
of the FileHeader.Modified field. When the Modified field is set via
the SetModTime method, it is in UTC, which preserves the old behavior.

The Reader attempts to determine the timezone if both the legacy
and extended timestamps are present since it can compute the delta
between the two values.

Since Modified is a superset of the information in ModifiedTime and ModifiedDate,
we mark ModifiedTime, ModifiedDate, ModTime, and SetModTime as deprecated.

Fixes #18359

Change-Id: I29c6bc0a62908095d02740df3e6902f50d3152f1
Reviewed-on: https://go-review.googlesource.com/74970
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-11-06 19:50:28 +00:00
Carl Mastrangelo
f265f5db5d archive/zip, crypto/tls: use rand.Read instead of casting ints to bytes
Makes tests run ~1ms faster.

Change-Id: Ida509952469540280996d2bd9266724829e53c91
Reviewed-on: https://go-review.googlesource.com/47359
Reviewed-by: Filippo Valsorda <hi@filippo.io>
Run-TryBot: Filippo Valsorda <hi@filippo.io>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-11-01 05:51:30 +00:00
Joe Tsai
78805c07f4 archive/zip: restrict UTF-8 detection for comment and name fields
CL 39570 added support for automatically setting flag bit 11 to
indicate that the filename and comment fields are encoded in UTF-8,
which is (conventionally) the encoding using for most Go strings.

However, the detection added is too lose for two reasons:
* We need to ensure both fields are at least possibly UTF-8.
That is, if any field is definitely not UTF-8, then we can't set the bit.
* The utf8.ValidRune returns true for utf8.RuneError, which iterating
over a Go string automatically returns for invalid UTF-8.
Thus, we manually check for that value.

Updates #22367
Updates #10741

Change-Id: Ie8aae388432e546e44c6bebd06a00434373ca99e
Reviewed-on: https://go-review.googlesource.com/72791
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-10-25 22:16:46 +00:00
Kenji Yano
fda8269cc6 archive/zip: support "end of central directory record comment"
This change added support "end of central directory record comemnt" to the Writer.

There is a new exported field Writer.Comment in this change.
If invalid size of comment was set, Close returns error without closing resources.

Fixes #21634

Change-Id: Ifb62bc6c7f81b9257ac83eb570ad9915de727f8c
Reviewed-on: https://go-review.googlesource.com/59310
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-08-31 02:48:46 +00:00
Yasuhiro Matsumoto
0a3f3e166d archive/zip: set utf-8 flag
See: https://pkware.cachefly.net/webdocs/APPNOTE/APPNOTE-6.3.0.TXT

Document says:
> If general purpose bit 11 is set, the filename and comment must support The
> Unicode Standard, Version 4.1.0 or greater using the character encoding form
> defined by the UTF-8 storage specification.

Since Go encode the filename to UTF-8, general purpose bit 11 should be set.

Change-Id: Ica4af02b4dc695e9a5c015ae360e70171efb6ee3
Reviewed-on: https://go-review.googlesource.com/39570
Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-05-26 17:08:18 +00:00
Bryan C. Mills
d0a045daaf archive/zip: parallelize benchmarks
Add subbenchmarks for BenchmarkZip64Test with different sizes to tease
apart construction costs vs. steady-state throughput.

Results remain comparable with the non-parallel version with -cpu=1:

benchmark                           old ns/op     new ns/op     delta
BenchmarkCompressedZipGarbage       26832835      27506953      +2.51%
BenchmarkCompressedZipGarbage-6     27172377      4321534       -84.10%
BenchmarkZip64Test                  196758732     197765510     +0.51%
BenchmarkZip64Test-6                193850605     192625458     -0.63%

benchmark                           old allocs     new allocs     delta
BenchmarkCompressedZipGarbage       44             44             +0.00%
BenchmarkCompressedZipGarbage-6     44             44             +0.00%

benchmark                           old bytes     new bytes     delta
BenchmarkCompressedZipGarbage       5592          5664          +1.29%
BenchmarkCompressedZipGarbage-6     5592          21946         +292.45%

updates #18177

Change-Id: Icfa359d9b1a8df5e085dacc07d2b9221b284764c
Reviewed-on: https://go-review.googlesource.com/36719
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-03-15 18:26:51 +00:00
Joe Tsai
5df59a4fc9 Revert: "archive/zip: handle mtime in NTFS/UNIX/ExtendedTS extra fields"
This change reverts the following CLs:
	CL/18274: handle mtime in NTFS/UNIX/ExtendedTS extra fields
	CL/30811: only use Extended Timestamp on non-zero MS-DOS timestamps

We are reverting support for extended timestamps since the support was not
not complete. CL/18274 added full support for reading extended timestamp fields
and minimal support for writing them. CL/18274 is incomplete because it made
no changes to the FileHeader struct, so timezone information was lost when
reading and/or writing.

While CL/18274 was a step in the right direction, we should provide full
support for high precision timestamps in both the reader and writer.
This will probably require that we add a new field of type time.Time.
The complete fix is too involved to add in the time remaining for Go 1.8
and will be completed in Go 1.9.

Updates #10242
Updates #17403
Updates #18359
Fixes #18378

Change-Id: Icf6d028047f69379f7979a29bfcb319a02f4783e
Reviewed-on: https://go-review.googlesource.com/34651
Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-12-20 01:39:35 +00:00
Yasuhiro Matsumoto
4c79ed5f44 archive/zip: handle mtime in NTFS/UNIX/ExtendedTS extra fields
Handle NTFS timestamp, UNIX timestamp, Extended extra timestamp.
Writer supports only Extended extra timestamp field, matching most
zip creators.

Fixes #10242.

Change-Id: Id665db274e63def98659231391fb77392267ac1e
Reviewed-on: https://go-review.googlesource.com/18274
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2016-10-06 19:05:52 +00:00
Josh Bleecher Snyder
d713e8e806 archive/zip: improve BenchmarkCompressedZipGarbage
Before this CL:

$ go test -bench=CompressedZipGarbage -count=5 -run=NONE archive/zip
BenchmarkCompressedZipGarbage-8        50  20677087 ns/op   42973 B/op      47 allocs/op
BenchmarkCompressedZipGarbage-8       100  20584764 ns/op   24294 B/op      47 allocs/op
BenchmarkCompressedZipGarbage-8        50  20859221 ns/op   42973 B/op      47 allocs/op
BenchmarkCompressedZipGarbage-8       100  20901176 ns/op   24294 B/op      47 allocs/op
BenchmarkCompressedZipGarbage-8        50  21282409 ns/op   42973 B/op      47 allocs/op

The B/op number is effectively meaningless. There
is a surprisingly large one-time cost that gets
divided by the number of iterations that your
machine can get through in a second.

This CL discards the first run, which helps.
It is not a panacea. Running with -benchtime=10s
will allow the sync.Pool to be emptied,
which brings the problem back.
However, since there are more iterations to divide
the cost through, it’s not quite as bad,
and running with a high benchtime is rare.

This CL changes the meaning of the B/op number,
which is unfortunate, since it won’t have the
same order of magnitude as previous Go versions.
But it wasn’t really comparable before anyway,
since it didn’t have any reliable meaning at all.

After this CL:

$ go test -bench=CompressedZipGarbage -count=5 -run=NONE archive/zip
BenchmarkCompressedZipGarbage-8   	     100	  20881890 ns/op	    5616 B/op	      47 allocs/op
BenchmarkCompressedZipGarbage-8   	      50	  20622757 ns/op	    5616 B/op	      47 allocs/op
BenchmarkCompressedZipGarbage-8   	      50	  20628193 ns/op	    5616 B/op	      47 allocs/op
BenchmarkCompressedZipGarbage-8   	     100	  20756612 ns/op	    5616 B/op	      47 allocs/op
BenchmarkCompressedZipGarbage-8   	     100	  20639774 ns/op	    5616 B/op	      47 allocs/op

Change-Id: Iedee04f39328974c7fa272a6113d423e7ffce50f
Reviewed-on: https://go-review.googlesource.com/22585
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-05-01 04:42:00 +00:00
Andrew Gerrand
1f35bb6466 archive/zip: remove WriterOptions and replace with SetOffset method
Change-Id: I0a8b972c33e80c750ff1d63717177a5a3294a112
Reviewed-on: https://go-review.googlesource.com/7445
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Geert-Johan Riemer <gjr19912@gmail.com>
Reviewed-by: Andrew Gerrand <adg@golang.org>
2015-03-12 21:32:09 +00:00
Geert-Johan Riemer
de573f8748 archive/zip: add NewWriterWithOptions
When appending zip data to existing data such as a binary file the
zip headers must use the correct offset. NewWriterWithOptions
allows creating a Writer that uses the provided offset in the zip
headers.

Fixes #8669

Change-Id: I6ec64f1e816cc57b6fc8bb9e8a0918e586fc56b0
Reviewed-on: https://go-review.googlesource.com/2978
Reviewed-by: Andrew Gerrand <adg@golang.org>
2015-03-12 00:13:29 +00:00
Russ Cox
c007ce824d build: move package sources from src/pkg to src
Preparation was in CL 134570043.
This CL contains only the effect of 'hg mv src/pkg/* src'.
For more about the move, see golang.org/s/go14nopkg.
2014-09-08 00:08:51 -04:00
Renamed from src/pkg/archive/zip/writer_test.go (Browse further)