Stowage/go - Remotebranch.eu

34331 commits 64 branches 468 tags 812 MiB

Author	SHA1	Message	Date
Joe Tsai	29ea82d072	encoding/csv: add ParseError.RecordLine CL 72150 fixes #22352 by reverting the problematic parts of that CL where the line number and column number were inconsistent with each other. This CL adds back functionality to address the issue that CL 72150 was trying to solve in the first place. That is, it reports the starting line of the record, so that users have a frame of reference to start with when debugging what went wrong. In the event of gnarly CSV files with multiline quoted strings, a parse failure likely occurs somewhere between the start of the record and the point where the parser finally detected an error. Since ParserError.{Line,Column} reports where the error occurs, we add a RecordLine field to report where the record starts. Also take this time to cleanup and modernize TestRead. Fixes #19019 Fixes #22352 Change-Id: I16cebf0b81922c35f75804c7073e9cddbfd11a04 Reviewed-on: https://go-review.googlesource.com/72310 Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-10-21 01:32:28 +00:00
Joe Tsai	89ccfe4962	encoding/csv: simplify and optimize Reader The Reader implementation is slow because it operates on a rune-by-rune basis via bufio.Reader.ReadRune. We speed this up by operating on entire lines that we read from bufio.Reader.ReadSlice. In order to ensure that we read the full line, we augment ReadSlice in our Reader.readLine method to automatically expand the slice if bufio.ErrBufferFull is every hit. This change happens to fix #19410 because it no longer relies on rune-by-rune parsing and only searches for the relevant delimiter rune. In order to keep column accounting simple and consistent, this change reverts parts of CL 52830. This CL is an alternative to CL 36270 and builds on some of the ideas from that change by Diogo Pinela. name old time/op new time/op delta Read-8 3.12µs ± 1% 2.54µs ± 2% -18.76% (p=0.000 n=10+9) ReadWithFieldsPerRecord-8 3.12µs ± 1% 2.53µs ± 1% -18.91% (p=0.000 n=9+9) ReadWithoutFieldsPerRecord-8 3.13µs ± 0% 2.57µs ± 3% -18.07% (p=0.000 n=10+10) ReadLargeFields-8 52.3µs ± 1% 5.3µs ± 2% -89.93% (p=0.000 n=10+9) ReadReuseRecord-8 2.05µs ± 1% 1.40µs ± 1% -31.48% (p=0.000 n=10+9) ReadReuseRecordWithFieldsPerRecord-8 2.05µs ± 1% 1.41µs ± 0% -31.03% (p=0.000 n=10+9) ReadReuseRecordWithoutFieldsPerRecord-8 2.06µs ± 1% 1.40µs ± 1% -31.70% (p=0.000 n=9+10) ReadReuseRecordLargeFields-8 50.9µs ± 0% 4.1µs ± 3% -92.01% (p=0.000 n=10+10) name old alloc/op new alloc/op Read-8 664B ± 0% 664B ± 0% ReadWithFieldsPerRecord-8 664B ± 0% 664B ± 0% ReadWithoutFieldsPerRecord-8 664B ± 0% 664B ± 0% ReadLargeFields-8 3.94kB ± 0% 3.94kB ± 0% ReadReuseRecord-8 24.0B ± 0% 24.0B ± 0% ReadReuseRecordWithFieldsPerRecord-8 24.0B ± 0% 24.0B ± 0% ReadReuseRecordWithoutFieldsPerRecord-8 24.0B ± 0% 24.0B ± 0% ReadReuseRecordLargeFields-8 2.98kB ± 0% 2.98kB ± 0% name old allocs/op new allocs/op Read-8 18.0 ± 0% 18.0 ± 0% ReadWithFieldsPerRecord-8 18.0 ± 0% 18.0 ± 0% ReadWithoutFieldsPerRecord-8 18.0 ± 0% 18.0 ± 0% ReadLargeFields-8 24.0 ± 0% 24.0 ± 0% ReadReuseRecord-8 8.00 ± 0% 8.00 ± 0% ReadReuseRecordWithFieldsPerRecord-8 8.00 ± 0% 8.00 ± 0% ReadReuseRecordWithoutFieldsPerRecord-8 8.00 ± 0% 8.00 ± 0% ReadReuseRecordLargeFields-8 12.0 ± 0% 12.0 ± 0% Updates #22352 Updates #19019 Fixes #16791 Fixes #19410 Change-Id: I31c27cfcc56880e6abac262f36c947179b550bbf Reviewed-on: https://go-review.googlesource.com/72150 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-10-20 23:20:46 +00:00
Justin Nuß	9fbc06e6aa	encoding/csv: preserve \r\n in quoted fields The parser mistakenly assumed it could always fold \r\n into \n, which is not true since a \r\n inside a quoted fields has no special meaning and should be kept as is. Fix this by not folding \r\n to \n inside quotes fields. Fixes #21201 Change-Id: Ifebc302e49cf63e0a027ee90f088dbc050a2b7a6 Reviewed-on: https://go-review.googlesource.com/52810 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-08-14 18:42:20 +00:00
Justin Nuß	5d14ac74f6	encoding/csv: report line start line in errors Errors returned by Reader contain the line where the Reader originally encountered the error. This can be suboptimal since that line does not always correspond with the line the current record/field started at. This can easily happen with LazyQuotes as seen in #19019, but also happens for example when a quoted fields has no closing quote and the parser hits EOF before it finds another quote. When this happens finding the erroneous field can be somewhat complicated and time consuming, and in most cases it would be better to report the line where the record started. This change updates Reader to keep track of the line on which a record begins and uses it for errors instead of the current line, making it easier to find errors. Although a user-visible change, this should have no impact on existing code, since most users don't explicitly work with the line in the error and probably already expect the new behaviour. Updates #19019 Change-Id: Ic9bc70fad2651c69435d614d537e7a9266819b05 Reviewed-on: https://go-review.googlesource.com/52830 Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-08-14 04:45:38 +00:00
Justin Nuß	2181653be6	encoding/csv: add option to reuse slices returned by Read In many cases the records returned by Reader.Read will only be used between calls to Read and become garbage once a new record is read. In this case, instead of allocating a new slice on each call to Read, we can reuse the last allocated slice for successive calls to avoid unnecessary allocations. This change adds a new field ReuseRecord to the Reader struct to enable this reuse. ReuseRecord is false by default to avoid breaking existing code which dependss on the current behaviour. I also added 4 new benchmarks, corresponding to the existing Read benchmarks, which set ReuseRecord to true. Benchstat on my local machine (old is ReuseRecord = false, new is ReuseRecord = true) name old time/op new time/op delta Read-8 2.75µs ± 2% 1.88µs ± 1% -31.52% (p=0.000 n=14+15) ReadWithFieldsPerRecord-8 2.75µs ± 0% 1.89µs ± 1% -31.43% (p=0.000 n=13+13) ReadWithoutFieldsPerRecord-8 2.77µs ± 1% 1.88µs ± 1% -32.06% (p=0.000 n=15+15) ReadLargeFields-8 55.4µs ± 1% 54.2µs ± 0% -2.07% (p=0.000 n=15+14) name old alloc/op new alloc/op delta Read-8 664B ± 0% 24B ± 0% -96.39% (p=0.000 n=15+15) ReadWithFieldsPerRecord-8 664B ± 0% 24B ± 0% -96.39% (p=0.000 n=15+15) ReadWithoutFieldsPerRecord-8 664B ± 0% 24B ± 0% -96.39% (p=0.000 n=15+15) ReadLargeFields-8 3.94kB ± 0% 2.98kB ± 0% -24.39% (p=0.000 n=15+15) name old allocs/op new allocs/op delta Read-8 18.0 ± 0% 8.0 ± 0% -55.56% (p=0.000 n=15+15) ReadWithFieldsPerRecord-8 18.0 ± 0% 8.0 ± 0% -55.56% (p=0.000 n=15+15) ReadWithoutFieldsPerRecord-8 18.0 ± 0% 8.0 ± 0% -55.56% (p=0.000 n=15+15) ReadLargeFields-8 24.0 ± 0% 12.0 ± 0% -50.00% (p=0.000 n=15+15) Fixes #19721 Change-Id: I79b14128bb9bb3465f53f40f93b1b528a9da6f58 Reviewed-on: https://go-review.googlesource.com/41730 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-04-26 15:55:56 +00:00
Brad Fitzpatrick	efaa36017e	encoding/csv: update and add CSV reading benchmarks Benchmarks broken off from https://golang.org/cl/24723 and modified to allocate less in the places we're not trying to measure. Updates #16791 Change-Id: I508e4cfeac60322d56f1d71ff1912f6a6f183a63 Reviewed-on: https://go-review.googlesource.com/30357 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-10-05 04:29:07 +00:00
Damien Neil	ab89378cb7	encoding/csv: skip blank lines when FieldsPerRecord >= 0 Fixes #11050. Change-Id: Ie5d16960a1f829af947d82a63fe414924cd02ff6 Reviewed-on: https://go-review.googlesource.com/10666 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>	2015-06-12 18:21:12 +00:00
Justin Nuß	2db58f8f2d	encoding/csv: Preallocate records slice Currently parseRecord will always start with a nil slice and then resize the slice on append. For input with a fixed number of fields per record we can preallocate the slice to avoid having to resize the slice. This change implements this optimization by using FieldsPerRecord as capacity if it's > 0 and also adds a benchmark to better show the differences. benchmark old ns/op new ns/op delta BenchmarkRead 19741 17909 -9.28% benchmark old allocs new allocs delta BenchmarkRead 59 41 -30.51% benchmark old bytes new bytes delta BenchmarkRead 6276 5844 -6.88% Change-Id: I7c2abc9c80a23571369bcfcc99a8ffc474eae7ab Reviewed-on: https://go-review.googlesource.com/8880 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2015-04-26 16:28:51 +00:00
Russ Cox	c007ce824d	build: move package sources from src/pkg to src Preparation was in CL 134570043. This CL contains only the effect of 'hg mv src/pkg/* src'. For more about the move, see golang.org/s/go14nopkg.	2014-09-08 00:08:51 -04:00

Renamed from src/pkg/encoding/csv/reader_test.go (Browse further)

9 commits