mirror of
				https://github.com/python/cpython.git
				synced 2025-10-26 03:04:41 +00:00 
			
		
		
		
	
		
			
	
	
		
			69 lines
		
	
	
	
		
			2.5 KiB
		
	
	
	
		
			Text
		
	
	
	
	
	
		
		
			
		
	
	
			69 lines
		
	
	
	
		
			2.5 KiB
		
	
	
	
		
			Text
		
	
	
	
	
	
|   | stringbench is a set of performance tests comparing byte string | ||
|  | operations with unicode operations.  The two string implementations | ||
|  | are loosely based on each other and sometimes the algorithm for one is | ||
|  | faster than the other. | ||
|  | 
 | ||
|  | These test set was started at the Need For Speed sprint in Reykjavik | ||
|  | to identify which string methods could be sped up quickly and to | ||
|  | identify obvious places for improvement. | ||
|  | 
 | ||
|  | Here is an example of a benchmark | ||
|  | 
 | ||
|  | 
 | ||
|  | @bench('"Andrew".startswith("A")', 'startswith single character', 1000) | ||
|  | def startswith_single(STR): | ||
|  |     s1 = STR("Andrew") | ||
|  |     s2 = STR("A") | ||
|  |     s1_startswith = s1.startswith | ||
|  |     for x in _RANGE_1000: | ||
|  |         s1_startswith(s2) | ||
|  | 
 | ||
|  | The bench decorator takes three parameters.  The first is a short | ||
|  | description of how the code works.  In most cases this is Python code | ||
|  | snippet.  It is not the code which is actually run because the real | ||
|  | code is hand-optimized to focus on the method being tested. | ||
|  | 
 | ||
|  | The second parameter is a group title.  All benchmarks with the same | ||
|  | group title are listed together.  This lets you compare different | ||
|  | implementations of the same algorithm, such as "t in s" | ||
|  | vs. "s.find(t)". | ||
|  | 
 | ||
|  | The last is a count.  Each benchmark loops over the algorithm either | ||
|  | 100 or 1000 times, depending on the algorithm performance.  The output | ||
|  | time is the time per benchmark call so the reader needs a way to know | ||
|  | how to scale the performance. | ||
|  | 
 | ||
|  | These parameters become function attributes. | ||
|  | 
 | ||
|  | 
 | ||
|  | Here is an example of the output | ||
|  | 
 | ||
|  | 
 | ||
|  | ========== count newlines | ||
|  | 38.54   41.60   92.7    ...text.with.2000.newlines.count("\n") (*100) | ||
|  | ========== early match, single character | ||
|  | 1.14    1.18    96.8    ("A"*1000).find("A") (*1000) | ||
|  | 0.44    0.41    105.6   "A" in "A"*1000 (*1000) | ||
|  | 1.15    1.17    98.1    ("A"*1000).index("A") (*1000) | ||
|  | 
 | ||
|  | The first column is the run time in milliseconds for byte strings. | ||
|  | The second is the run time for unicode strings.  The third is a | ||
|  | percentage; byte time / unicode time.  It's the percentage by which | ||
|  | unicode is faster than byte strings. | ||
|  | 
 | ||
|  | The last column contains the code snippet and the repeat count for the | ||
|  | internal benchmark loop. | ||
|  | 
 | ||
|  | The times are computed with 'timeit.py' which repeats the test more | ||
|  | and more times until the total time takes over 0.2 seconds, returning | ||
|  | the best time for a single iteration. | ||
|  | 
 | ||
|  | The final line of the output is the cumulative time for byte and | ||
|  | unicode strings, and the overall performance of unicode relative to | ||
|  | bytes.  For example | ||
|  | 
 | ||
|  | 4079.83 5432.25 75.1    TOTAL | ||
|  | 
 | ||
|  | However, this has no meaning as it evenly weights every test. | ||
|  | 
 |