mirror of https://github.com/python/cpython.git (synced 2025-10-31 21:51:50 +00:00)

Before 2.3.3, Python's cyclic gc didn't pay any attention to weakrefs.
Segfaults in Zope3 resulted.

weakrefs in Python are designed to, at worst, let *other* objects learn
that a given object has died, via a callback function.  The weakly
referenced object itself is not passed to the callback, and the presumption
is that the weakly referenced object is unreachable trash at the time the
callback is invoked.
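
A minimal sketch of that contract, assuming CPython's reference counting
(so the object dies as soon as its last strong reference is dropped).  The
names Resource, on_death and seen are made up for illustration:

```python
import weakref


class Resource:
    """Hypothetical class; any weakly referenceable object would do."""


seen = []


def on_death(wr):
    # The callback receives the weakref itself, never the referent.
    # By the time it runs the weakref is already cleared, so wr() is None.
    seen.append((wr is ref, wr()))


obj = Resource()
ref = weakref.ref(obj, on_death)

del obj  # last strong reference gone; CPython runs the callback right here
print(seen)
```

The callback can learn *that* the object died, but has no way to get the
object back through the weakref it was handed.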

That's usually true, but not always.  Suppose a weakly referenced object
becomes part of a clump of cyclic trash.  When enough cycles are broken by
cyclic gc that the object is reclaimed, the callback is invoked.  If it's
possible for the callback to get at objects in the cycle(s), then it may be
possible for those objects to access (via strong references in the cycle)
the weakly referenced object being torn down, or other objects in the cycle
that have already suffered a tp_clear() call.  There's no guarantee that an
object is in a sane state after tp_clear().  Bad things (including
segfaults) can happen right then, during the callback's execution, or can
happen at any later time if the callback manages to resurrect an insane
object.

Note that if it's possible for the callback to get at objects in the trash
cycles, it must also be the case that the callback itself is part of the
trash cycles.  Else the callback would have acted as an external root to
the current collection, and nothing reachable from it would be in cyclic
trash either.

More, if the callback itself is in cyclic trash, then the weakref to which
the callback is attached must also be trash, and for the same kind of
reason:  if the weakref acted as an external root, then the callback could
not have been cyclic trash.

So a problem here requires that a weakref, that weakref's callback, and the
weakly referenced object, all be in cyclic trash at the same time.  This
isn't easy to stumble into by accident while Python is running, and, indeed,
it took quite a while to dream up failing test cases.  Zope3 saw segfaults
during shutdown, during the second call of gc in Py_Finalize, after most
modules had been torn down.  That creates many trash cycles (esp. those
involving new-style classes), making the problem much more likely.  Once you
know what's required to provoke the problem, though, it's easy to create
tests that segfault before shutdown.

In 2.3.3, before breaking cycles, we first clear all the weakrefs with
callbacks in cyclic trash.  Since the weakrefs *are* trash, and there's no
defined-- or even predictable --order in which tp_clear() gets called on
cyclic trash, it's defensible to first clear weakrefs with callbacks.  It's
a feature of Python's weakrefs too that when a weakref goes away, the
callback (if any) associated with it is thrown away too, unexecuted.
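
The last point is easy to see without involving gc at all.  A small sketch,
again assuming CPython's reference counting; Resource and fired are
illustrative names only:

```python
import weakref


class Resource:
    """Hypothetical class used only for this demonstration."""


fired = []

obj = Resource()
ref = weakref.ref(obj, lambda wr: fired.append("callback"))

del ref  # the weakref dies first, taking its callback with it
# No weakrefs to obj remain once the only one has been deallocated.
remaining = weakref.getweakrefcount(obj)

del obj  # the referent dies, but nothing is left to call
print(remaining, fired)
```

Since the weakref is gone before the referent dies, the callback is never
executed.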

Just that much is almost enough to prevent problems, by throwing away
*almost* all the weakref callbacks that could get triggered by gc.  The
problem remaining is that clearing a weakref with a callback decrefs the
callback object, and the callback object may *itself* be weakly referenced,
via another weakref with another callback.  So the process of clearing
weakrefs can trigger callbacks attached to other weakrefs, and those
latter weakrefs may or may not be part of cyclic trash.
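
The chain reaction can be reproduced from pure Python, outside of gc, since
function objects are themselves weakly referenceable.  This is only a
sketch of the mechanism under CPython's reference counting; Target,
inner_cb and events are names invented for the example:

```python
import weakref


class Target:
    """Hypothetical referent for the first weakref."""


events = []


def inner_cb(wr):
    events.append("inner callback ran")


t = Target()
wr_t = weakref.ref(t, inner_cb)

# A second weakref, this one pointing at the callback function itself.
wr_cb = weakref.ref(inner_cb, lambda wr: events.append("callback object died"))

del inner_cb  # now wr_t holds the only strong reference to the function
del wr_t      # tearing down wr_t decrefs its callback, which then dies,
              # which in turn fires the callback attached to wr_cb
print(events)
```

Note that t is still alive throughout:  the only Python code that runs is
the callback *on* the callback object, triggered purely by tearing down a
weakref.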

So, to prevent any Python code from running while gc is invoking tp_clear()
on all the objects in cyclic trash, it's not quite enough just to invoke
tp_clear() on weakrefs with callbacks first.  Instead the weakref module
grew a new private function (_PyWeakref_ClearRef) that does only part of
tp_clear():  it removes the weakref from the weakly-referenced object's list
of weakrefs, but does not decref the callback object.  So calling
_PyWeakref_ClearRef(wr) ensures that wr's callback object will never
trigger, and (unlike weakref's tp_clear()) also prevents any callback
associated *with* wr's callback object from triggering.

Then we can call tp_clear on all the cyclic objects and never trigger
Python code.

After we do that, the callback objects still need to be decref'ed.  Callbacks
(if any) *on* the callback objects that were also part of cyclic trash won't
get invoked, because we cleared all trash weakrefs with callbacks at the
start.  Callbacks on the callback objects that were not part of cyclic trash
acted as external roots to everything reachable from them, so nothing
reachable from them was part of cyclic trash, so gc didn't do any damage to
objects reachable from them, and it's safe to call them at the end of gc.
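
The visible effect of the trash-versus-external distinction can be sketched
from Python.  This shows the behavior expected of CPython's collector, not
its implementation, and Node, build_cycle, trash_cb and calls are
hypothetical names:

```python
import gc
import weakref


class Node:
    """Hypothetical class; instances get linked into a reference cycle."""


calls = []


def build_cycle():
    a, b = Node(), Node()
    a.partner, b.partner = b, a  # the cycle

    def trash_cb(wr, reachable=a):
        # This callback can reach the cycle via its default argument,
        # so it is itself part of the cycle.
        calls.append("trash")

    # Weakref, callback and referent all live inside the cycle.
    a.wr = weakref.ref(b, trash_cb)

    # This weakref is returned to the caller, so it stays reachable, and
    # its callback cannot reach anything in the cycle.
    return weakref.ref(a, lambda wr: calls.append("external"))


external = build_cycle()
gc.collect()
print(calls, external())
```

Only the callback on the weakref that survived outside the cycle is
expected to run.  The one whose weakref was itself cyclic trash is thrown
away unexecuted, so no Python code gets to look at half-cleared objects.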

An alternative would have been to treat objects with callbacks like objects
with __del__ methods, refusing to collect them, appending them to gc.garbage
instead.  That would have been much easier.  Jim Fulton gave a strong
argument against that (on Python-Dev):

    There's a big difference between __del__ and weakref callbacks.
    The __del__ method is "internal" to a design.  When you design a
    class with a del method, you know you have to avoid including the
    class in cycles.

    Now, suppose you have a design that has no __del__ methods but
    that does use cyclic data structures.  You reason about the design,
    run tests, and convince yourself you don't have a leak.

    Now, suppose some external code creates a weakref to one of your
    objects.  All of a sudden, you start leaking.  You can look at your
    code all you want and you won't find a reason for the leak.

IOW, a class designer can out-think __del__ problems, but has no control
over who creates weakrefs to his classes or class instances.  The class
user has little chance either of predicting when the weakrefs he creates
may end up in cycles.
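
With the approach actually taken, the scenario Jim describes does not leak.
A short sketch, assuming a CPython build with cyclic gc available; Widget
and observer are illustrative names:

```python
import gc
import weakref


class Widget:
    """Hypothetical class whose design includes a reference cycle."""

    def __init__(self):
        self.me = self  # a one-object cycle


w = Widget()

# "External code" takes a weakref, with a callback, to one of our objects.
observer = weakref.ref(w, lambda wr: None)

del w
gc.collect()

# The cycle was still collected; nothing was parked in gc.garbage.
leaked = [x for x in gc.garbage if isinstance(x, Widget)]
print(observer(), leaked)
```

The weakref created by outside code did not change whether the cycle could
be reclaimed.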

Callbacks on weakref callbacks are executed in an arbitrary order, and
that's not good (a primary reason not to collect cycles with objects with
__del__ methods is to avoid running finalizers in an arbitrary order).
However, a weakref callback on a weakref callback has got to be rare.
It's possible to do such a thing, so gc has to be robust against it, but
I doubt anyone has done it outside the test case I wrote for it.