| 
									
										
										
										
											2022-08-23 13:55:43 +01:00
										 |  |  | # Object layout
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | ## Common header
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | Each Python object starts with two fields: | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | * ob_refcnt | 
					
						
							|  |  |  | * ob_type | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2024-01-17 04:41:15 +05:30
										 |  |  | which form the header common to all Python objects, for all versions, | 
					
						
							| 
									
										
										
										
											2022-08-23 13:55:43 +01:00
										 |  |  | and hold the reference count and class of the object, respectively. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | ## Pre-header
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | Since the introduction of the cycle GC, there has also been a pre-header. | 
					
						
							|  |  |  | Before 3.11, this pre-header was two words in size. | 
					
						
							|  |  |  | It should be considered opaque to all code except the cycle GC. | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2024-04-02 11:59:21 +01:00
										 |  |  | ### 3.13
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | In 3.13, the values array is embedded into the object, so there is no | 
					
						
							|  |  |  | need for a values pointer (it is just a fixed offset into the object). | 
					
						
							|  |  |  | So the pre-header is these two fields: | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | * weakreflist | 
					
						
							|  |  |  | * dict_pointer | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | If the object has no physical dictionary, then the ``dict_pointer`` | 
					
						
							|  |  |  | is set to `NULL`. | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2024-08-21 15:52:04 +01:00
										 |  |  | In 3.13 only objects with no additional data could have inline values. | 
					
						
							|  |  |  | That is, instances of classes with `tp_basicsize == sizeof(PyObject)`. | 
					
						
							|  |  |  | In 3.14, any object whose class has `tp_itemsize == 0` can have inline values. | 
					
						
							|  |  |  | In both versions, the inline values starts `tp_basicsize` bytes after the object. | 
					
						
							| 
									
										
										
										
											2024-04-02 11:59:21 +01:00
										 |  |  | 
 | 
					
						
							|  |  |  | <details> | 
					
						
							|  |  |  | <summary> 3.12 </summary> | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | ### 3.12
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | In 3.12, the pointer to the list of weak references is added to the | 
					
						
							|  |  |  | pre-header. In order to make space for it, the ``dict`` and ``values`` | 
					
						
							|  |  |  | pointers are combined into a single tagged pointer: | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | * weakreflist | 
					
						
							|  |  |  | * dict_or_values | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | If the object has no physical dictionary, then the ``dict_or_values`` | 
					
						
							|  |  |  | has its low bit set to one, and points to the values array. | 
					
						
							|  |  |  | If the object has a physical dictionary, then the ``dict_or_values`` | 
					
						
							|  |  |  | has its low bit set to zero, and points to the dictionary. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | The untagged form is chosen for the dictionary pointer, rather than | 
					
						
							|  |  |  | the values pointer, to enable the (legacy) C-API function | 
					
						
							|  |  |  | `_PyObject_GetDictPtr(PyObject *obj)` to work. | 
					
						
							|  |  |  | </details> | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | <details> | 
					
						
							|  |  |  | <summary> 3.11 </summary> | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | ### 3.11
 | 
					
						
							| 
									
										
										
										
											2022-08-23 13:55:43 +01:00
										 |  |  | 
 | 
					
						
							|  |  |  | In 3.11 the pre-header was extended to include pointers to the VM managed ``__dict__``. | 
					
						
							|  |  |  | The reason for moving the ``__dict__`` to the pre-header is that it allows | 
					
						
							|  |  |  | faster access, as it is at a fixed offset, and it also allows object's | 
					
						
							|  |  |  | dictionaries to be lazily created when the ``__dict__`` attribute is | 
					
						
							|  |  |  | specifically asked for. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | In the 3.11 the non-GC part of the pre-header consists of two pointers: | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | * dict | 
					
						
							|  |  |  | * values | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | The values pointer refers to the ``PyDictValues`` array which holds the | 
					
						
							|  |  |  | values of the objects's attributes. | 
					
						
							|  |  |  | Should the dictionary be needed, then ``values`` is set to ``NULL`` | 
					
						
							|  |  |  | and the ``dict`` field points to the dictionary. | 
					
						
							| 
									
										
										
										
											2024-04-02 11:59:21 +01:00
										 |  |  | </details> | 
					
						
							| 
									
										
										
										
											2022-08-23 13:55:43 +01:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2024-04-02 11:59:21 +01:00
										 |  |  | ## Layout of a "normal" Python object
 | 
					
						
							| 
									
										
										
										
											2022-08-23 13:55:43 +01:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2024-04-02 11:59:21 +01:00
										 |  |  | A "normal" Python object is one that doesn't inherit from a builtin | 
					
						
							|  |  |  | class, doesn't have slots. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | ### 3.13
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | In 3.13 the values are embedded into the object, as follows: | 
					
						
							| 
									
										
										
										
											2022-08-23 13:55:43 +01:00
										 |  |  | 
 | 
					
						
							|  |  |  | * weakreflist | 
					
						
							|  |  |  | * dict_or_values | 
					
						
							| 
									
										
										
										
											2024-04-02 11:59:21 +01:00
										 |  |  | * GC 1 | 
					
						
							|  |  |  | * GC 2 | 
					
						
							|  |  |  | * ob_refcnt | 
					
						
							|  |  |  | * ob_type | 
					
						
							|  |  |  | * Inlined values: | 
					
						
							|  |  |  |   * Flags | 
					
						
							|  |  |  |   * values 0 | 
					
						
							|  |  |  |   * values 1 | 
					
						
							|  |  |  |   * ... | 
					
						
							|  |  |  |   * Insertion order bytes | 
					
						
							| 
									
										
										
										
											2022-08-23 13:55:43 +01:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2024-04-02 11:59:21 +01:00
										 |  |  | This has all the advantages of the layout used in 3.12, plus: | 
					
						
							|  |  |  | * Access to values is even faster as there is one less load | 
					
						
							|  |  |  | * Fast access is mostly maintained when the `__dict__` is materialized | 
					
						
							| 
									
										
										
										
											2022-08-23 13:55:43 +01:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2024-04-02 11:59:21 +01:00
										 |  |  |  | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | For objects with opaque parts defined by a C extension, | 
					
						
							|  |  |  | the layout is much the same as for 3.12 | 
					
						
							| 
									
										
										
										
											2022-08-23 13:55:43 +01:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2024-04-02 11:59:21 +01:00
										 |  |  |  | 
					
						
							| 
									
										
										
										
											2022-08-23 13:55:43 +01:00
										 |  |  | 
 | 
					
						
							| 
									
										
										
										
											2024-04-02 11:59:21 +01:00
										 |  |  | 
 | 
					
						
							|  |  |  | <details> | 
					
						
							|  |  |  | <summary> 3.12 </summary> | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | ### 3.12:
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | In 3.12, the header and pre-header form the entire object for "normal" | 
					
						
							|  |  |  | Python objects: | 
					
						
							| 
									
										
										
										
											2022-08-23 13:55:43 +01:00
										 |  |  | 
 | 
					
						
							|  |  |  | * weakreflist | 
					
						
							|  |  |  | * dict_or_values | 
					
						
							|  |  |  | * GC 1 | 
					
						
							|  |  |  | * GC 2 | 
					
						
							|  |  |  | * ob_refcnt | 
					
						
							|  |  |  | * ob_type | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |  | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | There are several advantages to this layout: | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | * It allows lazy `__dict__`s, as described above. | 
					
						
							|  |  |  | * The regular layout allows us to create tailored traversal and deallocation | 
					
						
							|  |  |  |   functions based on layout, rather than inheritance. | 
					
						
							|  |  |  | * Multiple inheritance works properly, | 
					
						
							|  |  |  |   as the weakrefs and dict are always at the same offset. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | The full layout object, with an opaque part defined by a C extension, | 
					
						
							|  |  |  | and `__slots__` looks like this: | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |  | 
					
						
							| 
									
										
										
										
											2024-04-02 11:59:21 +01:00
										 |  |  | </details> | 
					
						
							|  |  |  | 
 | 
					
						
							| 
									
										
										
										
											2022-08-23 13:55:43 +01:00
										 |  |  | 
 |