mirror of
				https://github.com/python/cpython.git
				synced 2025-11-03 23:21:29 +00:00 
			
		
		
		
	
		
			
	
	
		
			202 lines
		
	
	
	
		
			8.4 KiB
		
	
	
	
		
			TeX
		
	
	
	
	
	
		
		
			
		
	
	
\chapter{Memory Management \label{memory}}
\sectionauthor{Vladimir Marangozov}{Vladimir.Marangozov@inrialpes.fr}


\section{Overview \label{memoryOverview}}

Memory management in Python involves a private heap containing all
Python objects and data structures. The management of this private
heap is handled internally by the \emph{Python memory manager}.  The
Python memory manager has different components which deal with various
dynamic storage management aspects, like sharing, segmentation,
preallocation or caching.

At the lowest level, a raw memory allocator ensures that there is
enough room in the private heap for storing all Python-related data
by interacting with the memory manager of the operating system. On top
of the raw memory allocator, several object-specific allocators
operate on the same heap and implement distinct memory management
policies adapted to the peculiarities of every object type. For
example, integer objects are managed differently within the heap than
strings, tuples or dictionaries because integers imply different
storage requirements and speed/space tradeoffs. The Python memory
manager thus delegates some of the work to the object-specific
allocators, but ensures that the latter operate within the bounds of
the private heap.

It is important to understand that the management of the Python heap
is performed by the interpreter itself and that the user has no
control over it, even if she regularly manipulates object pointers to
memory blocks inside that heap.  The allocation of heap space for
Python objects and other internal buffers is performed on demand by
the Python memory manager through the Python/C API functions listed in
this document.

To avoid memory corruption, extension writers should never try to
operate on Python objects with the functions exported by the C
library: \cfunction{malloc()}\ttindex{malloc()},
\cfunction{calloc()}\ttindex{calloc()},
\cfunction{realloc()}\ttindex{realloc()} and
\cfunction{free()}\ttindex{free()}.  This will result in
mixed calls between the C allocator and the Python memory manager
with fatal consequences, because they implement different algorithms
and operate on different heaps.  However, one may safely allocate and
release memory blocks with the C library allocator for individual
purposes, as shown in the following example:

\begin{verbatim}
    PyObject *res;
    char *buf = (char *) malloc(BUFSIZ); /* for I/O */

    if (buf == NULL)
        return PyErr_NoMemory();
    /* ...Do some I/O operation involving buf... */
    res = PyString_FromString(buf);
    free(buf); /* malloc'ed */
    return res;
\end{verbatim}

In this example, the memory request for the I/O buffer is handled by
the C library allocator. The Python memory manager is involved only
in the allocation of the string object returned as a result.

In most situations, however, it is recommended to allocate memory from
the Python heap specifically because the latter is under control of
the Python memory manager. For example, this is required when the
interpreter is extended with new object types written in C. Another
reason for using the Python heap is the desire to \emph{inform} the
Python memory manager about the memory needs of the extension module.
Even when the requested memory is used exclusively for internal,
highly-specific purposes, delegating all memory requests to the Python
memory manager causes the interpreter to have a more accurate image of
its memory footprint as a whole. Consequently, under certain
circumstances, the Python memory manager may trigger
appropriate actions, like garbage collection, memory compaction or
other preventive procedures. Note that by using the C library
allocator as shown in the previous example, the allocated memory for
the I/O buffer completely escapes the Python memory manager.


\section{Memory Interface \label{memoryInterface}}

The following function sets, modeled after the ANSI C standard, are
available for allocating and releasing memory from the Python heap:


\begin{cfuncdesc}{void*}{PyMem_Malloc}{size_t n}
  Allocates \var{n} bytes and returns a pointer of type \ctype{void*}
  to the allocated memory, or \NULL{} if the request fails.
  Requesting zero bytes returns a non-\NULL{} pointer.
  The memory will not have been initialized in any way.
\end{cfuncdesc}

\begin{cfuncdesc}{void*}{PyMem_Realloc}{void *p, size_t n}
  Resizes the memory block pointed to by \var{p} to \var{n} bytes.
  The contents will be unchanged up to the minimum of the old and the
  new sizes. If \var{p} is \NULL, the call is equivalent to
  \cfunction{PyMem_Malloc(\var{n})}; if \var{n} is equal to zero, the
  memory block is resized but is not freed, and the returned pointer
  is non-\NULL.  Unless \var{p} is \NULL, it must have been
  returned by a previous call to \cfunction{PyMem_Malloc()} or
  \cfunction{PyMem_Realloc()}.
\end{cfuncdesc}

\begin{cfuncdesc}{void}{PyMem_Free}{void *p}
  Frees the memory block pointed to by \var{p}, which must have been
  returned by a previous call to \cfunction{PyMem_Malloc()} or
  \cfunction{PyMem_Realloc()}.  Otherwise, or if
  \cfunction{PyMem_Free(p)} has been called before, undefined
  behaviour occurs. If \var{p} is \NULL, no operation is performed.
\end{cfuncdesc}

The following type-oriented macros are provided for convenience.  Note
that \var{TYPE} refers to any C type.

\begin{cfuncdesc}{\var{TYPE}*}{PyMem_New}{TYPE, size_t n}
  Same as \cfunction{PyMem_Malloc()}, but allocates \code{(\var{n} *
  sizeof(\var{TYPE}))} bytes of memory.  Returns a pointer cast to
  \ctype{\var{TYPE}*}.  The memory will not have been initialized in
  any way.
\end{cfuncdesc}

\begin{cfuncdesc}{\var{TYPE}*}{PyMem_Resize}{void *p, TYPE, size_t n}
  Same as \cfunction{PyMem_Realloc()}, but the memory block is resized
  to \code{(\var{n} * sizeof(\var{TYPE}))} bytes.  Returns a pointer
  cast to \ctype{\var{TYPE}*}.
\end{cfuncdesc}

\begin{cfuncdesc}{void}{PyMem_Del}{void *p}
  Same as \cfunction{PyMem_Free()}.
\end{cfuncdesc}

In addition, the following macro sets are provided for calling the
Python memory allocator directly, without involving the C API functions
listed above. However, note that their use does not preserve binary
compatibility across Python versions and is therefore deprecated in
extension modules.

\cfunction{PyMem_MALLOC()}, \cfunction{PyMem_REALLOC()}, \cfunction{PyMem_FREE()}.

\cfunction{PyMem_NEW()}, \cfunction{PyMem_RESIZE()}, \cfunction{PyMem_DEL()}.


\section{Examples \label{memoryExamples}}

Here is the example from section \ref{memoryOverview}, rewritten so
that the I/O buffer is allocated from the Python heap by using the
first function set:

\begin{verbatim}
    PyObject *res;
    char *buf = (char *) PyMem_Malloc(BUFSIZ); /* for I/O */

    if (buf == NULL)
        return PyErr_NoMemory();
    /* ...Do some I/O operation involving buf... */
    res = PyString_FromString(buf);
    PyMem_Free(buf); /* allocated with PyMem_Malloc */
    return res;
\end{verbatim}

The same code using the type-oriented function set:

\begin{verbatim}
    PyObject *res;
    char *buf = PyMem_New(char, BUFSIZ); /* for I/O */

    if (buf == NULL)
        return PyErr_NoMemory();
    /* ...Do some I/O operation involving buf... */
    res = PyString_FromString(buf);
    PyMem_Del(buf); /* allocated with PyMem_New */
    return res;
\end{verbatim}

Note that in the two examples above, the buffer is always
manipulated via functions belonging to the same set. Indeed, it
is required to use the same memory API family for a given
memory block, so that the risk of mixing different allocators is
reduced to a minimum. The following code sequence contains two errors,
one of which is labeled as \emph{fatal} because it mixes two different
allocators operating on different heaps.

\begin{verbatim}
char *buf1 = PyMem_New(char, BUFSIZ);
char *buf2 = (char *) malloc(BUFSIZ);
char *buf3 = (char *) PyMem_Malloc(BUFSIZ);
...
PyMem_Del(buf3);  /* Wrong -- should be PyMem_Free() */
free(buf2);       /* Right -- allocated via malloc() */
free(buf1);       /* Fatal -- should be PyMem_Del()  */
\end{verbatim}

In addition to the functions aimed at handling raw memory blocks from
the Python heap, objects in Python are allocated and released with
\cfunction{PyObject_New()}, \cfunction{PyObject_NewVar()} and
\cfunction{PyObject_Del()}, or with their corresponding macros
\cfunction{PyObject_NEW()}, \cfunction{PyObject_NEW_VAR()} and
\cfunction{PyObject_DEL()}.

These will be explained in the next chapter on defining and
implementing new object types in C.