% Mirror of https://github.com/python/cpython.git, synced 2025-11-03 23:21:29 +00:00
\chapter{Memory Management \label{memory}}
\sectionauthor{Vladimir Marangozov}{Vladimir.Marangozov@inrialpes.fr}


\section{Overview \label{memoryOverview}}

Memory management in Python involves a private heap containing all
Python objects and data structures. This private heap is managed
internally by the \emph{Python memory manager}, whose components deal
with various aspects of dynamic storage management, such as sharing,
segmentation, preallocation and caching.

At the lowest level, a raw memory allocator ensures that there is
enough room in the private heap for storing all Python-related data
by interacting with the memory manager of the operating system. On top
of the raw memory allocator, several object-specific allocators
operate on the same heap and implement distinct memory management
policies adapted to the peculiarities of each object type. For
example, integer objects are managed differently within the heap than
strings, tuples or dictionaries because integers have different
storage requirements and speed/space tradeoffs. The Python memory
manager thus delegates some of the work to the object-specific
allocators, but ensures that the latter operate within the bounds of
the private heap.

It is important to understand that the management of the Python heap
is performed by the interpreter itself and that users have no
control over it, even though they regularly manipulate object pointers
to memory blocks inside that heap.  The allocation of heap space for
Python objects and other internal buffers is performed on demand by
the Python memory manager through the Python/C API functions listed in
this document.

To avoid memory corruption, extension writers should never try to
operate on Python objects with the functions exported by the C
library: \cfunction{malloc()}\ttindex{malloc()},
\cfunction{calloc()}\ttindex{calloc()},
\cfunction{realloc()}\ttindex{realloc()} and
\cfunction{free()}\ttindex{free()}.  Doing so results in
mixed calls between the C allocator and the Python memory manager,
with fatal consequences, because they implement different algorithms
and operate on different heaps.  However, one may safely allocate and
release memory blocks with the C library allocator for purposes
private to the extension, as shown in the following example:

\begin{verbatim}
    PyObject *res;
    char *buf = (char *) malloc(BUFSIZ); /* for I/O */

    if (buf == NULL)
        return PyErr_NoMemory();
    /* ...Do some I/O operation involving buf... */
    res = PyString_FromString(buf);
    free(buf); /* malloc'ed */
    return res;
\end{verbatim}

In this example, the memory request for the I/O buffer is handled by
the C library allocator. The Python memory manager is involved only
in the allocation of the string object returned as a result.

In most situations, however, it is recommended to allocate memory from
the Python heap, precisely because that heap is under the control of
the Python memory manager. For example, this is required when the
interpreter is extended with new object types written in C. Another
reason for using the Python heap is the desire to \emph{inform} the
Python memory manager about the memory needs of the extension module.
Even when the requested memory is used exclusively for internal,
highly specific purposes, delegating all memory requests to the Python
memory manager gives the interpreter a more accurate picture of
its memory footprint as a whole. Consequently, under certain
circumstances, the Python memory manager can decide whether to trigger
appropriate actions, like garbage collection, memory compaction or
other preventive procedures. Note that by using the C library
allocator as shown in the previous example, the memory allocated for
the I/O buffer completely escapes the Python memory manager.


\section{Memory Interface \label{memoryInterface}}

The following function sets, modeled after the ANSI C standard,
but specifying behavior when requesting zero bytes,
are available for allocating and releasing memory from the Python heap:


\begin{cfuncdesc}{void*}{PyMem_Malloc}{size_t n}
  Allocates \var{n} bytes and returns a pointer of type \ctype{void*}
  to the allocated memory, or \NULL{} if the request fails.
  Requesting zero bytes returns a distinct non-\NULL{} pointer if
  possible, as if \cfunction{PyMem_Malloc(1)} had been called instead.
  The memory will not have been initialized in any way.
\end{cfuncdesc}

\begin{cfuncdesc}{void*}{PyMem_Realloc}{void *p, size_t n}
  Resizes the memory block pointed to by \var{p} to \var{n} bytes.
  The contents will be unchanged up to the minimum of the old and the
  new sizes. If \var{p} is \NULL, the call is equivalent to
  \cfunction{PyMem_Malloc(\var{n})}; else if \var{n} is equal to zero, the
  memory block is resized but is not freed, and the returned pointer
  is non-\NULL.  Unless \var{p} is \NULL, it must have been
  returned by a previous call to \cfunction{PyMem_Malloc()} or
  \cfunction{PyMem_Realloc()}.
\end{cfuncdesc}

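A resize request can fail like any other allocation.  The sketch below
shows one common way to handle that, assuming that, as with the ANSI C
\cfunction{realloc()} on which the interface is modeled, a failed
request leaves the original block untouched.  The helper
\cfunction{grow_buffer()} and the \ctype{resize_fn} type are
hypothetical names introduced for illustration; the resize function is
passed as a parameter only so that the sketch also builds without
\file{Python.h}.

\begin{verbatim}
#include <stddef.h>
#include <stdlib.h>

/* Signature shared by realloc() and PyMem_Realloc(). */
typedef void *(*resize_fn)(void *, size_t);

/* Hypothetical helper: resize *bufp to new_size bytes using `resize`.
   Returns 0 on success.  On failure returns -1 and leaves *bufp
   pointing at the old block, which the caller still owns. */
static int
grow_buffer(char **bufp, size_t new_size, resize_fn resize)
{
    char *tmp = (char *) resize(*bufp, new_size);

    if (tmp == NULL)
        return -1;          /* *bufp is untouched and must still be freed */
    *bufp = tmp;            /* the old pointer value is now stale */
    return 0;
}

/* In an extension module:
       if (grow_buffer(&buf, 2 * BUFSIZ, PyMem_Realloc) < 0) {
           PyMem_Free(buf);
           return PyErr_NoMemory();
       }
*/
\end{verbatim}

Assigning the result to a temporary first avoids overwriting the only
pointer to the old block when the request fails, which would otherwise
leak that block.
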
\begin{cfuncdesc}{void}{PyMem_Free}{void *p}
  Frees the memory block pointed to by \var{p}, which must have been
  returned by a previous call to \cfunction{PyMem_Malloc()} or
  \cfunction{PyMem_Realloc()}.  Otherwise, or if
  \cfunction{PyMem_Free(\var{p})} has been called before, undefined
  behavior occurs. If \var{p} is \NULL, no operation is performed.
\end{cfuncdesc}

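Because releasing the same block twice is undefined while releasing
\NULL{} does nothing, a common defensive idiom is to reset a pointer
to \NULL{} as part of releasing it.  The following is a minimal sketch
of that idiom; the macro \cfunction{RELEASE_AND_CLEAR()} is a
hypothetical name, not part of the Python/C API, and it takes the
release function as a parameter so that the sketch also builds without
\file{Python.h}.

\begin{verbatim}
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper macro: release `p` with `release_fn` and reset
   it to NULL, so that an accidental second release does nothing. */
#define RELEASE_AND_CLEAR(release_fn, p) \
    do {                                 \
        void *tmp_ = (p);                \
        (p) = NULL;                      \
        release_fn(tmp_);                \
    } while (0)

/* In an extension module:
       RELEASE_AND_CLEAR(PyMem_Free, buf);
       RELEASE_AND_CLEAR(PyMem_Free, buf);   -- second call frees nothing
*/
\end{verbatim}

The idiom protects only the pointer variable that is passed in; any
other copy of the same address still refers to released memory.
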
The following type-oriented macros are provided for convenience.  Note
that \var{TYPE} refers to any C type.

\begin{cfuncdesc}{\var{TYPE}*}{PyMem_New}{TYPE, size_t n}
  Same as \cfunction{PyMem_Malloc()}, but allocates \code{(\var{n} *
  sizeof(\var{TYPE}))} bytes of memory.  Returns a pointer cast to
  \ctype{\var{TYPE}*}.  The memory will not have been initialized in
  any way.
\end{cfuncdesc}

\begin{cfuncdesc}{\var{TYPE}*}{PyMem_Resize}{void *p, TYPE, size_t n}
  Same as \cfunction{PyMem_Realloc()}, but the memory block is resized
  to \code{(\var{n} * sizeof(\var{TYPE}))} bytes.  Returns a pointer
  cast to \ctype{\var{TYPE}*}.
\end{cfuncdesc}

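The descriptions above do not say what happens when
\code{(\var{n} * sizeof(\var{TYPE}))} exceeds the range of
\ctype{size_t}.  Assuming the multiplication is performed unchecked
(an assumption; the text does not state this either way), a caller
that takes \var{n} from untrusted input may want to validate it first.
The helper \cfunction{array_bytes()} below is a hypothetical name used
only for this sketch.

\begin{verbatim}
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store n * elem_size in *out and return 1 if the
   product fits in a size_t; return 0 (leaving *out untouched) if not. */
static int
array_bytes(size_t n, size_t elem_size, size_t *out)
{
    if (elem_size != 0 && n > SIZE_MAX / elem_size)
        return 0;           /* n * elem_size would wrap around */
    *out = n * elem_size;
    return 1;
}

/* In an extension module:
       size_t nbytes;
       if (!array_bytes(n, sizeof(double), &nbytes))
           return PyErr_NoMemory();
       values = PyMem_New(double, n);
*/
\end{verbatim}

Dividing \code{SIZE_MAX} by the element size keeps the check itself
free of overflow, which a comparison on the already multiplied value
would not be.
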
\begin{cfuncdesc}{void}{PyMem_Del}{void *p}
  Same as \cfunction{PyMem_Free()}.
\end{cfuncdesc}

In addition, the following macro sets are provided for calling the
Python memory allocator directly, without involving the C API functions
listed above. However, note that their use does not preserve binary
compatibility across Python versions and is therefore deprecated in
extension modules.

\cfunction{PyMem_MALLOC()}, \cfunction{PyMem_REALLOC()}, \cfunction{PyMem_FREE()}.

\cfunction{PyMem_NEW()}, \cfunction{PyMem_RESIZE()}, \cfunction{PyMem_DEL()}.


\section{Examples \label{memoryExamples}}

Here is the example from section \ref{memoryOverview}, rewritten so
that the I/O buffer is allocated from the Python heap by using the
first function set:

\begin{verbatim}
    PyObject *res;
    char *buf = (char *) PyMem_Malloc(BUFSIZ); /* for I/O */

    if (buf == NULL)
        return PyErr_NoMemory();
    /* ...Do some I/O operation involving buf... */
    res = PyString_FromString(buf);
    PyMem_Free(buf); /* allocated with PyMem_Malloc */
    return res;
\end{verbatim}

The same code using the type-oriented function set:

\begin{verbatim}
    PyObject *res;
    char *buf = PyMem_New(char, BUFSIZ); /* for I/O */

    if (buf == NULL)
        return PyErr_NoMemory();
    /* ...Do some I/O operation involving buf... */
    res = PyString_FromString(buf);
    PyMem_Del(buf); /* allocated with PyMem_New */
    return res;
\end{verbatim}

Note that in the two examples above, the buffer is always
manipulated via functions belonging to the same set. Indeed, a given
memory block must be handled with a single memory API family
throughout its lifetime, so that the risk of mixing different
allocators is reduced to a minimum. The following code sequence
contains two errors, one of which is labeled as \emph{fatal} because
it mixes two different allocators operating on different heaps.

\begin{verbatim}
char *buf1 = PyMem_New(char, BUFSIZ);
char *buf2 = (char *) malloc(BUFSIZ);
char *buf3 = (char *) PyMem_Malloc(BUFSIZ);
...
PyMem_Del(buf3);  /* Wrong -- should be PyMem_Free() */
free(buf2);       /* Right -- allocated via malloc() */
free(buf1);       /* Fatal -- should be PyMem_Del()  */
\end{verbatim}

In addition to the functions aimed at handling raw memory blocks from
the Python heap, objects in Python are allocated and released with
\cfunction{PyObject_New()}, \cfunction{PyObject_NewVar()} and
\cfunction{PyObject_Del()}, or with their corresponding macros
\cfunction{PyObject_NEW()}, \cfunction{PyObject_NEW_VAR()} and
\cfunction{PyObject_DEL()}.

These will be explained in the next chapter on defining and
implementing new object types in C.