| 
									
										
										
										
											2001-10-12 19:01:43 +00:00
										 |  |  | \chapter{Introduction \label{intro}} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | The Application Programmer's Interface to Python gives C and | 
					
						
							|  |  |  | \Cpp{} programmers access to the Python interpreter at a variety of | 
					
						
							| 
									
										
										
										
											2001-11-28 07:26:15 +00:00
										 |  |  | levels.  The API is equally usable from \Cpp, but for brevity it is | 
					
						
							| 
									
										
										
										
											2001-10-12 19:01:43 +00:00
										 |  |  | generally referred to as the Python/C API.  There are two | 
					
						
							|  |  |  | fundamentally different reasons for using the Python/C API.  The first | 
					
						
							|  |  |  | reason is to write \emph{extension modules} for specific purposes; | 
					
						
							|  |  |  | these are C modules that extend the Python interpreter.  This is | 
					
						
							|  |  |  | probably the most common use.  The second reason is to use Python as a | 
					
						
							|  |  |  | component in a larger application; this technique is generally | 
					
						
							|  |  |  | referred to as \dfn{embedding} Python in an application. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | Writing an extension module is a relatively well-understood process,  | 
					
						
							|  |  |  | where a ``cookbook'' approach works well.  There are several tools  | 
					
						
							|  |  |  | that automate the process to some extent.  While people have embedded  | 
					
						
							|  |  |  | Python in other applications since its early existence, the process of  | 
					
						
							|  |  |  | embedding Python is less straightforward than writing an extension.   | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | Many API functions are useful independent of whether you're embedding  | 
					
						
							|  |  |  | or extending Python; moreover, most applications that embed Python  | 
					
						
							|  |  |  | will need to provide a custom extension as well, so it's probably a  | 
					
						
							|  |  |  | good idea to become familiar with writing an extension before  | 
					
						
							|  |  |  | attempting to embed Python in a real application. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \section{Include Files \label{includes}} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | All function, type and macro definitions needed to use the Python/C | 
					
						
							|  |  |  | API are included in your code by the following line: | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \begin{verbatim} | 
					
						
							|  |  |  | #include "Python.h" | 
					
						
							|  |  |  | \end{verbatim} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | This implies inclusion of the following standard headers: | 
					
						
							|  |  |  | \code{<stdio.h>}, \code{<string.h>}, \code{<errno.h>}, | 
					
						
							|  |  |  | \code{<limits.h>}, and \code{<stdlib.h>} (if available). | 
					
						
							|  |  |  | Since Python may define some pre-processor definitions which affect | 
					
						
							|  |  |  | the standard headers on some systems, you must include \file{Python.h} | 
					
						
							|  |  |  | before any standard headers are included. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | All user visible names defined by Python.h (except those defined by | 
					
						
							|  |  |  | the included standard headers) have one of the prefixes \samp{Py} or | 
					
						
							|  |  |  | \samp{_Py}.  Names beginning with \samp{_Py} are for internal use by | 
					
						
							|  |  |  | the Python implementation and should not be used by extension writers. | 
					
						
							|  |  |  | Structure member names do not have a reserved prefix. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \strong{Important:} user code should never define names that begin | 
					
						
							|  |  |  | with \samp{Py} or \samp{_Py}.  This confuses the reader, and | 
					
						
							|  |  |  | jeopardizes the portability of the user code to future Python | 
					
						
							|  |  |  | versions, which may define additional names beginning with one of | 
					
						
							|  |  |  | these prefixes. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | The header files are typically installed with Python.  On \UNIX, these  | 
					
						
							|  |  |  | are located in the directories | 
					
						
							|  |  |  | \file{\envvar{prefix}/include/python\var{version}/} and | 
					
						
							|  |  |  | \file{\envvar{exec_prefix}/include/python\var{version}/}, where | 
					
						
							|  |  |  | \envvar{prefix} and \envvar{exec_prefix} are defined by the | 
					
						
							|  |  |  | corresponding parameters to Python's \program{configure} script and | 
					
						
							|  |  |  | \var{version} is \code{sys.version[:3]}.  On Windows, the headers are | 
					
						
							|  |  |  | installed in \file{\envvar{prefix}/include}, where \envvar{prefix} is | 
					
						
							|  |  |  | the installation directory specified to the installer. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | To include the headers, place both directories (if different) on your | 
					
						
							|  |  |  | compiler's search path for includes.  Do \emph{not} place the parent | 
					
						
							|  |  |  | directories on the search path and then use | 
					
						
							|  |  |  | \samp{\#include <python\shortversion/Python.h>}; this will break on | 
					
						
							|  |  |  | multi-platform builds since the platform independent headers under | 
					
						
							|  |  |  | \envvar{prefix} include the platform specific headers from | 
					
						
							|  |  |  | \envvar{exec_prefix}. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \Cpp{} users should note that though the API is defined entirely using | 
					
						
							|  |  |  | C, the header files do properly declare the entry points to be | 
					
						
							|  |  |  | \code{extern "C"}, so there is no need to do anything special to use | 
					
						
							|  |  |  | the API from \Cpp. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \section{Objects, Types and Reference Counts \label{objects}} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | Most Python/C API functions have one or more arguments as well as a | 
					
						
							|  |  |  | return value of type \ctype{PyObject*}.  This type is a pointer | 
					
						
							|  |  |  | to an opaque data type representing an arbitrary Python | 
					
						
							|  |  |  | object.  Since all Python object types are treated the same way by the | 
					
						
							|  |  |  | Python language in most situations (e.g., assignments, scope rules, | 
					
						
							|  |  |  | and argument passing), it is only fitting that they should be | 
					
						
							|  |  |  | represented by a single C type.  Almost all Python objects live on the | 
					
						
							|  |  |  | heap: you never declare an automatic or static variable of type | 
					
						
							|  |  |  | \ctype{PyObject}, only pointer variables of type \ctype{PyObject*} can  | 
					
						
							|  |  |  | be declared.  The sole exception are the type objects\obindex{type}; | 
					
						
							|  |  |  | since these must never be deallocated, they are typically static | 
					
						
							|  |  |  | \ctype{PyTypeObject} objects. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | All Python objects (even Python integers) have a \dfn{type} and a | 
					
						
							|  |  |  | \dfn{reference count}.  An object's type determines what kind of object  | 
					
						
							|  |  |  | it is (e.g., an integer, a list, or a user-defined function; there are  | 
					
						
							|  |  |  | many more as explained in the \citetitle[../ref/ref.html]{Python | 
					
						
							|  |  |  | Reference Manual}).  For each of the well-known types there is a macro | 
					
						
							|  |  |  | to check whether an object is of that type; for instance, | 
					
						
							|  |  |  | \samp{PyList_Check(\var{a})} is true if (and only if) the object | 
					
						
							|  |  |  | pointed to by \var{a} is a Python list. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \subsection{Reference Counts \label{refcounts}} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | The reference count is important because today's computers have a  | 
					
						
							|  |  |  | finite (and often severely limited) memory size; it counts how many  | 
					
						
							|  |  |  | different places there are that have a reference to an object.  Such a  | 
					
						
							|  |  |  | place could be another object, or a global (or static) C variable, or  | 
					
						
							|  |  |  | a local variable in some C function.  When an object's reference count  | 
					
						
							|  |  |  | becomes zero, the object is deallocated.  If it contains references to  | 
					
						
							|  |  |  | other objects, their reference count is decremented.  Those other  | 
					
						
							|  |  |  | objects may be deallocated in turn, if this decrement makes their  | 
					
						
							|  |  |  | reference count become zero, and so on.  (There's an obvious problem  | 
					
						
							|  |  |  | with objects that reference each other here; for now, the solution is  | 
					
						
							|  |  |  | ``don't do that.'') | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | Reference counts are always manipulated explicitly.  The normal way is  | 
					
						
							|  |  |  | to use the macro \cfunction{Py_INCREF()}\ttindex{Py_INCREF()} to | 
					
						
							|  |  |  | increment an object's reference count by one, and | 
					
						
							|  |  |  | \cfunction{Py_DECREF()}\ttindex{Py_DECREF()} to decrement it by   | 
					
						
							|  |  |  | one.  The \cfunction{Py_DECREF()} macro is considerably more complex | 
					
						
							|  |  |  | than the incref one, since it must check whether the reference count | 
					
						
							|  |  |  | becomes zero and then cause the object's deallocator to be called. | 
					
						
							|  |  |  | The deallocator is a function pointer contained in the object's type | 
					
						
							|  |  |  | structure.  The type-specific deallocator takes care of decrementing | 
					
						
							|  |  |  | the reference counts for other objects contained in the object if this | 
					
						
							|  |  |  | is a compound object type, such as a list, as well as performing any | 
					
						
							|  |  |  | additional finalization that's needed.  There's no chance that the | 
					
						
							|  |  |  | reference count can overflow; at least as many bits are used to hold | 
					
						
							|  |  |  | the reference count as there are distinct memory locations in virtual | 
					
						
							|  |  |  | memory (assuming \code{sizeof(long) >= sizeof(char*)}).  Thus, the | 
					
						
							|  |  |  | reference count increment is a simple operation. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | It is not necessary to increment an object's reference count for every  | 
					
						
							|  |  |  | local variable that contains a pointer to an object.  In theory, the  | 
					
						
							|  |  |  | object's reference count goes up by one when the variable is made to  | 
					
						
							|  |  |  | point to it and it goes down by one when the variable goes out of  | 
					
						
							|  |  |  | scope.  However, these two cancel each other out, so at the end the  | 
					
						
							|  |  |  | reference count hasn't changed.  The only real reason to use the  | 
					
						
							|  |  |  | reference count is to prevent the object from being deallocated as  | 
					
						
							|  |  |  | long as our variable is pointing to it.  If we know that there is at  | 
					
						
							|  |  |  | least one other reference to the object that lives at least as long as  | 
					
						
							|  |  |  | our variable, there is no need to increment the reference count  | 
					
						
							|  |  |  | temporarily.  An important situation where this arises is in objects  | 
					
						
							|  |  |  | that are passed as arguments to C functions in an extension module  | 
					
						
							|  |  |  | that are called from Python; the call mechanism guarantees to hold a  | 
					
						
							|  |  |  | reference to every argument for the duration of the call. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | However, a common pitfall is to extract an object from a list and | 
					
						
							|  |  |  | hold on to it for a while without incrementing its reference count. | 
					
						
							|  |  |  | Some other operation might conceivably remove the object from the | 
					
						
							|  |  |  | list, decrementing its reference count and possible deallocating it. | 
					
						
							|  |  |  | The real danger is that innocent-looking operations may invoke | 
					
						
							|  |  |  | arbitrary Python code which could do this; there is a code path which | 
					
						
							|  |  |  | allows control to flow back to the user from a \cfunction{Py_DECREF()}, | 
					
						
							|  |  |  | so almost any operation is potentially dangerous. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | A safe approach is to always use the generic operations (functions  | 
					
						
							|  |  |  | whose name begins with \samp{PyObject_}, \samp{PyNumber_}, | 
					
						
							|  |  |  | \samp{PySequence_} or \samp{PyMapping_}).  These operations always | 
					
						
							|  |  |  | increment the reference count of the object they return.  This leaves | 
					
						
							|  |  |  | the caller with the responsibility to call | 
					
						
							|  |  |  | \cfunction{Py_DECREF()} when they are done with the result; this soon | 
					
						
							|  |  |  | becomes second nature. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \subsubsection{Reference Count Details \label{refcountDetails}} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | The reference count behavior of functions in the Python/C API is best  | 
					
						
							|  |  |  | explained in terms of \emph{ownership of references}.  Note that we  | 
					
						
							|  |  |  | talk of owning references, never of owning objects; objects are always  | 
					
						
							|  |  |  | shared!  When a function owns a reference, it has to dispose of it  | 
					
						
							|  |  |  | properly --- either by passing ownership on (usually to its caller) or  | 
					
						
							|  |  |  | by calling \cfunction{Py_DECREF()} or \cfunction{Py_XDECREF()}.  When | 
					
						
							|  |  |  | a function passes ownership of a reference on to its caller, the | 
					
						
							|  |  |  | caller is said to receive a \emph{new} reference.  When no ownership | 
					
						
							|  |  |  | is transferred, the caller is said to \emph{borrow} the reference. | 
					
						
							|  |  |  | Nothing needs to be done for a borrowed reference. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | Conversely, when a calling function passes it a reference to an  | 
					
						
							|  |  |  | object, there are two possibilities: the function \emph{steals} a  | 
					
						
							|  |  |  | reference to the object, or it does not.  Few functions steal  | 
					
						
							|  |  |  | references; the two notable exceptions are | 
					
						
							|  |  |  | \cfunction{PyList_SetItem()}\ttindex{PyList_SetItem()} and | 
					
						
							|  |  |  | \cfunction{PyTuple_SetItem()}\ttindex{PyTuple_SetItem()}, which  | 
					
						
							|  |  |  | steal a reference to the item (but not to the tuple or list into which | 
					
						
							|  |  |  | the item is put!).  These functions were designed to steal a reference | 
					
						
							|  |  |  | because of a common idiom for populating a tuple or list with newly | 
					
						
							|  |  |  | created objects; for example, the code to create the tuple \code{(1, | 
					
						
							|  |  |  | 2, "three")} could look like this (forgetting about error handling for | 
					
						
							|  |  |  | the moment; a better way to code this is shown below): | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \begin{verbatim} | 
					
						
							|  |  |  | PyObject *t; | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | t = PyTuple_New(3); | 
					
						
							|  |  |  | PyTuple_SetItem(t, 0, PyInt_FromLong(1L)); | 
					
						
							|  |  |  | PyTuple_SetItem(t, 1, PyInt_FromLong(2L)); | 
					
						
							|  |  |  | PyTuple_SetItem(t, 2, PyString_FromString("three")); | 
					
						
							|  |  |  | \end{verbatim} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | Incidentally, \cfunction{PyTuple_SetItem()} is the \emph{only} way to | 
					
						
							|  |  |  | set tuple items; \cfunction{PySequence_SetItem()} and | 
					
						
							|  |  |  | \cfunction{PyObject_SetItem()} refuse to do this since tuples are an | 
					
						
							|  |  |  | immutable data type.  You should only use | 
					
						
							|  |  |  | \cfunction{PyTuple_SetItem()} for tuples that you are creating | 
					
						
							|  |  |  | yourself. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | Equivalent code for populating a list can be written using  | 
					
						
							|  |  |  | \cfunction{PyList_New()} and \cfunction{PyList_SetItem()}.  Such code | 
					
						
							|  |  |  | can also use \cfunction{PySequence_SetItem()}; this illustrates the | 
					
						
							|  |  |  | difference between the two (the extra \cfunction{Py_DECREF()} calls): | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \begin{verbatim} | 
					
						
							|  |  |  | PyObject *l, *x; | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | l = PyList_New(3); | 
					
						
							|  |  |  | x = PyInt_FromLong(1L); | 
					
						
							|  |  |  | PySequence_SetItem(l, 0, x); Py_DECREF(x); | 
					
						
							|  |  |  | x = PyInt_FromLong(2L); | 
					
						
							|  |  |  | PySequence_SetItem(l, 1, x); Py_DECREF(x); | 
					
						
							|  |  |  | x = PyString_FromString("three"); | 
					
						
							|  |  |  | PySequence_SetItem(l, 2, x); Py_DECREF(x); | 
					
						
							|  |  |  | \end{verbatim} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | You might find it strange that the ``recommended'' approach takes more | 
					
						
							|  |  |  | code.  However, in practice, you will rarely use these ways of | 
					
						
							|  |  |  | creating and populating a tuple or list.  There's a generic function, | 
					
						
							|  |  |  | \cfunction{Py_BuildValue()}, that can create most common objects from | 
					
						
							|  |  |  | C values, directed by a \dfn{format string}.  For example, the | 
					
						
							|  |  |  | above two blocks of code could be replaced by the following (which | 
					
						
							|  |  |  | also takes care of the error checking): | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \begin{verbatim} | 
					
						
							|  |  |  | PyObject *t, *l; | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | t = Py_BuildValue("(iis)", 1, 2, "three"); | 
					
						
							|  |  |  | l = Py_BuildValue("[iis]", 1, 2, "three"); | 
					
						
							|  |  |  | \end{verbatim} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | It is much more common to use \cfunction{PyObject_SetItem()} and | 
					
						
							|  |  |  | friends with items whose references you are only borrowing, like | 
					
						
							|  |  |  | arguments that were passed in to the function you are writing.  In | 
					
						
							|  |  |  | that case, their behaviour regarding reference counts is much saner, | 
					
						
							|  |  |  | since you don't have to increment a reference count so you can give a | 
					
						
							|  |  |  | reference away (``have it be stolen'').  For example, this function | 
					
						
							|  |  |  | sets all items of a list (actually, any mutable sequence) to a given | 
					
						
							|  |  |  | item: | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \begin{verbatim} | 
					
						
							| 
									
										
										
										
											2001-10-25 15:53:44 +00:00
										 |  |  | int | 
					
						
							|  |  |  | set_all(PyObject *target, PyObject *item) | 
					
						
							| 
									
										
										
										
											2001-10-12 19:01:43 +00:00
										 |  |  | { | 
					
						
							|  |  |  |     int i, n; | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     n = PyObject_Length(target); | 
					
						
							|  |  |  |     if (n < 0) | 
					
						
							|  |  |  |         return -1; | 
					
						
							|  |  |  |     for (i = 0; i < n; i++) { | 
					
						
							|  |  |  |         if (PyObject_SetItem(target, i, item) < 0) | 
					
						
							|  |  |  |             return -1; | 
					
						
							|  |  |  |     } | 
					
						
							|  |  |  |     return 0; | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | \end{verbatim} | 
					
						
							|  |  |  | \ttindex{set_all()} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | The situation is slightly different for function return values.   | 
					
						
							|  |  |  | While passing a reference to most functions does not change your  | 
					
						
							|  |  |  | ownership responsibilities for that reference, many functions that  | 
					
						
							|  |  |  | return a referece to an object give you ownership of the reference. | 
					
						
							|  |  |  | The reason is simple: in many cases, the returned object is created  | 
					
						
							|  |  |  | on the fly, and the reference you get is the only reference to the  | 
					
						
							|  |  |  | object.  Therefore, the generic functions that return object  | 
					
						
							|  |  |  | references, like \cfunction{PyObject_GetItem()} and  | 
					
						
							|  |  |  | \cfunction{PySequence_GetItem()}, always return a new reference (the | 
					
						
							|  |  |  | caller becomes the owner of the reference). | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | It is important to realize that whether you own a reference returned  | 
					
						
							|  |  |  | by a function depends on which function you call only --- \emph{the | 
					
						
							| 
									
										
										
										
											2003-10-13 17:47:30 +00:00
										 |  |  | plumage} (the type of the object passed as an | 
					
						
							| 
									
										
										
										
											2001-10-12 19:01:43 +00:00
										 |  |  | argument to the function) \emph{doesn't enter into it!}  Thus, if you  | 
					
						
							|  |  |  | extract an item from a list using \cfunction{PyList_GetItem()}, you | 
					
						
							|  |  |  | don't own the reference --- but if you obtain the same item from the | 
					
						
							|  |  |  | same list using \cfunction{PySequence_GetItem()} (which happens to | 
					
						
							|  |  |  | take exactly the same arguments), you do own a reference to the | 
					
						
							|  |  |  | returned object. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | Here is an example of how you could write a function that computes the | 
					
						
							|  |  |  | sum of the items in a list of integers; once using  | 
					
						
							|  |  |  | \cfunction{PyList_GetItem()}\ttindex{PyList_GetItem()}, and once using | 
					
						
							|  |  |  | \cfunction{PySequence_GetItem()}\ttindex{PySequence_GetItem()}. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \begin{verbatim} | 
					
						
							| 
									
										
										
										
											2001-10-25 15:53:44 +00:00
										 |  |  | long | 
					
						
							|  |  |  | sum_list(PyObject *list) | 
					
						
							| 
									
										
										
										
											2001-10-12 19:01:43 +00:00
										 |  |  | { | 
					
						
							|  |  |  |     int i, n; | 
					
						
							|  |  |  |     long total = 0; | 
					
						
							|  |  |  |     PyObject *item; | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     n = PyList_Size(list); | 
					
						
							|  |  |  |     if (n < 0) | 
					
						
							|  |  |  |         return -1; /* Not a list */ | 
					
						
							|  |  |  |     for (i = 0; i < n; i++) { | 
					
						
							|  |  |  |         item = PyList_GetItem(list, i); /* Can't fail */ | 
					
						
							|  |  |  |         if (!PyInt_Check(item)) continue; /* Skip non-integers */ | 
					
						
							|  |  |  |         total += PyInt_AsLong(item); | 
					
						
							|  |  |  |     } | 
					
						
							|  |  |  |     return total; | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | \end{verbatim} | 
					
						
							|  |  |  | \ttindex{sum_list()} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \begin{verbatim} | 
					
						
							| 
									
										
										
										
											2001-10-25 15:53:44 +00:00
										 |  |  | long | 
					
						
							|  |  |  | sum_sequence(PyObject *sequence) | 
					
						
							| 
									
										
										
										
											2001-10-12 19:01:43 +00:00
										 |  |  | { | 
					
						
							|  |  |  |     int i, n; | 
					
						
							|  |  |  |     long total = 0; | 
					
						
							|  |  |  |     PyObject *item; | 
					
						
							|  |  |  |     n = PySequence_Length(sequence); | 
					
						
							|  |  |  |     if (n < 0) | 
					
						
							|  |  |  |         return -1; /* Has no length */ | 
					
						
							|  |  |  |     for (i = 0; i < n; i++) { | 
					
						
							|  |  |  |         item = PySequence_GetItem(sequence, i); | 
					
						
							|  |  |  |         if (item == NULL) | 
					
						
							|  |  |  |             return -1; /* Not a sequence, or other failure */ | 
					
						
							|  |  |  |         if (PyInt_Check(item)) | 
					
						
							|  |  |  |             total += PyInt_AsLong(item); | 
					
						
							|  |  |  |         Py_DECREF(item); /* Discard reference ownership */ | 
					
						
							|  |  |  |     } | 
					
						
							|  |  |  |     return total; | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | \end{verbatim} | 
					
						
							|  |  |  | \ttindex{sum_sequence()} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \subsection{Types \label{types}} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | There are few other data types that play a significant role in  | 
					
						
							|  |  |  | the Python/C API; most are simple C types such as \ctype{int},  | 
					
						
							|  |  |  | \ctype{long}, \ctype{double} and \ctype{char*}.  A few structure types  | 
					
						
							|  |  |  | are used to describe static tables used to list the functions exported  | 
					
						
							|  |  |  | by a module or the data attributes of a new object type, and another | 
					
						
							|  |  |  | is used to describe the value of a complex number.  These will  | 
					
						
							|  |  |  | be discussed together with the functions that use them. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \section{Exceptions \label{exceptions}} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | The Python programmer only needs to deal with exceptions if specific  | 
					
						
							|  |  |  | error handling is required; unhandled exceptions are automatically  | 
					
						
							|  |  |  | propagated to the caller, then to the caller's caller, and so on, until | 
					
						
							|  |  |  | they reach the top-level interpreter, where they are reported to the  | 
					
						
							|  |  |  | user accompanied by a stack traceback. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | For C programmers, however, error checking always has to be explicit.   | 
					
						
							|  |  |  | All functions in the Python/C API can raise exceptions, unless an  | 
					
						
							|  |  |  | explicit claim is made otherwise in a function's documentation.  In  | 
					
						
							|  |  |  | general, when a function encounters an error, it sets an exception,  | 
					
						
							|  |  |  | discards any object references that it owns, and returns an  | 
					
						
							|  |  |  | error indicator --- usually \NULL{} or \code{-1}.  A few functions  | 
					
						
							|  |  |  | return a Boolean true/false result, with false indicating an error. | 
					
						
							|  |  |  | Very few functions return no explicit error indicator or have an  | 
					
						
							|  |  |  | ambiguous return value, and require explicit testing for errors with  | 
					
						
							|  |  |  | \cfunction{PyErr_Occurred()}\ttindex{PyErr_Occurred()}. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | Exception state is maintained in per-thread storage (this is  | 
					
						
							|  |  |  | equivalent to using global storage in an unthreaded application).  A  | 
					
						
							|  |  |  | thread can be in one of two states: an exception has occurred, or not. | 
					
						
							|  |  |  | The function \cfunction{PyErr_Occurred()} can be used to check for | 
					
						
							|  |  |  | this: it returns a borrowed reference to the exception type object | 
					
						
							|  |  |  | when an exception has occurred, and \NULL{} otherwise.  There are a | 
					
						
							|  |  |  | number of functions to set the exception state: | 
					
						
							|  |  |  | \cfunction{PyErr_SetString()}\ttindex{PyErr_SetString()} is the most | 
					
						
							|  |  |  | common (though not the most general) function to set the exception | 
					
						
							|  |  |  | state, and \cfunction{PyErr_Clear()}\ttindex{PyErr_Clear()} clears the | 
					
						
							|  |  |  | exception state. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | The full exception state consists of three objects (all of which can  | 
					
						
							|  |  |  | be \NULL): the exception type, the corresponding exception  | 
					
						
							|  |  |  | value, and the traceback.  These have the same meanings as the Python | 
					
						
							|  |  |  | \withsubitem{(in module sys)}{ | 
					
						
							|  |  |  |   \ttindex{exc_type}\ttindex{exc_value}\ttindex{exc_traceback}} | 
					
						
							|  |  |  | objects \code{sys.exc_type}, \code{sys.exc_value}, and | 
					
						
							|  |  |  | \code{sys.exc_traceback}; however, they are not the same: the Python | 
					
						
							|  |  |  | objects represent the last exception being handled by a Python  | 
					
						
							|  |  |  | \keyword{try} \ldots\ \keyword{except} statement, while the C level | 
					
						
							|  |  |  | exception state only exists while an exception is being passed on | 
					
						
							|  |  |  | between C functions until it reaches the Python bytecode interpreter's  | 
					
						
							|  |  |  | main loop, which takes care of transferring it to \code{sys.exc_type} | 
					
						
							|  |  |  | and friends. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | Note that starting with Python 1.5, the preferred, thread-safe way to  | 
					
						
							|  |  |  | access the exception state from Python code is to call the function | 
					
						
							|  |  |  | \withsubitem{(in module sys)}{\ttindex{exc_info()}} | 
					
						
							|  |  |  | \function{sys.exc_info()}, which returns the per-thread exception state  | 
					
						
							|  |  |  | for Python code.  Also, the semantics of both ways to access the  | 
					
						
							|  |  |  | exception state have changed so that a function which catches an  | 
					
						
							|  |  |  | exception will save and restore its thread's exception state so as to  | 
					
						
							|  |  |  | preserve the exception state of its caller.  This prevents common bugs  | 
					
						
							|  |  |  | in exception handling code caused by an innocent-looking function  | 
					
						
							|  |  |  | overwriting the exception being handled; it also reduces the often  | 
					
						
							|  |  |  | unwanted lifetime extension for objects that are referenced by the  | 
					
						
							|  |  |  | stack frames in the traceback. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | As a general principle, a function that calls another function to  | 
					
						
							|  |  |  | perform some task should check whether the called function raised an  | 
					
						
							|  |  |  | exception, and if so, pass the exception state on to its caller.  It  | 
					
						
							|  |  |  | should discard any object references that it owns, and return an  | 
					
						
							|  |  |  | error indicator, but it should \emph{not} set another exception --- | 
					
						
							|  |  |  | that would overwrite the exception that was just raised, and lose | 
					
						
							|  |  |  | important information about the exact cause of the error. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | A simple example of detecting exceptions and passing them on is shown | 
					
						
							|  |  |  | in the \cfunction{sum_sequence()}\ttindex{sum_sequence()} example | 
					
						
							|  |  |  | above.  It so happens that that example doesn't need to clean up any | 
					
						
							|  |  |  | owned references when it detects an error.  The following example | 
					
						
							|  |  |  | function shows some error cleanup.  First, to remind you why you like | 
					
						
							|  |  |  | Python, we show the equivalent Python code: | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \begin{verbatim} | 
					
						
							|  |  |  | def incr_item(dict, key): | 
					
						
							|  |  |  |     try: | 
					
						
							|  |  |  |         item = dict[key] | 
					
						
							|  |  |  |     except KeyError: | 
					
						
							|  |  |  |         item = 0 | 
					
						
							|  |  |  |     dict[key] = item + 1 | 
					
						
							|  |  |  | \end{verbatim} | 
					
						
							|  |  |  | \ttindex{incr_item()} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | Here is the corresponding C code, in all its glory: | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \begin{verbatim} | 
					
						
							| 
									
										
										
										
											2001-10-25 15:53:44 +00:00
										 |  |  | int | 
					
						
							|  |  |  | incr_item(PyObject *dict, PyObject *key) | 
					
						
							| 
									
										
										
										
											2001-10-12 19:01:43 +00:00
										 |  |  | { | 
					
						
							|  |  |  |     /* Objects all initialized to NULL for Py_XDECREF */ | 
					
						
							|  |  |  |     PyObject *item = NULL, *const_one = NULL, *incremented_item = NULL; | 
					
						
							|  |  |  |     int rv = -1; /* Return value initialized to -1 (failure) */ | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     item = PyObject_GetItem(dict, key); | 
					
						
							|  |  |  |     if (item == NULL) { | 
					
						
							|  |  |  |         /* Handle KeyError only: */ | 
					
						
							|  |  |  |         if (!PyErr_ExceptionMatches(PyExc_KeyError)) | 
					
						
							|  |  |  |             goto error; | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |         /* Clear the error and use zero: */ | 
					
						
							|  |  |  |         PyErr_Clear(); | 
					
						
							|  |  |  |         item = PyInt_FromLong(0L); | 
					
						
							|  |  |  |         if (item == NULL) | 
					
						
							|  |  |  |             goto error; | 
					
						
							|  |  |  |     } | 
					
						
							|  |  |  |     const_one = PyInt_FromLong(1L); | 
					
						
							|  |  |  |     if (const_one == NULL) | 
					
						
							|  |  |  |         goto error; | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     incremented_item = PyNumber_Add(item, const_one); | 
					
						
							|  |  |  |     if (incremented_item == NULL) | 
					
						
							|  |  |  |         goto error; | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     if (PyObject_SetItem(dict, key, incremented_item) < 0) | 
					
						
							|  |  |  |         goto error; | 
					
						
							|  |  |  |     rv = 0; /* Success */ | 
					
						
							|  |  |  |     /* Continue with cleanup code */ | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |  error: | 
					
						
							|  |  |  |     /* Cleanup code, shared by success and failure path */ | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     /* Use Py_XDECREF() to ignore NULL references */ | 
					
						
							|  |  |  |     Py_XDECREF(item); | 
					
						
							|  |  |  |     Py_XDECREF(const_one); | 
					
						
							|  |  |  |     Py_XDECREF(incremented_item); | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  |     return rv; /* -1 for error, 0 for success */ | 
					
						
							|  |  |  | } | 
					
						
							|  |  |  | \end{verbatim} | 
					
						
							|  |  |  | \ttindex{incr_item()} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | This example represents an endorsed use of the \keyword{goto} statement  | 
					
						
							|  |  |  | in C!  It illustrates the use of | 
					
						
							|  |  |  | \cfunction{PyErr_ExceptionMatches()}\ttindex{PyErr_ExceptionMatches()} and | 
					
						
							|  |  |  | \cfunction{PyErr_Clear()}\ttindex{PyErr_Clear()} to | 
					
						
							|  |  |  | handle specific exceptions, and the use of | 
					
						
							|  |  |  | \cfunction{Py_XDECREF()}\ttindex{Py_XDECREF()} to | 
					
						
							|  |  |  | dispose of owned references that may be \NULL{} (note the | 
					
						
							|  |  |  | \character{X} in the name; \cfunction{Py_DECREF()} would crash when | 
					
						
							|  |  |  | confronted with a \NULL{} reference).  It is important that the | 
					
						
							|  |  |  | variables used to hold owned references are initialized to \NULL{} for | 
					
						
							|  |  |  | this to work; likewise, the proposed return value is initialized to | 
					
						
							|  |  |  | \code{-1} (failure) and only set to success after the final call made | 
					
						
							|  |  |  | is successful. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \section{Embedding Python \label{embedding}} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | The one important task that only embedders (as opposed to extension | 
					
						
							|  |  |  | writers) of the Python interpreter have to worry about is the | 
					
						
							|  |  |  | initialization, and possibly the finalization, of the Python | 
					
						
							|  |  |  | interpreter.  Most functionality of the interpreter can only be used | 
					
						
							|  |  |  | after the interpreter has been initialized. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | The basic initialization function is | 
					
						
							|  |  |  | \cfunction{Py_Initialize()}\ttindex{Py_Initialize()}. | 
					
						
							|  |  |  | This initializes the table of loaded modules, and creates the | 
					
						
							|  |  |  | fundamental modules \module{__builtin__}\refbimodindex{__builtin__}, | 
					
						
							|  |  |  | \module{__main__}\refbimodindex{__main__}, \module{sys}\refbimodindex{sys}, | 
					
						
							|  |  |  | and \module{exceptions}.\refbimodindex{exceptions}  It also initializes | 
					
						
							|  |  |  | the module search path (\code{sys.path}).%
 | 
					
						
							|  |  |  | \indexiii{module}{search}{path} | 
					
						
							|  |  |  | \withsubitem{(in module sys)}{\ttindex{path}} | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | \cfunction{Py_Initialize()} does not set the ``script argument list''  | 
					
						
							|  |  |  | (\code{sys.argv}).  If this variable is needed by Python code that  | 
					
						
							|  |  |  | will be executed later, it must be set explicitly with a call to  | 
					
						
							|  |  |  | \code{PySys_SetArgv(\var{argc}, | 
					
						
							|  |  |  | \var{argv})}\ttindex{PySys_SetArgv()} subsequent to the call to | 
					
						
							|  |  |  | \cfunction{Py_Initialize()}. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | On most systems (in particular, on \UNIX{} and Windows, although the | 
					
						
							|  |  |  | details are slightly different), | 
					
						
							|  |  |  | \cfunction{Py_Initialize()} calculates the module search path based | 
					
						
							|  |  |  | upon its best guess for the location of the standard Python | 
					
						
							|  |  |  | interpreter executable, assuming that the Python library is found in a | 
					
						
							|  |  |  | fixed location relative to the Python interpreter executable.  In | 
					
						
							|  |  |  | particular, it looks for a directory named | 
					
						
							|  |  |  | \file{lib/python\shortversion} relative to the parent directory where | 
					
						
							|  |  |  | the executable named \file{python} is found on the shell command | 
					
						
							|  |  |  | search path (the environment variable \envvar{PATH}). | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | For instance, if the Python executable is found in | 
					
						
							|  |  |  | \file{/usr/local/bin/python}, it will assume that the libraries are in | 
					
						
							|  |  |  | \file{/usr/local/lib/python\shortversion}.  (In fact, this particular path | 
					
						
							|  |  |  | is also the ``fallback'' location, used when no executable file named | 
					
						
							|  |  |  | \file{python} is found along \envvar{PATH}.)  The user can override | 
					
						
							|  |  |  | this behavior by setting the environment variable \envvar{PYTHONHOME}, | 
					
						
							|  |  |  | or insert additional directories in front of the standard path by | 
					
						
							|  |  |  | setting \envvar{PYTHONPATH}. | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | The embedding application can steer the search by calling  | 
					
						
							|  |  |  | \code{Py_SetProgramName(\var{file})}\ttindex{Py_SetProgramName()} \emph{before} calling  | 
					
						
							|  |  |  | \cfunction{Py_Initialize()}.  Note that \envvar{PYTHONHOME} still | 
					
						
							|  |  |  | overrides this and \envvar{PYTHONPATH} is still inserted in front of | 
					
						
							|  |  |  | the standard path.  An application that requires total control has to | 
					
						
							|  |  |  | provide its own implementation of | 
					
						
							|  |  |  | \cfunction{Py_GetPath()}\ttindex{Py_GetPath()}, | 
					
						
							|  |  |  | \cfunction{Py_GetPrefix()}\ttindex{Py_GetPrefix()}, | 
					
						
							|  |  |  | \cfunction{Py_GetExecPrefix()}\ttindex{Py_GetExecPrefix()}, and | 
					
						
							|  |  |  | \cfunction{Py_GetProgramFullPath()}\ttindex{Py_GetProgramFullPath()} (all | 
					
						
							|  |  |  | defined in \file{Modules/getpath.c}). | 
					
						
							|  |  |  | 
 | 
					
						
							|  |  |  | Sometimes, it is desirable to ``uninitialize'' Python.  For instance,  | 
					
						
							|  |  |  | the application may want to start over (make another call to  | 
					
						
							|  |  |  | \cfunction{Py_Initialize()}) or the application is simply done with its  | 
					
						
							|  |  |  | use of Python and wants to free all memory allocated by Python.  This | 
					
						
							|  |  |  | can be accomplished by calling \cfunction{Py_Finalize()}.  The function | 
					
						
							|  |  |  | \cfunction{Py_IsInitialized()}\ttindex{Py_IsInitialized()} returns | 
					
						
							|  |  |  | true if Python is currently in the initialized state.  More | 
					
						
							|  |  |  | information about these functions is given in a later chapter. |