
\chapter{Embedding Python in Another Application
     \label{embedding}}

The previous chapters discussed how to extend Python, that is, how to
extend the functionality of Python by attaching a library of C
functions to it.  It is also possible to do it the other way around:
enrich your C/\Cpp{} application by embedding Python in it.  Embedding
lets you implement some of your application's functionality in Python
rather than C or \Cpp.
This can be used for many purposes; one example would be to allow
users to tailor the application to their needs by writing some scripts
in Python.  You can also use it yourself if some of the functionality
can be written in Python more easily.

Embedding Python is similar to extending it, but not quite.  The
difference is that when you extend Python, the main program of the
application is still the Python interpreter, while if you embed
Python, the main program may have nothing to do with Python ---
instead, some parts of the application occasionally call the Python
interpreter to run some Python code.

So if you are embedding Python, you are providing your own main
program.  One of the things this main program has to do is initialize
the Python interpreter.  At the very least, you have to call the
function \cfunction{Py_Initialize()} (on Mac OS, call
\cfunction{PyMac_Initialize()} instead).  There are optional calls to
pass command line arguments to Python.  Then later you can call the
interpreter from any part of the application.

There are several different ways to call the interpreter: you can pass
a string containing Python statements to
\cfunction{PyRun_SimpleString()}, or you can pass a stdio file pointer
and a file name (for identification in error messages only) to
\cfunction{PyRun_SimpleFile()}.  You can also call the lower-level
operations described in the previous chapters to construct and use
Python objects.

A simple demo of embedding Python can be found in the directory
\file{Demo/embed/} of the source distribution.


\begin{seealso}
  \seetitle[../api/api.html]{Python/C API Reference Manual}{The
            details of Python's C interface are given in this manual.
            A great deal of necessary information can be found here.}
\end{seealso}


\section{Very High Level Embedding
         \label{high-level-embedding}}

The simplest form of embedding Python is the use of the very
high level interface. This interface is intended to execute a
Python script without needing to interact with the application
directly. This can for example be used to perform some operation
on a file.

\begin{verbatim}
#include <Python.h>

int
main(int argc, char *argv[])
{
  Py_Initialize();
  PyRun_SimpleString("from time import time,ctime\n"
                     "print 'Today is',ctime(time())\n");
  Py_Finalize();
  return 0;
}
\end{verbatim}

The above code first initializes the Python interpreter with
\cfunction{Py_Initialize()}, followed by the execution of a hard-coded
Python script that prints the date and time.  Afterwards, the
\cfunction{Py_Finalize()} call shuts the interpreter down, followed by
the end of the program.  In a real program, you may want to get the
Python script from another source, perhaps a text-editor routine, a
file, or a database.  Getting the Python code from a file is better
done with the \cfunction{PyRun_SimpleFile()} function, which
saves you the trouble of allocating memory space and loading the file
contents.


\section{Beyond Very High Level Embedding: An Overview
         \label{lower-level-embedding}}

The high level interface gives you the ability to execute
arbitrary pieces of Python code from your application, but
exchanging data values is quite cumbersome, to say the least.  If
you need to exchange data, you should use the lower level calls.  At the
cost of having to write more C code, you can achieve almost anything.

It should be noted that extending Python and embedding Python
are quite the same activity, despite the different intent. Most
topics discussed in the previous chapters are still valid. To
show this, consider what the extension code from Python to C
really does:

\begin{enumerate}
    \item Convert data values from Python to C,
    \item Perform a function call to a C routine using the
        converted values, and
    \item Convert the data values from the call from C to Python.
\end{enumerate}

When embedding Python, the interface code does:

\begin{enumerate}
    \item Convert data values from C to Python,
    \item Perform a function call to a Python interface routine
        using the converted values, and
    \item Convert the data values from the call from Python to C.
\end{enumerate}

As you can see, the data conversion steps are simply swapped to
accommodate the different direction of the cross-language transfer.
The only difference is the routine that you call between both
data conversions.  When extending, you call a C routine; when
embedding, you call a Python routine.

This chapter will not discuss how to convert data from Python
to C and vice versa.  Also, proper use of references and dealing
with errors are assumed to be understood.  Since these aspects do not
differ from extending the interpreter, you can refer to earlier
chapters for the required information.


\section{Pure Embedding
         \label{pure-embedding}}

The first program aims to execute a function in a Python
script.  As in the section about the very high level interface,
the Python interpreter does not directly interact with the
application (but that will change in the next section).

The code to run a function defined in a Python script is:

\verbatiminput{run-func.c}

This code loads a Python script using \code{argv[1]}, and calls the
function named in \code{argv[2]}.  Its integer arguments are the other
values of the \code{argv} array.  If you compile and link this
program (let's call the finished executable \program{call}), and use
it to execute a Python script, such as:

\begin{verbatim}
def multiply(a,b):
    print "Thy shall add", a, "times", b
    c = 0
    for i in range(0, a):
        c = c + b
    return c
\end{verbatim}

then the result should be:

\begin{verbatim}
$ call multiply multiply 3 2
Thy shall add 3 times 2
Result of call: 6
\end{verbatim} % $

Although the program is quite large for its functionality, most of the
code is for data conversion between Python and C, and for error
reporting.  The interesting part with respect to embedding Python
starts with

\begin{verbatim}
    Py_Initialize();
    pName = PyString_FromString(argv[1]);
    /* Error checking of pName left out */
    pModule = PyImport_Import(pName);
\end{verbatim}

After initializing the interpreter, the script is loaded using
\cfunction{PyImport_Import()}.  This routine needs a Python string
as its argument, which is constructed using the
\cfunction{PyString_FromString()} data conversion routine.

\begin{verbatim}
    pFunc = PyObject_GetAttrString(pModule, argv[2]);
    /* pFunc is a new reference */

    if (pFunc && PyCallable_Check(pFunc)) {
        ...
    }
    Py_XDECREF(pFunc);
\end{verbatim}

Once the script is loaded, the name we're looking for is retrieved
using \cfunction{PyObject_GetAttrString()}.  If the name exists, and
the object returned is callable, you can safely assume that it is a
function.  The program then proceeds by constructing a tuple of
arguments as normal.  The call to the Python function is then made
with:

\begin{verbatim}
    pValue = PyObject_CallObject(pFunc, pArgs);
\end{verbatim}

Upon return of the function, \code{pValue} is either \NULL{} or it
contains a reference to the return value of the function.  Be sure to
release the reference after examining the value.
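
As a rough Python-level picture of what the C fragments above do, the
following sketch mirrors each call with its closest Python
counterpart.  The helper name \code{call_by_name} is hypothetical and
is not part of \file{run-func.c}.  The sketch assumes an interpreter
that provides the \module{importlib} module, whose
\function{import_module()} stands in for
\cfunction{PyImport_Import()}.  Reference counting has no counterpart
here because the interpreter manages it for Python code.

```python
import importlib


def call_by_name(module_name, func_name, *raw_args):
    """Import module_name, look up func_name, and call it with integers."""
    # Counterpart of PyImport_Import(pName): import a module named by a string.
    module = importlib.import_module(module_name)
    # Counterpart of PyObject_GetAttrString(pModule, argv[2]).
    func = getattr(module, func_name, None)
    # Counterpart of the "pFunc && PyCallable_Check(pFunc)" test.
    if func is None or not callable(func):
        raise TypeError("cannot find callable %r in module %r"
                        % (func_name, module_name))
    # The remaining command line values are converted to integers.
    args = tuple(int(value) for value in raw_args)
    # Counterpart of PyObject_CallObject(pFunc, pArgs).
    return func(*args)
```

For instance, \code{call_by_name("operator", "mul", "3", "2")} imports
the standard \module{operator} module and returns \code{6}.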


\section{Extending Embedded Python
         \label{extending-with-embedding}}

Until now, the embedded Python interpreter had no access to
functionality from the application itself.  The Python API allows this
by extending the embedded interpreter.  That is, the embedded
interpreter gets extended with routines provided by the application.
While it sounds complex, it is not so bad.  Simply forget for a while
that the application starts the Python interpreter.  Instead, consider
the application to be a set of subroutines, and write some glue code
that gives Python access to those routines, just like you would write
a normal Python extension.  For example:

\begin{verbatim}
static int numargs=0;

/* Return the number of arguments of the application command line */
static PyObject*
emb_numargs(PyObject *self, PyObject *args)
{
    if(!PyArg_ParseTuple(args, ":numargs"))
        return NULL;
    return Py_BuildValue("i", numargs);
}

static PyMethodDef EmbMethods[] = {
    {"numargs", emb_numargs, METH_VARARGS,
     "Return the number of arguments received by the process."},
    {NULL, NULL, 0, NULL}
};
\end{verbatim}

Insert the above code just above the \cfunction{main()} function.
Also, insert the following two statements directly after
\cfunction{Py_Initialize()}:

\begin{verbatim}
    numargs = argc;
    Py_InitModule("emb", EmbMethods);
\end{verbatim}

These two lines initialize the \code{numargs} variable, and make the
\function{emb.numargs()} function accessible to the embedded Python
interpreter.  With these extensions, the Python script can do things
like

\begin{verbatim}
import emb
print "Number of arguments", emb.numargs()
\end{verbatim}

In a real application, the methods will expose an API of the
application to Python.
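
Because the \module{emb} module exists only inside the embedding
application, a script that imports it cannot be run by a plain
interpreter.  One possible way around this, shown here only as a
sketch, is to guard the import.  The function name
\code{argument\_count} and the fallback to \code{sys.argv} are
assumptions made for illustration, not part of the example
application.

```python
import sys

try:
    import emb  # only importable inside the embedding application
except ImportError:
    emb = None


def argument_count():
    """Return the host application's argument count, or a fallback."""
    if emb is not None:
        return emb.numargs()
    # Assumed fallback: outside the application, count this
    # interpreter's own command line arguments instead.
    return len(sys.argv)
```

With this guard the same script can be exercised on its own during
development and still use the application's value when embedded.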


%\section{For the future}
%
%You don't happen to have a nice library to get textual
%equivalents of numeric values do you :-) ?
%Callbacks here ? (I may be using information from that section
%?!)
%threads
%code examples do not really behave well if errors happen
% (what to watch out for)


\section{Embedding Python in \Cpp
     \label{embeddingInCplusplus}}

It is also possible to embed Python in a \Cpp{} program; precisely how this
is done will depend on the details of the \Cpp{} system used.  In general you
will need to write the main program in \Cpp, and use the \Cpp{} compiler
to compile and link your program.  There is no need to recompile Python
itself using \Cpp.


\section{Linking Requirements
         \label{link-reqs}}

While the \program{configure} script shipped with the Python sources
will correctly build Python to export the symbols needed by
dynamically linked extensions, this is not automatically inherited by
applications which embed the Python library statically, at least on
\UNIX.  This is an issue when the application is linked to the static
runtime library (\file{libpython.a}) and needs to load dynamic
extensions (implemented as \file{.so} files).

The problem is that some entry points are defined by the Python
runtime solely for extension modules to use.  If the embedding
application does not use any of these entry points, some linkers will
not include those entries in the symbol table of the finished
executable.  Some additional options are needed to inform the linker
not to remove these symbols.

Determining the right options to use for any given platform can be
quite difficult, but fortunately the Python configuration already has
those values.  To retrieve them from an installed Python interpreter,
start an interactive interpreter and have a short session like this:

\begin{verbatim}
>>> import distutils.sysconfig
>>> distutils.sysconfig.get_config_var('LINKFORSHARED')
'-Xlinker -export-dynamic'
\end{verbatim}
\refstmodindex{distutils.sysconfig}

The string returned contains the options that should
be used.  If the string is empty, there's no need to add any
additional options.  The \constant{LINKFORSHARED} definition
corresponds to the variable of the same name in Python's top-level
\file{Makefile}.
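
The interactive session can also be turned into a small script, for
example so that a build procedure can capture the options.  The
following is a minimal sketch.  The function name
\code{link\_for\_shared} is hypothetical, and trying the top-level
\module{sysconfig} module first is an assumption about newer
interpreters, with \module{distutils.sysconfig} as the fallback.

```python
try:
    import sysconfig  # assumption: present on newer interpreters
except ImportError:
    from distutils import sysconfig  # the module used in the session above


def link_for_shared():
    """Return the LINKFORSHARED options as a string, possibly empty."""
    value = sysconfig.get_config_var('LINKFORSHARED')
    # The variable can be undefined (None) on some platforms;
    # treat that the same as "no additional options".
    return value or ''


if __name__ == '__main__':
    print(link_for_shared())
```

An empty line of output means that no additional linker options are
needed on that platform.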