Petr Viktorin 2025-12-08 06:50:51 +01:00 committed by GitHub
commit cbcf5ad2d1
GPG key ID: B5690EEEBB952194
5 changed files with 787 additions and 384 deletions


@ -107,6 +107,46 @@ header files properly declare the entry points to be ``extern "C"``. As a result
there is no need to do anything special to use the API from C++.
.. _capi-system-includes:
System includes
---------------
:file:`Python.h` includes several standard header files.
C extensions should include the standard headers that they use,
and should not rely on these implicit includes.
The implicit includes are:
* ``<assert.h>``
* ``<intrin.h>`` (on Windows)
* ``<inttypes.h>``
* ``<limits.h>``
* ``<math.h>``
* ``<stdarg.h>``
* ``<wchar.h>``
* ``<sys/types.h>`` (if present)
The following are included for backwards compatibility, unless using
:ref:`Limited API <limited-c-api>` 3.13 or newer:
* ``<ctype.h>``
* ``<unistd.h>`` (on POSIX)
The following are included for backwards compatibility, unless using
:ref:`Limited API <limited-c-api>` 3.11 or newer:
* ``<errno.h>``
* ``<stdio.h>``
* ``<stdlib.h>``
* ``<string.h>``
.. note::
Since Python may define some pre-processor definitions which affect the standard
headers on some systems, you *must* include :file:`Python.h` before any standard
headers are included.
Useful macros
=============


@ -3,154 +3,20 @@
.. _extending-intro:
******************************
Extending Python with C or C++
******************************
********************************
Using the C API: Assorted topics
********************************
It is quite easy to add new built-in modules to Python, if you know how to
program in C. Such :dfn:`extension modules` can do two things that can't be
done directly in Python: they can implement new built-in object types, and they
can call C library functions and system calls.
To support extensions, the Python API (Application Programmers Interface)
defines a set of functions, macros and variables that provide access to most
aspects of the Python run-time system. The Python API is incorporated in a C
source file by including the header ``"Python.h"``.
The compilation of an extension module depends on its intended use as well as on
your system setup; details are given in later chapters.
.. note::
The C extension interface is specific to CPython, and extension modules do
not work on other Python implementations. In many cases, it is possible to
avoid writing C extensions and preserve portability to other implementations.
For example, if your use case is calling C library functions or system calls,
you should consider using the :mod:`ctypes` module or the `cffi
<https://cffi.readthedocs.io/>`_ library rather than writing
custom C code.
These modules let you write Python code to interface with C code and are more
portable between implementations of Python than writing and compiling a C
extension module.
.. _extending-simpleexample:
A Simple Example
================
Let's create an extension module called ``spam`` (the favorite food of Monty
Python fans...) and let's say we want to create a Python interface to the C
library function :c:func:`system` [#]_. This function takes a null-terminated
character string as argument and returns an integer. We want this function to
be callable from Python as follows:
.. code-block:: pycon
>>> import spam
>>> status = spam.system("ls -l")
Begin by creating a file :file:`spammodule.c`. (Historically, if a module is
called ``spam``, the C file containing its implementation is called
:file:`spammodule.c`; if the module name is very long, like ``spammify``, the
module name can be just :file:`spammify.c`.)
The first two lines of our file can be::
#define PY_SSIZE_T_CLEAN
#include <Python.h>
which pulls in the Python API (you can add a comment describing the purpose of
the module and a copyright notice if you like).
.. note::
Since Python may define some pre-processor definitions which affect the standard
headers on some systems, you *must* include :file:`Python.h` before any standard
headers are included.
``#define PY_SSIZE_T_CLEAN`` was used to indicate that ``Py_ssize_t`` should be
used in some APIs instead of ``int``.
It is not necessary since Python 3.13, but we keep it here for backward compatibility.
See :ref:`arg-parsing-string-and-buffers` for a description of this macro.
All user-visible symbols defined by :file:`Python.h` have a prefix of ``Py`` or
``PY``, except those defined in standard header files.
.. tip::
For backward compatibility, :file:`Python.h` includes several standard header files.
C extensions should include the standard headers that they use,
and should not rely on these implicit includes.
If using the limited C API version 3.13 or newer, the implicit includes are:
* ``<assert.h>``
* ``<intrin.h>`` (on Windows)
* ``<inttypes.h>``
* ``<limits.h>``
* ``<math.h>``
* ``<stdarg.h>``
* ``<wchar.h>``
* ``<sys/types.h>`` (if present)
If :c:macro:`Py_LIMITED_API` is not defined, or is set to version 3.12 or older,
the headers below are also included:
* ``<ctype.h>``
* ``<unistd.h>`` (on POSIX)
If :c:macro:`Py_LIMITED_API` is not defined, or is set to version 3.10 or older,
the headers below are also included:
* ``<errno.h>``
* ``<stdio.h>``
* ``<stdlib.h>``
* ``<string.h>``
The next thing we add to our module file is the C function that will be called
when the Python expression ``spam.system(string)`` is evaluated (we'll see
shortly how it ends up being called)::
static PyObject *
spam_system(PyObject *self, PyObject *args)
{
const char *command;
int sts;
if (!PyArg_ParseTuple(args, "s", &command))
return NULL;
sts = system(command);
return PyLong_FromLong(sts);
}
There is a straightforward translation from the argument list in Python (for
example, the single expression ``"ls -l"``) to the arguments passed to the C
function. The C function always has two arguments, conventionally named *self*
and *args*.
The *self* argument points to the module object for module-level functions;
for a method it would point to the object instance.
The *args* argument will be a pointer to a Python tuple object containing the
arguments. Each item of the tuple corresponds to an argument in the call's
argument list. The arguments are Python objects --- in order to do anything
with them in our C function we have to convert them to C values. The function
:c:func:`PyArg_ParseTuple` in the Python API checks the argument types and
converts them to C values. It uses a template string to determine the required
types of the arguments as well as the types of the C variables into which to
store the converted values. More about this later.
:c:func:`PyArg_ParseTuple` returns true (nonzero) if all arguments have the right
type and its components have been stored in the variables whose addresses are
passed. It returns false (zero) if an invalid argument list was passed. In the
latter case it also raises an appropriate exception so the calling function can
return ``NULL`` immediately (as we saw in the example).
The :ref:`tutorial <first-extension-module>` walked you through
creating a C API extension module, but left many areas unexplained.
This document looks at several concepts that you'll need to learn
in order to write more complex extensions.
.. _extending-errors:
Intermezzo: Errors and Exceptions
=================================
Errors and Exceptions
=====================
An important convention throughout the Python interpreter is the following: when
a function fails, it should set an exception condition and return an error value
@ -321,194 +187,14 @@ call to :c:func:`PyErr_SetString` as shown below::
}
.. _backtoexample:
Back to the Example
===================
Going back to our example function, you should now be able to understand this
statement::
if (!PyArg_ParseTuple(args, "s", &command))
return NULL;
It returns ``NULL`` (the error indicator for functions returning object pointers)
if an error is detected in the argument list, relying on the exception set by
:c:func:`PyArg_ParseTuple`. Otherwise the string value of the argument has been
copied to the local variable :c:data:`!command`. This is a pointer assignment and
you are not supposed to modify the string to which it points (so in Standard C,
the variable :c:data:`!command` should properly be declared as ``const char
*command``).
The next statement is a call to the Unix function :c:func:`system`, passing it
the string we just got from :c:func:`PyArg_ParseTuple`::
sts = system(command);
Our :func:`!spam.system` function must return the value of :c:data:`!sts` as a
Python object. This is done using the function :c:func:`PyLong_FromLong`. ::
return PyLong_FromLong(sts);
In this case, it will return an integer object. (Yes, even integers are objects
on the heap in Python!)
If you have a C function that returns no useful argument (a function returning
:c:expr:`void`), the corresponding Python function must return ``None``. You
need this idiom to do so (which is implemented by the :c:macro:`Py_RETURN_NONE`
macro)::
Py_INCREF(Py_None);
return Py_None;
:c:data:`Py_None` is the C name for the special Python object ``None``. It is a
genuine Python object rather than a ``NULL`` pointer, which means "error" in most
contexts, as we have seen.
.. _methodtable:
The Module's Method Table and Initialization Function
=====================================================
I promised to show how :c:func:`!spam_system` is called from Python programs.
First, we need to list its name and address in a "method table"::
static PyMethodDef spam_methods[] = {
...
{"system", spam_system, METH_VARARGS,
"Execute a shell command."},
...
{NULL, NULL, 0, NULL} /* Sentinel */
};
Note the third entry (``METH_VARARGS``). This is a flag telling the interpreter
the calling convention to be used for the C function. It should normally always
be ``METH_VARARGS`` or ``METH_VARARGS | METH_KEYWORDS``; a value of ``0`` means
that an obsolete variant of :c:func:`PyArg_ParseTuple` is used.
When using only ``METH_VARARGS``, the function should expect the Python-level
parameters to be passed in as a tuple acceptable for parsing via
:c:func:`PyArg_ParseTuple`; more information on this function is provided below.
The :c:macro:`METH_KEYWORDS` bit may be set in the third field if keyword
arguments should be passed to the function. In this case, the C function should
accept a third ``PyObject *`` parameter which will be a dictionary of keywords.
Use :c:func:`PyArg_ParseTupleAndKeywords` to parse the arguments to such a
function.
The method table must be referenced in the module definition structure::
static struct PyModuleDef spam_module = {
...
.m_methods = spam_methods,
...
};
This structure, in turn, must be passed to the interpreter in the module's
initialization function. The initialization function must be named
:c:func:`!PyInit_name`, where *name* is the name of the module, and should be the
only non-\ ``static`` item defined in the module file::
PyMODINIT_FUNC
PyInit_spam(void)
{
return PyModuleDef_Init(&spam_module);
}
Note that :c:macro:`PyMODINIT_FUNC` declares the function as ``PyObject *`` return type,
declares any special linkage declarations required by the platform, and for C++
declares the function as ``extern "C"``.
:c:func:`!PyInit_spam` is called when each interpreter imports its module
:mod:`!spam` for the first time. (See below for comments about embedding Python.)
A pointer to the module definition must be returned via :c:func:`PyModuleDef_Init`,
so that the import machinery can create the module and store it in ``sys.modules``.
When embedding Python, the :c:func:`!PyInit_spam` function is not called
automatically unless there's an entry in the :c:data:`PyImport_Inittab` table.
To add the module to the initialization table, use :c:func:`PyImport_AppendInittab`,
optionally followed by an import of the module::
#define PY_SSIZE_T_CLEAN
#include <Python.h>
int
main(int argc, char *argv[])
{
PyStatus status;
PyConfig config;
PyConfig_InitPythonConfig(&config);
/* Add a built-in module, before Py_Initialize */
if (PyImport_AppendInittab("spam", PyInit_spam) == -1) {
fprintf(stderr, "Error: could not extend in-built modules table\n");
exit(1);
}
/* Pass argv[0] to the Python interpreter */
status = PyConfig_SetBytesString(&config, &config.program_name, argv[0]);
if (PyStatus_Exception(status)) {
goto exception;
}
/* Initialize the Python interpreter. Required.
If this step fails, it will be a fatal error. */
status = Py_InitializeFromConfig(&config);
if (PyStatus_Exception(status)) {
goto exception;
}
PyConfig_Clear(&config);
/* Optionally import the module; alternatively,
import can be deferred until the embedded script
imports it. */
PyObject *pmodule = PyImport_ImportModule("spam");
if (!pmodule) {
PyErr_Print();
fprintf(stderr, "Error: could not import module 'spam'\n");
}
// ... use Python C API here ...
return 0;
exception:
PyConfig_Clear(&config);
Py_ExitStatusException(status);
}
.. note::
If you declare a global variable or a local static one, the module may
experience unintended side-effects on re-initialisation, for example when
removing entries from ``sys.modules`` or importing compiled modules into
multiple interpreters within a process
(or following a :c:func:`fork` without an intervening :c:func:`exec`).
If module state is not yet fully :ref:`isolated <isolating-extensions-howto>`,
authors should consider marking the module as having no support for subinterpreters
(via :c:macro:`Py_MOD_MULTIPLE_INTERPRETERS_NOT_SUPPORTED`).
A more substantial example module is included in the Python source distribution
as :file:`Modules/xxlimited.c`. This file may be used as a template or simply
read as an example.
.. _compilation:
Compilation and Linkage
=======================
Embedding an extension
======================
There are two more things to do before you can use your new extension: compiling
and linking it with the Python system. If you use dynamic loading, the details
may depend on the style of dynamic loading your system uses; see the chapters
about building extension modules (chapter :ref:`building`) and additional
information that pertains only to building on Windows (chapter
:ref:`building-on-windows`) for more information about this.
If you can't use dynamic loading, or if you want to make your module a permanent
If you want to make your module a permanent
part of the Python interpreter, you will have to change the configuration setup
and rebuild the interpreter. Luckily, this is very simple on Unix: just place
and rebuild the interpreter. On Unix, place
your file (:file:`spammodule.c` for example) in the :file:`Modules/` directory
of an unpacked source distribution, add a line to the file
:file:`Modules/Setup.local` describing your file:
@ -536,7 +222,7 @@ on the line in the configuration file as well, for instance:
Calling Python Functions from C
===============================
So far we have concentrated on making C functions callable from Python. The
The tutorial concentrated on making C functions callable from Python. The
reverse is also useful: calling Python functions from C. This is especially the
case for libraries that support so-called "callback" functions. If a C
interface makes use of callbacks, the equivalent Python often needs to provide a
@ -581,7 +267,7 @@ be part of a module definition::
}
This function must be registered with the interpreter using the
:c:macro:`METH_VARARGS` flag; this is described in section :ref:`methodtable`. The
:c:macro:`METH_VARARGS` flag in :c:type:`PyMethodDef.ml_flags`. The
:c:func:`PyArg_ParseTuple` function and its arguments are documented in section
:ref:`parsetuple`.
@ -676,14 +362,21 @@ the above example, we use :c:func:`Py_BuildValue` to construct the dictionary. :
Py_DECREF(result);
.. index:: single: PyArg_ParseTuple (C function)
.. _parsetuple:
Extracting Parameters in Extension Functions
============================================
.. index:: single: PyArg_ParseTuple (C function)
The :ref:`tutorial <first-extension-module>` uses a ":c:data:`METH_O`"
function, which is limited to a single Python argument.
If you want more, you can use :c:data:`METH_VARARGS` instead.
With this flag, the C function will receive a *tuple* of arguments
instead of a single object.
The :c:func:`PyArg_ParseTuple` function is declared as follows::
For unpacking the tuple, CPython provides the :c:func:`PyArg_ParseTuple`
function, declared as follows::
int PyArg_ParseTuple(PyObject *arg, const char *format, ...);
@ -693,6 +386,19 @@ whose syntax is explained in :ref:`arg-parsing` in the Python/C API Reference
Manual. The remaining arguments must be addresses of variables whose type is
determined by the format string.
For example, to receive a single Python :py:class:`str` object and turn it
into a C buffer, you would use ``"s"`` as the format string::
const char *command;
if (!PyArg_ParseTuple(args, "s", &command)) {
return NULL;
}
If an error is detected in the argument list, :c:func:`!PyArg_ParseTuple`
returns false (zero) and sets an exception;
your function can then return ``NULL`` (the error indicator for functions
returning object pointers), relying on the exception set by
:c:func:`!PyArg_ParseTuple`.
Note that while :c:func:`PyArg_ParseTuple` checks that the Python arguments have
the required types, it cannot check the validity of the addresses of C variables
passed to the call: if you make mistakes there, your code will probably crash or
@ -703,7 +409,6 @@ Note that any Python object references which are provided to the caller are
Some example calls::
#define PY_SSIZE_T_CLEAN
#include <Python.h>
::
@ -773,6 +478,17 @@ Some example calls::
Keyword Parameters for Extension Functions
==========================================
If you also want your function to accept *keyword* arguments,
use the :c:data:`METH_KEYWORDS` flag in combination with
:c:data:`METH_VARARGS`.
(It can also be used with other flags; see its documentation for the allowed
combinations.)
In this case, the C function should accept a third ``PyObject *`` parameter
which will be a dictionary of keywords.
Use :c:func:`PyArg_ParseTupleAndKeywords` to parse the arguments to such a
function.
.. index:: single: PyArg_ParseTupleAndKeywords (C function)
The :c:func:`PyArg_ParseTupleAndKeywords` function is declared as follows::
@ -833,19 +549,6 @@ Philbrick (philbrick@hks.com)::
{NULL, NULL, 0, NULL} /* sentinel */
};
static struct PyModuleDef keywdarg_module = {
.m_base = PyModuleDef_HEAD_INIT,
.m_name = "keywdarg",
.m_size = 0,
.m_methods = keywdarg_methods,
};
PyMODINIT_FUNC
PyInit_keywdarg(void)
{
return PyModuleDef_Init(&keywdarg_module);
}
.. _buildvalue:
@ -986,11 +689,11 @@ needed. Ownership of a reference can be transferred. There are three ways to
dispose of an owned reference: pass it on, store it, or call :c:func:`Py_DECREF`.
Forgetting to dispose of an owned reference creates a memory leak.
It is also possible to :dfn:`borrow` [#]_ a reference to an object. The
It is also possible to :dfn:`borrow` [#borrow]_ a reference to an object. The
borrower of a reference should not call :c:func:`Py_DECREF`. The borrower must
not hold on to the object longer than the owner from which it was borrowed.
Using a borrowed reference after the owner has disposed of it risks using freed
memory and should be avoided completely [#]_.
memory and should be avoided completely [#dont-check-refcount]_.
The advantage of borrowing over owning a reference is that you don't need to
take care of disposing of the reference on all possible paths through the code
@ -1169,7 +872,7 @@ checking.
The C function calling mechanism guarantees that the argument list passed to C
functions (``args`` in the examples) is never ``NULL`` --- in fact it guarantees
that it is always a tuple [#]_.
that it is always a tuple [#old-calling-convention]_.
It is a severe error to ever let a ``NULL`` pointer "escape" to the Python user.
@ -1226,8 +929,8 @@ the module whose functions one wishes to call might not have been loaded yet!
Portability therefore requires not to make any assumptions about symbol
visibility. This means that all symbols in extension modules should be declared
``static``, except for the module's initialization function, in order to
avoid name clashes with other extension modules (as discussed in section
:ref:`methodtable`). And it means that symbols that *should* be accessible from
avoid name clashes with other extension modules. And it means that symbols
that *should* be accessible from
other extension modules must be exported in a different way.
Python provides a special mechanism to pass C-level information (pointers) from
@ -1269,8 +972,9 @@ file corresponding to the module provides a macro that takes care of importing
the module and retrieving its C API pointers; client modules only have to call
this macro before accessing the C API.
The exporting module is a modification of the :mod:`!spam` module from section
:ref:`extending-simpleexample`. The function :func:`!spam.system` does not call
The exporting module is a modification of the :mod:`!spam` module from the
:ref:`tutorial <first-extension-module>`.
The function :func:`!spam.system` does not call
the C library function :c:func:`system` directly, but a function
:c:func:`!PySpam_System`, which would of course do something more complicated in
reality (such as adding "spam" to every command). This function
@ -1412,15 +1116,14 @@ code distribution).
.. rubric:: Footnotes
.. [#] An interface for this function already exists in the standard module :mod:`os`
--- it was chosen as a simple and straightforward example.
.. [#borrow] The metaphor of "borrowing" a reference is not completely correct:
the owner still has a copy of the reference.
.. [#] The metaphor of "borrowing" a reference is not completely correct: the owner
still has a copy of the reference.
.. [#] Checking that the reference count is at least 1 **does not work** --- the
.. [#dont-check-refcount] Checking that the reference count is at least 1
**does not work** --- the
reference count itself could be in freed memory and may thus be reused for
another object!
.. [#] These guarantees don't hold when you use the "old" style calling convention ---
.. [#old-calling-convention] These guarantees don't hold when you use the
"old" style calling convention ---
this is still found in much existing code.


@ -0,0 +1,569 @@
.. highlight:: c
.. _extending-simpleexample:
.. _first-extension-module:
*********************************
Your first C API extension module
*********************************
This tutorial will take you through creating a simple
Python extension module written in C or C++.
It assumes basic knowledge about Python: you should be able to
define functions in Python code before starting to write them in C.
See :ref:`tutorial-index` for an introduction to Python itself.
The tutorial should be useful for anyone who can write a basic C library.
While we will mention several concepts that a C beginner would not be expected
to know, like ``static`` functions or linkage declarations, understanding these
is not necessary for success.
We will focus on giving you a "feel" for what Python's C API is like.
The tutorial will not teach you important concepts, like error handling
and reference counting, which are covered in later chapters.
As you write the code, you will need to compile it.
Prepare to spend some time choosing, installing and configuring a build tool,
since CPython itself does not include one.
We will assume that you use a Unix-like system (including macOS and
Linux), or Windows.
On other systems, you might need to adjust some details -- for example,
a system command name.
.. note::
This tutorial uses APIs that were added in CPython 3.15.
To create an extension that's compatible with earlier versions of CPython,
please follow an earlier version of this documentation.
This tutorial uses some syntax added in C11 and C++20.
If your extension needs to be compatible with earlier standards,
please follow tutorials in documentation for Python 3.14 or below.
What we'll do
=============
Let's create an extension module called ``spam`` [#why-spam]_,
which will include a Python interface to the C
standard library function :c:func:`system`.
This function is defined in ``stdlib.h``.
It takes a C string as argument, runs the argument as a system
command, and returns a result value as an integer.
A manual page for ``system`` might summarize it this way::
#include <stdlib.h>
int system(const char *command);
Note that like many functions in the C standard library,
this function is already exposed in Python.
In production, use :py:func:`os.system` or :py:func:`subprocess.run`
rather than the module you'll write here.
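Before involving Python at all, it can help to see ``system`` used from plain C. The sketch below is only an illustration; ``run_command`` is a hypothetical helper name, not something the tutorial defines:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper, for illustration only: run a command through
   the C library and return the raw status reported by system(). */
static int
run_command(const char *command)
{
    /* system(NULL) has a different meaning (it asks whether a command
       processor exists), so this sketch requires a real command. */
    assert(command != NULL);
    return system(command);
}
```

The meaning of the returned integer is implementation-defined. On POSIX systems it is a wait status, which ``WEXITSTATUS`` from ``<sys/wait.h>`` can decode into the exit code of the command.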
We want this function to be callable from Python as follows:
.. code-block:: pycon
>>> import spam
>>> status = spam.system("whoami")
User Name
>>> status
0
.. note::
The system command ``whoami`` prints out your username.
It's useful in tutorials like this one because it has the same name on
both Unix and Windows.
Warming up your build tool
==========================
Begin by creating a file named :file:`spammodule.c`. [#why-spammodule]_
Now, while the file is empty, we'll compile it.
Choose a build tool such as Setuptools or Meson, and follow its instructions
to compile and install the empty :file:`spammodule.c` as a C extension module.
This will ensure that your build tool works, so that you can make
and test incremental changes as you follow the rest of the text.
.. note:: Workaround for missing ``PyInit``
If your build tool output complains about missing ``PyInit_spam``,
add the following function to your module for now:
.. code-block:: c
// A workaround
void *PyInit_spam(void) { return NULL; }
This is a shim for an old-style :ref:`initialization function <extension-export-hook>`,
which was required in extension modules for CPython 3.14 and below.
Current CPython will not call it, but some build tools may still assume that
all extension modules need to define it.
If you use this workaround, you will get the exception
``SystemError: initialization of spam failed without raising an exception``
instead of an :py:exc:`ImportError` in the next step.
.. note::
Using a third-party build tool is strongly recommended, as it will take
care of various details of your platform and Python installation,
of naming the resulting extension, and, later, of distributing your work.
If you don't want to use a tool, you can try to run your compiler directly.
The following command should work for many flavors of Linux, and generate
a ``spam.so`` file that you need to put in a directory
in :py:attr:`sys.path`:
.. code-block:: sh
gcc --shared spammodule.c -o spam.so
When your extension is compiled and installed, start Python and try to import
your extension.
This should fail with the following exception:
.. code-block:: pycon
>>> import spam
Traceback (most recent call last):
...
ImportError: dynamic module does not define module export function (PyModExport_spam or PyInit_spam)
Including the Header
====================
Now, add the first line to your file: include :file:`Python.h` to pull in
all declarations of the Python C API:
.. literalinclude:: ../includes/capi-extension/spammodule-01.c
:start-at: <Python.h>
:end-at: <Python.h>
Next, include the header for the :c:func:`system` function:
.. literalinclude:: ../includes/capi-extension/spammodule-01.c
:start-at: <stdlib.h>
:end-at: <stdlib.h>
Be sure to put this, and any other standard library includes, *after*
:file:`Python.h`.
On some systems, Python may define some pre-processor definitions
that affect the standard headers.
.. tip::
The ``<stdlib.h>`` include is technically not necessary.
:file:`Python.h` :ref:`includes several standard header files <capi-system-includes>`
for its own use and for backwards compatibility,
and ``<stdlib.h>`` is one of them.
However, it is good practice to explicitly include what you need.
With the includes in place, compile and import the extension again.
You should get the same exception as with the empty file.
.. note::
Third-party build tools should handle pointing the compiler to
the CPython headers and libraries, and setting appropriate options.
If you are running the compiler directly, you will need to do this yourself.
If your installation of Python comes with a corresponding ``python-config``
command, you can run something like:
.. code-block:: shell
gcc --shared $(python-config --cflags --ldflags) spammodule.c -o spam.so
Module export hook
==================
The exception you got when you tried to import the module told you that Python
is looking for a "module export function", also known as a
:ref:`module export hook <extension-export-hook>`.
Let's define one.
First, add a prototype below the ``#include`` lines:
.. literalinclude:: ../includes/capi-extension/spammodule-01.c
:start-after: /// Export hook prototype
:end-before: ///
.. tip::
The prototype is not strictly necessary, but some modern compilers emit
warnings without it.
It's generally better to add the prototype than to disable the warning.
The :c:macro:`PyMODEXPORT_FUNC` macro declares the function's
return type, and adds any special linkage declarations needed
to make the function visible and usable when CPython loads it.
After the prototype, add the function itself.
For now, make it return ``NULL``:
.. code-block:: c
PyMODEXPORT_FUNC
PyModExport_spam(void)
{
return NULL;
}
Compile and load the module again.
You should get a different error this time.
.. code-block:: pycon
>>> import spam
Traceback (most recent call last):
...
SystemError: module export hook for module 'spam' failed without setting an exception
Simply returning ``NULL`` is *not* correct behavior for an export hook,
and CPython complains about it.
That's good -- it means that CPython found the function!
Let's now make it do something useful.
The slot table
==============
Rather than ``NULL``, the export hook should return the information needed to
create a module.
Let's start with the basics: the name and docstring.
The information should be defined in a ``static`` array of
:c:type:`PyModuleDef_Slot` entries, which are essentially key-value pairs.
Define this array just before your export hook:
.. code-block:: c
static PyModuleDef_Slot spam_slots[] = {
{Py_mod_name, "spam"},
{Py_mod_doc, "A wonderful module with an example function"},
{0, NULL}
};
For both name and docstring, the values are C strings -- that is,
NUL-terminated UTF-8 encoded byte arrays.
Note the zero-filled sentinel entry at the end.
If you forget it, you'll trigger undefined behavior.
The array is defined as ``static`` -- not visible outside this ``.c`` file.
This will be a common theme.
CPython only needs to access the export hook; all global variables
and all other functions should generally be ``static``, so that they don't
clash with other extensions.
Return this array from your export hook instead of ``NULL``:
.. code-block:: c
:emphasize-lines: 4
PyMODEXPORT_FUNC
PyModExport_spam(void)
{
return spam_slots;
}
Now, recompile and try it out:
.. code-block:: pycon
>>> import spam
>>> print(spam)
<module 'spam' from '/home/encukou/dev/cpython/spam.so'>
You have an extension module!
Try ``help(spam)`` to see the docstring.
The next step will be adding a function.
.. _backtoexample:
Exposing a function
===================
To expose the ``system`` C function directly to Python,
we'll need to write a layer of glue code to convert arguments from Python
objects to C values, and the C return value back to Python.
One of the simplest kinds of glue code is a ":c:data:`METH_O`" function,
which takes two Python objects and returns one.
All Python objects -- regardless of the Python type -- are represented in C
as pointers to the ``PyObject`` structure.
Add such a function above the slots array::
static PyObject *
spam_system(PyObject *self, PyObject *arg)
{
Py_RETURN_NONE;
}
For now, we'll ignore the arguments, and use the :c:macro:`Py_RETURN_NONE`
macro to properly ``return`` a Python :py:data:`None` object.
Recompile your extension to make sure you don't have syntax errors.
We haven't yet added ``spam_system`` to the module, so you might get a
warning that ``spam_system`` is unused.
.. _methodtable:
Method definitions
------------------
To expose the C function to Python, you will need to provide several pieces of
information in a structure called
:c:type:`PyMethodDef` [#why-pymethoddef]_:
* ``ml_name``: the name of the Python function;
* ``ml_doc``: a docstring;
* ``ml_meth``: the C function to be called; and
* ``ml_flags``: a set of flags describing details like how Python arguments are
passed to the C function.
We'll use :c:data:`METH_O` here -- the flag that matches our
``spam_system`` function's signature.
Because modules typically create several functions, these definitions
need to be collected in an array, with a zero-filled sentinel at the end.
Add this array just below the ``spam_system`` function:
.. literalinclude:: ../includes/capi-extension/spammodule-01.c
:start-after: /// Module method table
:end-before: ///
As with module slots, a zero-filled sentinel marks the end of the array.
Next, we'll add the method to the module.
Add a :c:data:`Py_mod_methods` slot to your :c:type:`PyModuleDef_Slot` array:
.. literalinclude:: ../includes/capi-extension/spammodule-01.c
:start-after: /// Module slot table
:end-before: ///
:emphasize-lines: 6
Recompile your extension again, and test it.
You should now be able to call the function, and get ``None`` back:
.. code-block:: pycon
>>> import spam
>>> print(spam.system)
<built-in function system>
>>> print(spam.system('whoami'))
None
Returning an integer
====================
Now, let's take a look at the return value.
Instead of ``None``, we'll want ``spam.system`` to return a number -- that is,
a Python :py:type:`int` object.
Eventually this will be the exit code of a system command,
but let's start with a fixed value, say, ``3``.
The Python C API provides a function to create a Python :py:type:`int` object
from a C integer value: :c:func:`PyLong_FromLong`. [#why-pylongfromlong]_
To call it, replace the ``Py_RETURN_NONE`` with the following 3 lines:
.. this could be a one-liner, but we want to show the data types here.
.. code-block:: c
:emphasize-lines: 4-6
static PyObject *
spam_system(PyObject *self, PyObject *arg)
{
int status = 3;
PyObject *result = PyLong_FromLong(status);
return result;
}
Recompile and run again, and check that the function now returns 3:
.. code-block:: pycon
>>> import spam
>>> spam.system('whoami')
3
Accepting a string
==================
Finally, let's handle the function argument.
Our C function, :c:func:`!spam_system`, takes two arguments.
The first one, ``PyObject *self``, will be set to the ``spam`` module
object.
This isn't useful in our case, so we'll ignore it.
The other one, ``PyObject *arg``, will be set to the object that the user
called the Python function with.
We expect that it should be a Python string.
In order to use the information in it, we will need
to convert it to a C value --- in this case, a C string (``const char *``).
There's a slight type mismatch here: Python's :py:class:`str` objects store
Unicode text, but C strings are arrays of bytes.
So, we'll need to *encode* the data, and we'll use the UTF-8 encoding for it.
(UTF-8 might not always be correct for system commands, but it's what
:py:meth:`str.encode` uses by default,
and the C API has special support for it.)
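To make the difference between text and bytes concrete, here is a rough sketch
of what encoding a single Unicode code point as UTF-8 involves.
The ``utf8_encode`` helper is hypothetical and for illustration only;
in an extension, you would let the C API do the encoding.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only.
 * Encode one code point as UTF-8 into `out` (room for 4 bytes).
 * Return the number of bytes written, or 0 if the code point
 * cannot be encoded (a surrogate, or beyond U+10FFFF). */
static size_t
utf8_encode(uint32_t cp, unsigned char *out)
{
    if (cp < 0x80) {
        /* ASCII: one byte, unchanged */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF) {
        /* surrogates are not valid in UTF-8 */
        return 0;
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

For example, U+00E9 (``é``) is a single code point, but its UTF-8 form is the two bytes ``0xC3 0xA9``.
This is why the length of a Python string and the length of its encoded buffer can differ.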
The function to encode a Python string into a UTF-8 buffer is named
:c:func:`PyUnicode_AsUTF8` [#why-pyunicodeasutf8]_.
Call it like this:
.. code-block:: c
:emphasize-lines: 4
static PyObject *
spam_system(PyObject *self, PyObject *arg)
{
const char *command = PyUnicode_AsUTF8(arg);
int status = 3;
PyObject *result = PyLong_FromLong(status);
return result;
}
If :c:func:`PyUnicode_AsUTF8` is successful, *command* will point to the
resulting array of bytes.
This buffer is managed by the *arg* object, which means we don't need to free
it, but we must follow some rules:
* We should only use the buffer inside the ``spam_system`` function.
When ``spam_system`` returns, *arg* and the buffer it manages might be
garbage-collected.
* We must not modify it. This is why we use ``const``.
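If the command string ever needed to outlive the call, these rules imply that
it would have to be copied first.
A minimal sketch in plain C, assuming a hypothetical ``copy_command`` helper
that is not part of this tutorial's module:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: duplicate a borrowed, NUL-terminated buffer
 * so that the copy can be kept after the owner is gone.
 * Returns NULL if memory allocation fails.
 * The caller owns the copy and must release it with free(). */
static char *
copy_command(const char *borrowed)
{
    size_t size = strlen(borrowed) + 1;   /* include the NUL terminator */
    char *copy = malloc(size);
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, borrowed, size);
    return copy;
}
```

The tutorial's ``spam_system`` does not need such a copy, because it only uses the buffer before it returns.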
If :c:func:`PyUnicode_AsUTF8` was *not* successful, it returns a ``NULL``
pointer.
When calling *any* Python C API, we always need to handle such error cases.
The way to do this in general is left for later chapters of this documentation.
For now, be assured that we are already handling errors from
:c:func:`PyLong_FromLong` correctly.
For the :c:func:`PyUnicode_AsUTF8` call, the correct way to handle errors is
returning ``NULL`` from ``spam_system``.
Add an ``if`` block for this:
.. code-block:: c
:emphasize-lines: 5-7
static PyObject *
spam_system(PyObject *self, PyObject *arg)
{
const char *command = PyUnicode_AsUTF8(arg);
if (command == NULL) {
return NULL;
}
int status = 3;
PyObject *result = PyLong_FromLong(status);
return result;
}
That's it for the setup.
Now, all that is left is calling the C library function ``system`` with
the ``char *`` buffer, and using its result instead of the ``3``:
.. code-block:: c
:emphasize-lines: 8
static PyObject *
spam_system(PyObject *self, PyObject *arg)
{
const char *command = PyUnicode_AsUTF8(arg);
if (command == NULL) {
return NULL;
}
int status = system(command);
PyObject *result = PyLong_FromLong(status);
return result;
}
Compile your module, and test:
.. code-block:: pycon
>>> import spam
>>> result = spam.system('whoami')
User Name
>>> result
0
You might also want to test error cases:
.. code-block:: pycon
>>> import spam
>>> result = spam.system('nonexistent-command')
sh: line 1: nonexistent-command: command not found
>>> result
32512
>>> spam.system(3)
Traceback (most recent call last):
...
TypeError: bad argument type for built-in operation
>>> print(spam.system('too', 'many', 'arguments'))
Traceback (most recent call last):
...
TypeError: spam.system() takes exactly one argument (3 given)
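The ``32512`` in the session above is not the command's exit code itself.
On POSIX systems, ``system`` returns a *wait status* that packs the exit code
together with other information. On Linux, 32512 is 127 shifted left by 8 bits,
and 127 is the shell's conventional code for "command not found".
The following POSIX-only sketch shows how the exit code could be extracted
with the standard ``WIFEXITED`` and ``WEXITSTATUS`` macros;
the ``exit_code_from_status`` helper is hypothetical and not part of the tutorial's module.

```c
#include <sys/wait.h>   /* POSIX only: WIFEXITED, WEXITSTATUS */

/* Hypothetical helper: turn the value returned by system() into
 * the command's exit code.
 * Returns -1 if system() itself failed, or if the command did not
 * exit normally (for example, because it was killed by a signal). */
static int
exit_code_from_status(int status)
{
    if (status == -1) {
        return -1;
    }
    if (WIFEXITED(status)) {
        return WEXITSTATUS(status);
    }
    return -1;
}
```

On Windows, ``system`` returns the command interpreter's return value directly, so no such decoding applies there.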
The result
==========
Congratulations!
You have written a complete Python C API extension module,
and completed this tutorial!
Here is the entire source file, for your convenience:
.. _extending-spammodule-source:
.. literalinclude:: ../includes/capi-extension/spammodule-01.c
:start-at: ///
.. rubric:: Footnotes
.. [#why-spam] ``spam`` is the favorite food of Monty Python fans...
.. [#why-spammodule] The source file name is entirely up to you,
though some tools can be picky about the ``.c`` extension.
This tutorial uses the traditional ``*module.c`` suffix.
Some people would just use :file:`spam.c` to implement a module
named ``spam``,
projects where Python isn't the primary language might use ``py_spam.c``,
and so on.
.. [#why-pymethoddef] The :c:type:`!PyMethodDef` structure is also used
to create methods of classes, so there's no separate
":c:type:`!PyFunctionDef`".
.. [#why-pylongfromlong] The name :c:func:`PyLong_FromLong`
might not seem obvious.
``PyLong`` refers to the Python :py:class:`int`, which was originally
called ``long``; the ``FromLong`` refers to the C ``long`` (or ``long int``)
type.
.. [#why-pyunicodeasutf8] Here, ``PyUnicode`` refers to the original name of
the Python :py:class:`str` class: ``unicode``.


@ -5,15 +5,17 @@
##################################################
This document describes how to write modules in C or C++ to extend the Python
interpreter with new modules. Those modules can not only define new functions
but also new object types and their methods. The document also describes how
interpreter with new modules. Those modules can do what Python code does --
define functions, object types and methods -- but also interact with native
libraries or achieve better performance by avoiding the overhead of an
interpreter. The document also describes how
to embed the Python interpreter in another application, for use as an extension
language. Finally, it shows how to compile and link extension modules so that
they can be loaded dynamically (at run time) into the interpreter, if the
underlying operating system supports this feature.
This document assumes basic knowledge about Python. For an informal
introduction to the language, see :ref:`tutorial-index`. :ref:`reference-index`
This document assumes basic knowledge about C and Python. For an informal
introduction to Python, see :ref:`tutorial-index`. :ref:`reference-index`
gives a more formal definition of the language. :ref:`library-index` documents
the existing object types, functions and modules (both built-in and written in
Python) that give the language its wide application range.
@ -21,37 +23,75 @@ Python) that give the language its wide application range.
For a detailed description of the whole Python/C API, see the separate
:ref:`c-api-index`.
To support extensions, Python's C API (Application Programmers Interface)
defines a set of functions, macros and variables that provide access to most
aspects of the Python run-time system. The Python API is incorporated in a C
source file by including the header ``"Python.h"``.
.. note::
The C extension interface is specific to CPython, and extension modules do
not work on other Python implementations. In many cases, it is possible to
avoid writing C extensions and preserve portability to other implementations.
For example, if your use case is calling C library functions or system calls,
you should consider using the :mod:`ctypes` module or the `cffi
<https://cffi.readthedocs.io/>`_ library rather than writing
custom C code.
These modules let you write Python code to interface with C code and are more
portable between implementations of Python than writing and compiling a C
extension module.
.. toctree::
:hidden:
first-extension-module.rst
extending.rst
newtypes_tutorial.rst
newtypes.rst
building.rst
windows.rst
embedding.rst
Recommended third party tools
=============================
This guide only covers the basic tools for creating extensions provided
This document only covers the basic tools for creating extensions provided
as part of this version of CPython. Some :ref:`third party tools
<c-api-tools>` offer both simpler and more sophisticated approaches to creating
C and C++ extensions for Python.
While this document is aimed at extension authors, it should also be helpful to
the authors of such tools.
For example, the tutorial module can serve as a simple test case for a build
tool or sample expected output of a code generator.
Creating extensions without third party tools
=============================================
C API Tutorial
==============
This tutorial describes how to write a simple module in C or C++,
using the Python C API -- that is, using the basic tools provided
as part of this version of CPython.
#. :ref:`first-extension-module`
Guides for intermediate topics
==============================
This section of the guide covers creating C and C++ extensions without
assistance from third party tools. It is intended primarily for creators
of those tools, rather than being a recommended way to create your own
C extensions.
.. seealso::
:pep:`489` -- Multi-phase extension module initialization
.. toctree::
:maxdepth: 2
:numbered:
extending.rst
newtypes_tutorial.rst
newtypes.rst
building.rst
windows.rst
* :ref:`extending-intro`
* :ref:`defining-new-types`
* :ref:`new-types-topics`
* :ref:`building`
* :ref:`building-on-windows`
Embedding the CPython runtime in a larger application
=====================================================
@ -61,8 +101,4 @@ interpreter as the main application, it is desirable to instead embed
the CPython runtime inside a larger application. This section covers
some of the details involved in doing that successfully.
.. toctree::
:maxdepth: 2
:numbered:
embedding.rst
* :ref:`embedding`


@ -0,0 +1,55 @@
/* This file needs to be kept in sync with the tutorial
* at Doc/extending/first-extension-module.rst
*/
/// Includes
#include <Python.h>
#include <stdlib.h>
/// Implementation of spam.system
static PyObject *
spam_system(PyObject *self, PyObject *arg)
{
const char *command = PyUnicode_AsUTF8(arg);
if (command == NULL) {
return NULL;
}
int status = system(command);
PyObject *result = PyLong_FromLong(status);
return result;
}
/// Module method table
static PyMethodDef spam_methods[] = {
{
.ml_name="system",
.ml_meth=spam_system,
.ml_flags=METH_O,
.ml_doc="Execute a shell command.",
},
{NULL, NULL, 0, NULL} /* Sentinel */
};
/// Module slot table
static PyModuleDef_Slot spam_slots[] = {
{Py_mod_name, "spam"},
{Py_mod_doc, "A wonderful module with an example function"},
{Py_mod_methods, spam_methods},
{0, NULL}
};
/// Export hook prototype
PyMODEXPORT_FUNC PyModExport_spam(void);
/// Module export hook
PyMODEXPORT_FUNC
PyModExport_spam(void)
{
return spam_slots;
}