Because Python itself is coded in C today, compiled Python extensions can
be coded in any language that is C compatible in terms of call stacks and
linking. That includes C, but also C++ with appropriate extern "C"
declarations (which are automatically provided in Python header files).
Regardless of the implementation language, compiled Python extensions
can take two forms:
C modules
Libraries of tools that look and feel like Python module files
to their clients
C types
Multiple-instance objects that behave like standard built-in
types and classes
Generally, C extension modules are used to implement flat function
libraries, and they wind up appearing as importable modules to Python code
(hence their name). C extension types are used to code objects that
generate multiple instances, carry per-instance state information, and may
optionally support expression operators just like Python classes. C
extension types can do anything that built-in types and Python-coded
classes can: method calls, addition, indexing, slicing, and so on.
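For comparison, the following minimal sketch shows a Python-coded class that supports the operations just listed: method calls, addition, indexing, and slicing. The class name Stack and its methods are hypothetical, chosen only for illustration; a C extension type would offer the same behavior to its clients by filling in C-level slots instead of coding `__xxx__` methods.

```python
class Stack:
    """Hypothetical Python-coded class; a C type could expose the same interface."""

    def __init__(self, items=()):
        self.items = list(items)          # per-instance state

    def push(self, item):                 # ordinary method call
        self.items.append(item)

    def __add__(self, other):             # instance + instance
        return Stack(self.items + other.items)

    def __getitem__(self, index):         # indexing and slicing
        result = self.items[index]
        return Stack(result) if isinstance(index, slice) else result

    def __len__(self):                    # len(instance)
        return len(self.items)


a = Stack([1, 2])
a.push(3)
b = a + Stack([4])
print(len(b), b[0], b[1:3].items)
```

Client code cannot tell from these expressions alone whether Stack was written in Python or C, which is the point of the extension type model.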
To make the interface work, both C modules and types must provide a
layer of “glue” code that translates calls and data between the two
languages. This layer registers C-coded operations with the Python
interpreter as C function pointers. In all cases, the C layer is
responsible for converting arguments passed from Python to C form and for
converting results from C to Python form. Python scripts simply import C
extensions and use them as though they were really coded in Python.
Because the C code does all the translation work, the interface appears
seamless and simple in Python scripts.
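To make the glue layer's role concrete before we look at real C code, here is a rough Python simulation of its three steps: convert the arguments, call the underlying operation, and convert the result. The names core_message and glue_message are hypothetical, and bytes objects stand in for C strings here; actual glue code is written in C against the Python API.

```python
def core_message(label: bytes) -> bytes:
    """Stand-in for a C function: works only on raw bytes, like a char*."""
    return b"Hello, " + label


def glue_message(*args):
    """Hypothetical glue layer: validate, convert in, call, convert out."""
    if len(args) != 1 or not isinstance(args[0], str):
        raise TypeError("message() expects exactly one str argument")
    c_form = args[0].encode("utf-8")      # Python -> "C" form
    c_result = core_message(c_form)       # call the underlying operation
    return c_result.decode("utf-8")       # "C" -> Python form


print(glue_message("world"))
```

Callers of glue_message see only Python strings going in and coming out; the conversions are hidden inside the wrapper, just as they are in a C extension.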
C modules and types are also responsible for communicating errors
back to Python, detecting errors raised by Python API calls, and managing
garbage-collector reference counters on objects retained by the C layer
indefinitely—Python objects held by your C code won’t be garbage-collected
as long as you make sure their reference counts don’t fall to zero. Once
coded, C modules and types may be linked to Python either statically (by
rebuilding Python) or dynamically (when first imported). Thereafter, the C
extension becomes another toolkit available for use in Python
scripts.
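Reference counts can be observed from Python itself, which may help clarify what the C layer must manage. The sketch below assumes CPython, where sys.getrefcount is available; it compares counts rather than printing absolute values, because the absolute numbers vary by version and include the call's own temporary reference.

```python
import sys

obj = object()                      # a fresh object with few references
before = sys.getrefcount(obj)       # count includes the call's own temporary

held = [obj]                        # a container now retains a reference
during = sys.getrefcount(obj)

held.clear()                        # release the retained reference
after = sys.getrefcount(obj)

print(during - before, after - before)
```

A C extension that stores a Python object for later use plays the role of the list here: it must add a reference while it holds the object and give it back when done.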
At least that’s the short story; C modules require C code, and C types
require more of it than we can reasonably present in this chapter.
Although this book can’t teach you C development skills if you don’t
already have them, we need to turn to some code to make this domain more
concrete. Because C modules are simpler, and because C types generally
export a C module with an instance constructor function, let’s start off
by exploring the basics of C module coding with a quick example.
As mentioned, when you add new or existing C components to Python in
the traditional integration model, you need to code an interface (“glue”)
logic layer in C that handles cross-language dispatching and data
translation. The C source file in Example 20-1 shows how to code one by hand.
It implements a simple C extension module named hello
for use in Python scripts, with a function named message
that simply returns its
input string argument with extra text prepended. Python scripts will call
this function as usual, but this one is coded in C, not in Python.
Example 20-1. PP4E\Integrate\Extend\Hello\hello.c
/********************************************************************
 * A simple C extension module for Python, called "hello"; compile
 * this into a ".so" on python path, import and call hello.message;
 ********************************************************************/

#include <Python.h>
#include <string.h>

/* module functions */
static PyObject *                                 /* returns object */
message(PyObject *self, PyObject *args)           /* self unused in modules */
{                                                 /* args from Python call */
    char *fromPython, result[1024];
    if (! PyArg_Parse(args, "(s)", &fromPython))  /* convert Python -> C */
        return NULL;                              /* null=raise exception */
    else {
        strcpy(result, "Hello, ");                /* build up C string */
        strncat(result, fromPython,               /* add passed Python string, */
                sizeof(result) - strlen(result) - 1);  /* truncated to fit */
        return Py_BuildValue("s", result);        /* convert C -> Python */
    }
}

/* registration table */
static PyMethodDef hello_methods[] = {
    {"message",  message, METH_VARARGS, "func doc"},    /* name, &func, fmt, doc */
    {NULL, NULL, 0, NULL}                               /* end of table marker */
};

/* module definition structure */
static struct PyModuleDef hellomodule = {
    PyModuleDef_HEAD_INIT,
    "hello",         /* name of module */
    "mod doc",       /* module documentation, may be NULL */
    -1,              /* size of per-interpreter module state, -1=in global vars */
    hello_methods    /* link to methods table */
};

/* module initializer */
PyMODINIT_FUNC
PyInit_hello()                         /* called on first import */
{                                      /* name matters if loaded dynamically */
    return PyModule_Create(&hellomodule);
}
This C module has a 4-part standard structure described by its
comments, which all C modules follow, and which has changed noticeably in
Python 3.X. Ultimately, Python code will call this C file’s message
function, passing in a string object and
getting back a new string object. First, though, it has to be somehow
linked into the Python interpreter. To use this C file in a Python script,
compile it into a dynamically loadable object file (e.g.,
hello.so on Linux, hello.dll under
Cygwin on Windows) with a makefile like the one listed in
Example 20-2, and drop the
resulting object file into a directory listed on your module import search
path exactly as though it were a .py or .pyc file.
Example 20-2. PP4E\Integrate\Extend\Hello\makefile.hello
#############################################################
# Compile hello.c into a shareable object file on Cygwin,
# to be loaded dynamically when first imported by Python.
#############################################################
PYLIB = /usr/local/bin
PYINC = /usr/local/include/python3.1

hello.dll: hello.c
	gcc hello.c -g -I$(PYINC) -shared -L$(PYLIB) -lpython3.1 -o hello.dll

clean:
	rm -f hello.dll core
This is a Cygwin makefile that uses gcc to
compile our C code on Windows; other platforms are analogous
but will vary. As we learned in Chapter 5,
Cygwin provides a Unix-like environment and libraries on Windows. To work
along with the examples here, either install Cygwin on your Windows
platform, or change the makefiles listed per your compiler and platform
requirements. Be sure to include the path to Python’s install directory
with -I flags to access Python include
(a.k.a. header) files, as well as the path to the Python binary library
file with -L flags, if needed; mine
point to Python 3.1’s location on my
laptop after building it from its source. Also note that you’ll need tabs
for the indentation in makefile rules if a cut-and-paste from an ebook
substituted spaces for them or dropped them altogether.
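If you are unsure which directories to name in your own makefile, Python can report candidates itself. The following sketch assumes Python 3.2 or later, where the standard sysconfig module is available; the values printed depend entirely on your installation, and the library directory may be None on some platforms.

```python
import sysconfig

paths = sysconfig.get_paths()
include_dir = paths["include"]                       # candidate for -I
lib_dir = sysconfig.get_config_var("LIBDIR")         # candidate for -L; may be None
ext_suffix = sysconfig.get_config_var("EXT_SUFFIX")  # suffix used for built modules

print("include:", include_dir)
print("libdir: ", lib_dir)
print("suffix: ", ext_suffix)
```

Treat the output as a starting point only; as with the makefile itself, you may need to adjust the settings for your compiler and platform.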
Now, to use the makefile in Example 20-2 to build the extension
module in Example 20-1, simply type a
standard make command at your shell
(the Cygwin shell is used here, and I add a line break for
clarity):
.../PP4E/Integrate/Extend/Hello$ make -f makefile.hello
gcc hello.c -g -I/usr/local/include/python3.1 -shared
    -L/usr/local/bin -lpython3.1 -o hello.dll
This generates a shareable object file: a .dll
under Cygwin on Windows. When the module is compiled this way, Python
automatically loads and links it when it is first imported by a Python script.
At import time, Python locates the .dll binary library file
in a directory on its import search path, just like a
.py file. Because Python always searches the current
working directory on imports, this chapter’s examples will run from the
directory you compile them in (.) without any file copies or moves. In
larger systems, you will generally place compiled extensions in a
directory listed in PYTHONPATH or .pth
files instead, or use Python’s distutils
to install them in the site-packages
subdirectory of the standard library.
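The search just described can be approximated in a few lines of Python. This is a simplified sketch only, assuming Python 3.3 or later for importlib.machinery.EXTENSION_SUFFIXES; the function name find_extension is hypothetical, and the real import system also consults finders, caches, and packages.

```python
import os
import sys
from importlib.machinery import EXTENSION_SUFFIXES


def find_extension(name, search_path=None):
    """Return the first compiled-extension file for name on the path, or None."""
    for directory in (sys.path if search_path is None else search_path):
        for suffix in EXTENSION_SUFFIXES:
            # an empty path entry means the current working directory
            candidate = os.path.join(directory or os.curdir, name + suffix)
            if os.path.isfile(candidate):
                return candidate
    return None


print(EXTENSION_SUFFIXES)          # filename endings this Python accepts
print(find_extension("hello"))     # None unless a built hello module is on the path
```

The list of accepted suffixes varies per platform and Python version, which is why the same C source yields differently named binary files on different systems.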
Finally, to call the C function from a Python program, simply import
the module hello and call its hello.message
function with a string; you’ll get
back a normal Python string:
.../PP4E/Integrate/Extend/Hello$ python
>>> import hello                                   # import a C module
>>> hello.message('world')                         # call a C function
'Hello, world'
>>> hello.message('extending')
'Hello, extending'
And that’s it—you’ve just called an integrated C module’s function
from Python. The most important thing to notice here is that the C
function looks exactly as if it were coded in Python. Python callers send
and receive normal string objects from the call; the Python interpreter
handles routing calls to the C function, and the C function itself handles
Python/C data conversion chores.
In fact, there is little to distinguish hello
as a C extension module at all, apart from
its filename. Python code imports the module and fetches its attributes as
if it had been written in Python. C extension modules even respond to dir
calls as usual and have the
standard module and filename attributes, though the filename doesn’t end
in a .py or .pyc this time
around, which is the only obvious way you can tell it’s a C library:
>>> dir(hello)                                     # C module attributes
['__doc__', '__file__', '__name__', '__package__', 'message']

>>> hello.__name__, hello.__file__
('hello', 'hello.dll')

>>> hello.message                                  # a C function object
<built-in function message>
>>> hello                                          # a C module object
<module 'hello' from 'hello.dll'>

>>> hello.__doc__                                  # docstrings in C code
'mod doc'
>>> hello.message.__doc__
'func doc'

>>> hello.message()                                # errors work too
TypeError: argument must be sequence of length 1, not 0
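The same distinctions can be checked in code without building anything, by using a function that ships with Python. The sketch below assumes CPython, where the built-in len is coded in C; the helper names describe and py_message are hypothetical.

```python
import types


def describe(func):
    """Classify a callable as C-coded (built-in) or Python-coded."""
    if isinstance(func, types.BuiltinFunctionType):
        return "C-coded"
    if isinstance(func, types.FunctionType):
        return "Python-coded"
    return "other"


def py_message(text):                 # Python-coded stand-in for comparison
    return "Hello, " + text


print(describe(len))                  # len is implemented in C in CPython
print(describe(py_message))

try:
    len()                             # wrong argument count to a C function
except TypeError as exc:
    print("TypeError:", exc)          # raised like any other Python exception
```

A function exported by a compiled extension module such as hello falls in the same category as len here, and its argument errors surface as ordinary Python exceptions.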
Like any module in Python, you can also access the C extension from
a script file. The Python file in Example 20-3, for instance, imports and
uses the C extension module in Example 20-1.
Example 20-3. PP4E\Integrate\Extend\Hello\hellouse.py
"import and use a C extension library module"
import hello
print(hello.message('C'))
print(hello.message('module ' + hello.__file__))
for i in range(3):
    reply = hello.message(str(i))
    print(reply)
Run this script like any other: when the script first imports the
module hello, Python automatically
finds the C module’s .dll object file in a directory
on the module search path and links it into the process dynamically. All
of this script’s output represents strings returned from the C function in
the file hello.c:
.../PP4E/Integrate/Extend/Hello$ python hellouse.py
Hello, C
Hello, module /cygdrive/c/.../PP4E/Integrate/Extend/Hello/hello.dll
Hello, 0
Hello, 1
Hello, 2
See Python’s manuals for more details on the code in our C module,
as well as tips for compilation and linkage. Of note, as an alternative to
makefiles, also see the disthello.py and disthello-alt.py
files in the examples package.
Here’s a quick peek at the source code of the first of these:
# to build: python disthello.py build
# resulting dll shows up in build subdir
from distutils.core import setup, Extension
setup(ext_modules=[Extension('hello', ['hello.c'])])
This is a Python script that specifies compilation of the C
extension using tools in the distutils package, a
standard part of Python that is used to build, install, and distribute
Python extensions coded in Python or C. The larger goal of distutils is
automated and portable
builds and installs for distributed packages, but it also knows how to
build C extensions portably. Systems generally include a
setup.py script, which installs them in the site-packages
directory of the standard library.
Regrettably, distutils is also too
large to have survived the cleaver applied to this chapter’s material; see
its two manuals in Python’s manuals set for more
details.
As you can probably tell, manual coding of C extensions can become fairly
involved (this is almost inevitable in C language work). I’ve introduced
the basics in this chapter thus far so that you understand the underlying
structure. But today, C extensions are usually better and more easily
implemented with a tool that generates all the required integration glue
code automatically. There are a variety of such tools for use in the
Python world, including SIP, SWIG, and Boost.Python; we’ll explore
alternatives at the end of this chapter. Among these, the SWIG system is
widely used by Python developers.
SWIG, the Simplified Wrapper and Interface Generator, is an open
source system created by Dave
Beazley and now developed by its community, much like
Python. It uses C and C++ type declarations to generate complete C
extension modules that integrate existing libraries for use in Python
scripts. The generated C (and C++) extension modules are complete: they
automatically handle data conversion, error protocols,
reference-count management, and more.
That is, SWIG is a program that automatically generates all the glue
code needed to plug C and C++ components into Python programs; simply run
SWIG, compile its output, and your extension work is done. You still have
to manage compilation and linking details, but the rest of the C extension
task is largely performed by SWIG.
To use SWIG, instead of writing the C code in the prior section,
write the C function you want to use from Python without any Python
integration logic at all, as though it is to be used from C alone. For
instance, Example 20-4 is a recoding of Example 20-1 as a
straight C function.
Example 20-4. PP4E\Integrate\Extend\HelloLib\hellolib.c
/*********************************************************************
* A simple C library file, with a single function, "message",
* which is to be made available for use in Python programs.
* There is nothing about Python here--this C function can be
* called from a C program, as well as Python (with glue code).
*********************************************************************/
#include <string.h>
#include <hellolib.h>

static char result[1024];                  /* this isn't exported */

char *
message(char *label)                       /* this is exported */
{
    strcpy(result, "Hello, ");             /* build up C string */
    strncat(result, label,                 /* add passed-in label, */
            sizeof(result) - strlen(result) - 1);  /* truncated to fit */
    return result;                         /* return a temporary */
}
While you’re at it, define the usual C header file to declare the
function externally, as shown in Example 20-5. This is probably
overkill for such a small example, but it will prove a point.
Example 20-5. PP4E\Integrate\Extend\HelloLib\hellolib.h
/********************************************************************
* Define hellolib.c exports to the C namespace, not to Python
* programs--the latter is defined by a method registration
* table in a Python extension module's code, not by this .h;
********************************************************************/
extern char *message(char *label);
Now, instead of all the Python extension glue code shown in the
prior sections, simply write a SWIG type declarations input file, as in
Example 20-6.
Example 20-6. PP4E\Integrate\Extend\Swig\hellolib.i
/******************************************************
* Swig module description file, for a C lib file.
* Generate by saying "swig -python hellolib.i".
******************************************************/
%module hellowrap

%{
#include <hellolib.h>
%}

extern char *message(char*);    /* or: %include "../HelloLib/hellolib.h"   */
                                /* or: %include hellolib.h, and use -I arg */
This file spells out the C function’s type signature. In general,
SWIG scans files containing ANSI C and C++ declarations. Its input file
can take the form of an interface description file (usually with a
.i suffix) or a C/C++ header or source file.
Interface files like this one are the most common input form; they can
contain comments in C or C++
format, type declarations just like standard header files, and SWIG
directives that all start with %. For
example:
%module
Sets the module’s name as known to Python importers.
%{...%}
Encloses code added to generated wrapper file verbatim.
extern statements
Declare exports in normal ANSI C/C++ syntax.
%include
Makes SWIG scan another file (-I flags give search paths).
In this example, SWIG could also be made to read the
hellolib.h header file of Example 20-5 directly. But one of the
advantages of writing special SWIG input files like
hellolib.i is that you can pick and choose which
functions are wrapped and exported to Python, and you may use directives
to gain more control over the generation process.
SWIG is a utility program that you run from your build scripts; it
is not a programming language, so there is not much more to show here.
Simply add a step to your makefile that runs SWIG and compile its output
to be linked with Python. Example 20-7 shows one way to do it
on Cygwin.
Example 20-7. PP4E\Integrate\Extend\Swig\makefile.hellolib-swig
##################################################################
# Use SWIG to integrate hellolib.c for use in Python programs on
# Cygwin. The DLL must have a leading "_" in its name in current
# SWIG (>1.3.13) because SWIG also makes a .py without "_" in its name.
##################################################################
PYLIB = /usr/local/bin
PYINC = /usr/local/include/python3.1
CLIB  = ../HelloLib
SWIG  = /cygdrive/c/temp/swigwin-2.0.0/swig

# the library plus its wrapper
_hellowrap.dll: hellolib_wrap.o $(CLIB)/hellolib.o
	gcc -shared hellolib_wrap.o $(CLIB)/hellolib.o \
		-L$(PYLIB) -lpython3.1 -o $@

# generated wrapper module code
hellolib_wrap.o: hellolib_wrap.c $(CLIB)/hellolib.h
	gcc hellolib_wrap.c -g -I$(CLIB) -I$(PYINC) -c -o $@

hellolib_wrap.c: hellolib.i
	$(SWIG) -python -I$(CLIB) hellolib.i

# C library code (in another directory)
$(CLIB)/hellolib.o: $(CLIB)/hellolib.c $(CLIB)/hellolib.h
	gcc $(CLIB)/hellolib.c -g -I$(CLIB) -c -o $(CLIB)/hellolib.o

clean:
	rm -f *.dll *.o *.pyc core

force:
	rm -f *.dll *.o *.pyc core hellolib_wrap.c hellowrap.py
When run on the hellolib.i input file by this
makefile, SWIG generates two files:
hellolib_wrap.c
The generated C extension module glue code file.
hellowrap.py
A Python module that imports the generated C extension
module.
The former is named for the input file, and the latter per the %module
directive. Really, SWIG
generates two modules today: it uses a combination
of Python and C code to achieve the integration. Scripts ultimately
import the generated Python module file, which internally imports the
generated and compiled C module. You can wade through this generated
code in the book’s examples distribution if you are so inclined, but it
is prone to change over time and is too generalized to be simple.
To build the C module, the makefile runs SWIG to generate the glue
code; compiles its output; compiles the original C library code if
needed; and then combines the result with the compiled wrapper to
produce _hellowrap.dll, the DLL that hellowrap.py will expect to
find when imported by a Python script:
.../PP4E/Integrate/Extend/Swig$ dir
hellolib.i  makefile.hellolib-swig

.../PP4E/Integrate/Extend/Swig$ make -f makefile.hellolib-swig
/cygdrive/c/temp/swigwin-2.0.0/swig -python -I../HelloLib hellolib.i
gcc hellolib_wrap.c -g -I../HelloLib -I/usr/local/include/python3.1
    -c -o hellolib_wrap.o
gcc ../HelloLib/hellolib.c -g -I../HelloLib -c -o ../HelloLib/hellolib.o
gcc -shared hellolib_wrap.o ../HelloLib/hellolib.o \
    -L/usr/local/bin -lpython3.1 -o _hellowrap.dll

.../PP4E/Integrate/Extend/Swig$ dir
_hellowrap.dll  hellolib_wrap.c  hellowrap.py
hellolib.i      hellolib_wrap.o  makefile.hellolib-swig
The result is a dynamically loaded C extension module file ready
to be imported by Python code. Like all modules,
_hellowrap.dll must, along with hellowrap.py, be placed in a directory on
your Python module search path (the directory where you compile will
suffice if you run Python there too). Notice that the
.dll file must be built with a leading
underscore in its name; this is required because SWIG also created the
.py file of the same name without
the underscore. If both were named the same, only one could be
imported, and we need both (scripts import
the .py, which in turn imports the .dll internally).
As usual in C development, you may have to barter with the
makefile to get it to work on your system. Once you’ve run the makefile,
though, you are finished. The generated C module is used exactly like
the manually coded version shown before, except that SWIG has taken care
of the complicated parts automatically. Function calls in our Python
code are routed through the generated SWIG layer, to the C code in
Example 20-4, and back again; with
SWIG, this all “just works”:
.../PP4E/Integrate/Extend/Swig$ python
>>> import hellowrap                            # import glue + library file
>>> hellowrap.message('swig world')             # cwd always searched on imports
'Hello, swig world'

>>> hellowrap.__file__
'hellowrap.py'
>>> dir(hellowrap)
['__builtins__', '__doc__', '__file__', '__name__', '_hellowrap', ... 'message']
>>> hellowrap._hellowrap
<module '_hellowrap' from '_hellowrap.dll'>
In other words, once you learn how to use SWIG, you can often
largely forget the details behind integration coding. In fact, SWIG is
so adept at generating Python glue code that it’s usually easier and
less error prone to code C extensions for Python as purely C- or
C++-based libraries first, and later add them to Python by running their
header files through SWIG, as demonstrated here.
We’ve mostly just scratched the SWIG surface here, and there’s
more for you to learn about it from its Python-specific manual, available
with SWIG at http://www.swig.org. Although its
examples in this book are simple, SWIG is powerful enough to integrate
libraries as complex as Windows extensions and commonly used graphics
APIs such as OpenGL. We’ll apply it again later in this chapter, and
explore its “shadow class” model for wrapping C++ classes too. For now,
let’s move on to a more useful extension example.