Programming Python (198 page)

Authors: Mark Lutz

Other Extending Tools

In closing the extending topic, I should mention that there are
alternatives to SWIG, many of which have a loyal user base of their own.
This section briefly introduces some of the more popular tools in this
domain today; as usual, search the Web for more details on these and
others. Like SWIG, all of the following began life as third-party tools
installed separately, though Python 2.5 and later incorporate the
ctypes extension as a standard library module.

SIP

Just as a sip is a smaller swig in the drinking world, so too is the
SIP system a lighter alternative to SWIG in the Python world (in
fact, it was named on purpose for the joke). According to its web
page, SIP makes it easy to create Python bindings for C and C++
libraries. Originally developed to create the PyQt Python bindings
for the Qt toolkit, it can be used to create bindings for any C or
C++ library. SIP includes a code generator and a Python support
module.

Much like SWIG, the code generator processes a set of
specification files and generates C or C++ code, which is compiled
to create the bindings extension module. The SIP Python module
provides support functions to the automatically generated code.
Unlike SWIG, SIP is specifically designed just for bringing together
Python and C/C++. SWIG also generates wrappers for many other
scripting languages, and so is viewed by some as a more complex
project.

ctypes

The ctypes system is a foreign function interface (FFI) module for
Python. It allows Python scripts to access
and call compiled functions in a binary library file directly and
dynamically, by writing dispatch code in Python itself, instead of
generating or writing the integration C wrapper code we’ve studied
in this chapter. That is, library glue code is written in pure
Python instead of C. The main advantage is that you don’t need C
code or a C build system to access C functions from a Python script.
The disadvantage is potential speed loss on dispatch, though this
depends upon the alternative measured.
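
As a minimal sketch of this idea, the following script calls the C
library’s strlen function from pure Python, with no C wrapper code or
build step. The load_c_library helper and its lookup strategy are
assumptions made for illustration; library names and locations vary
by platform.

```python
import ctypes
import ctypes.util
import sys


def load_c_library():
    """Load the platform C runtime (hypothetical helper; lookup strategy is an assumption)."""
    if sys.platform.startswith("win"):
        return ctypes.CDLL("msvcrt")
    # find_library may return None; CDLL(None) then exposes the symbols
    # already linked into the running process, which include libc on POSIX.
    return ctypes.CDLL(ctypes.util.find_library("c"))


libc = load_c_library()

# Describe the C signature in Python: size_t strlen(const char *s)
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

# The "glue" is just this call; ctypes converts the bytes object to char*
length = libc.strlen(b"spam")
print(length)
```

Declaring argtypes and restype is optional for simple cases, but it
lets ctypes check and convert arguments instead of guessing.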

According to its documentation, ctypes allows Python to call
functions exposed from DLLs and shared libraries and has facilities
to create, access, and manipulate complex C datatypes in Python. It
is also possible to implement C callback functions in pure Python,
and an experimental ctypes code generator feature allows automatic
creation of library wrappers from C header files. ctypes works on
Windows, Mac OS X, Linux, Solaris, FreeBSD, and OpenBSD. It may run
on additional systems, provided that the libffi package it employs
is supported. For Windows, early standalone ctypes releases contained
a ctypes.com package, which allowed Python code to call and implement
custom COM interfaces; that support is not part of the standard
library module and is today provided by the separate comtypes
package. See Python’s library manuals for more on the ctypes
functionality included in the standard library.
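
The two facilities mentioned above, C datatypes built in Python and C
callbacks written in Python, can be sketched roughly as follows. The
Point structure and compare_ints function are hypothetical names
invented for this illustration, and the way the C library is located
is an assumption that may need adjusting on your platform.

```python
import ctypes
import ctypes.util
import sys

# Locate the C runtime (assumed lookup; see the platform notes above)
if sys.platform.startswith("win"):
    libc = ctypes.CDLL("msvcrt")
else:
    libc = ctypes.CDLL(ctypes.util.find_library("c"))


# A C struct declared and manipulated from Python:
#   struct Point { int x; int y; };
class Point(ctypes.Structure):
    _fields_ = [("x", ctypes.c_int), ("y", ctypes.c_int)]


pt = Point(3, 4)
pt.x += 10                      # fields behave like normal attributes

# A C callback implemented in Python, for use by the C qsort function:
#   int (*compar)(const int *, const int *)
IntPtr = ctypes.POINTER(ctypes.c_int)
CompareFunc = ctypes.CFUNCTYPE(ctypes.c_int, IntPtr, IntPtr)


def compare_ints(left, right):
    a, b = left[0], right[0]    # dereference the C pointers
    return (a > b) - (a < b)    # negative, zero, or positive


libc.qsort.argtypes = [IntPtr, ctypes.c_size_t, ctypes.c_size_t, CompareFunc]
libc.qsort.restype = None

values = (ctypes.c_int * 5)(5, 1, 7, 33, 99)
callback = CompareFunc(compare_ints)   # keep a reference while C may call it
libc.qsort(values, len(values), ctypes.sizeof(ctypes.c_int), callback)

sorted_values = list(values)
print(sorted_values, (pt.x, pt.y))
```

Here the C library calls back into the Python function once per
comparison, which also illustrates the dispatch overhead noted earlier.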

Boost.Python

The Boost.Python system is a C++ library that enables seamless
interoperability between C++ and the Python programming language
through an IDL-like model. Using it, developers generally write a
small amount of C++ wrapper code to create a shared library for use
in Python scripts. Boost.Python handles references, callbacks, type
mappings, and cleanup tasks. Because it is designed to wrap C++
interfaces nonintrusively, C++ code need not be changed to be
wrapped. Like other tools, this makes the system useful for wrapping
existing libraries, as well as developing new extensions from
scratch.

Writing interface code for large libraries can be more
involved than the code generation approaches of SWIG and SIP, but
it’s easier than manually wrapping libraries and may afford greater
control than a fully automated wrapping tool. In addition, the Py++
and older Pyste systems provide Boost.Python code generators, in
which users specify classes and functions to be exported using a
simple interface file. Both use GCC-XML to parse all the headers and
extract the necessary information to generate C++ code.

Cython (and Pyrex)

Cython, a successor to the Pyrex system, is a language specifically
for writing Python extension modules. It lets you write files that
mix Python code and C datatypes as you wish, and compiles the
combination into a C extension for Python. In
principle, developers need not deal with the Python/C API at all,
because Cython takes care of things such as error-checking and
reference counts automatically.

Technically, Cython is a distinct language that is Python-like, with
extensions for mixing in C datatype declarations and C function
calls. However, almost any
Python code is also valid Cython code. The Cython compiler converts
Python code into C code, which makes calls to the Python/C API. In
this aspect, Cython is similar to the now much older Python2C
conversion project. By combining Python and C code, Cython offers a
different approach than the generation or coding of integration code
in other systems.

CXX, weave, and more

The CXX system is roughly a C++ version of Python’s usual C API,
which handles reference counters, exception translation, and much of
the type checking and cleanup inherent in C++ extensions. As such,
CXX lets you focus on the application-specific parts of your
code. CXX also exposes parts of the C++ Standard Template Library
containers to be compatible with Python sequences.

The weave package allows the inclusion of C/C++ in Python code. It’s
part of the SciPy package (http://www.scipy.org) but is also
available as a standalone system. A page at http://www.python.org
chronicles additional projects in this domain, which we don’t have
space to mention here.

Other languages: Java, C#, FORTRAN, Objective-C, and others

Although we’re focused on C and C++ in this chapter, you’ll
also find direct support for mixing Python with other programming
languages in the open source world. This includes languages that are
compiled to binary form like C, as well as some that are not.

For example, by providing full byte code compilers,
the Jython and IronPython systems allow code written
in Python to interface with Java and C#/.NET components in a largely
seamless fashion. Alternatively, the JPype and Python for .NET
projects support Java and C#/.NET integration for normal CPython
(the standard C-based implementation of Python) code, without
requiring alternative byte code compilers.

Moreover, the f2py and PyFort systems provide integration with
FORTRAN code, and other tools provide access to languages such
as Delphi and Objective-C. Among these, the PyObjC project aims to
provide a bridge between Python and Objective-C; this supports
writing Cocoa GUI applications on Mac OS X in Python.

Search the Web for details on other language integration
tools. Also look for a wiki page currently at http://www.python.org
that lists a large number of other integratable languages, including
Prolog, Lisp, Tcl, and more.

Because many of these systems support bidirectional control flows
(both extending and embedding), we’ll return to this category at the
end of this chapter in the context of integration at large. First,
though, we need to shift our perspective 180 degrees to explore the
other mode of Python/C integration: embedding.

Mixing Python and C++

Python’s standard
implementation is currently coded in C, so all the normal
rules about mixing C programs with C++ programs apply to the Python
interpreter. In fact, there is nothing special about Python in this
context, but here are a few pointers.

When embedding Python in a C++ program, there are no special rules to
follow. Simply link in the Python library and call its functions from
C++. Python’s header files automatically wrap themselves in
extern "C" {...} declarations to suppress C++ name mangling. Hence,
the Python library looks like any other C component to C++; there is
no need to recompile Python itself with a C++ compiler.

When extending Python with C++ components, Python header files are
still C++ friendly, so Python API calls in C++ extensions work like
any other C++-to-C call. But be sure to wrap the parts of your
extension code made visible to Python with extern "C" declarations so
that they can be called by Python’s C code. For example, to wrap a
C++ class, SWIG generates a C++ extension module that declares its
initialization function this way.

Embedding Python in C: Overview

So far in this chapter, we’ve explored only half of the Python/C
integration picture: calling C services from Python. This mode is perhaps
the most commonly deployed; it allows programmers to speed up operations
by moving them to C and to utilize external libraries by wrapping them in
C extension modules and types. But the inverse can be just as useful:
calling Python from C. By delegating selected components of an application
to embedded Python code, we can open them up to onsite changes without
having to ship or rebuild a system’s full code base.

This section tells this other half of the Python/C integration tale.
It introduces the Python C interfaces that make it possible for programs
written in C-compatible languages to run Python program code. In this
mode, Python acts as an embedded control language (what some call a
“macro” language). Although embedding is mostly presented in isolation
here, keep in mind that Python’s integration support is best viewed as
a whole. A
system’s structure usually determines an appropriate integration approach:
C extensions, embedded code calls, or both. To wrap up, this chapter
concludes by discussing a handful of alternative integration platforms
such as Jython and IronPython, which offer broad integration
possibilities.

The C Embedding API

The first thing you should know about Python’s embedded-call API is
that it is less structured than the extension interfaces. Embedding Python in
C may require a bit more creativity on your part than extending: you
must pick tools from a general collection of calls to implement the
Python integration instead of coding to a boilerplate structure. The
upside of this loose structure is that programs can combine embedding
calls and strategies to build up arbitrary integration
architectures.

The lack of a more rigid model for embedding is largely the result
of a less clear-cut goal. When extending Python, there is a distinct
separation of Python and C responsibilities and a
clear structure for the integration. C modules and types are required to
fit the Python module/type model by conforming to standard extension
structures. This makes the integration seamless for Python clients: C
extensions look like Python objects and handle most of the work. It also
supports automation tools such as SWIG.

But when Python is embedded, the structure isn’t as obvious; because
C is the enclosing level, there is no clear
way to know what model the embedded Python code should fit. C may want
to run objects fetched from modules, strings fetched from files or
parsed out of documents, and so on. Instead of deciding what C can and
cannot do, Python provides a collection of general embedding interface
tools, which you use and structure according to your embedding
goals.

Most of these tools correspond to tools available to Python
programs. Table 20-1 lists some of the more common API calls used for
embedding, as well as their Python equivalents. In general, if you can
figure out how to accomplish your embedding goals in pure Python code,
you can probably find C API tools that achieve the same results.

Table 20-1. Common API functions

C API call                 Python equivalent
-------------------------  ------------------------------
PyImport_ImportModule      import module, __import__
PyImport_GetModuleDict     sys.modules
PyModule_GetDict           module.__dict__
PyDict_GetItemString       dict[key]
PyDict_SetItemString       dict[key]=val
PyDict_New                 dict = {}
PyObject_GetAttrString     getattr(obj, attr)
PyObject_SetAttrString     setattr(obj, attr, val)
PyObject_CallObject        funcobj(*argstuple)
PyEval_CallObject          funcobj(*argstuple)
PyRun_String               eval(exprstr), exec(stmtstr)
PyRun_File                 exec(open(filename).read())
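
To make the right-hand column of the table concrete, here is a rough
pure-Python walk-through of the same steps an embedding C program
would typically perform. The comments name the C API call that each
line loosely corresponds to; the choice of the standard string module
and its capwords function is only an assumption made for the example.

```python
import sys

# PyImport_ImportModule("string")
module = __import__("string")

# PyImport_GetModuleDict(): the table of loaded modules
assert sys.modules["string"] is module

# PyModule_GetDict(module), then PyDict_GetItemString(dict, "capwords")
module_namespace = module.__dict__
func_from_dict = module_namespace["capwords"]

# PyObject_GetAttrString(module, "capwords")
func = getattr(module, "capwords")

# PyObject_CallObject(func, argstuple)
argstuple = ("hello embedded world",)
result = func(*argstuple)

# PyDict_New() and PyDict_SetItemString(scope, "x", value)
scope = {}
scope["x"] = 6

# PyRun_String in expression mode: evaluate and return a result
value = eval("x * 7", scope)

# PyRun_String in statement mode: run for side effects on the namespace
exec("y = x + 1", scope)

print(result, value, scope["y"])
```

If a sequence like this does what you need in Python, the matching C
calls can usually be strung together in the same order.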

Because embedding relies on API call selection, becoming familiar
with the Python C API is fundamental to the embedding task. This chapter
presents a handful of representative embedding examples and discusses
common API calls, but it does not provide a comprehensive list of all
tools in the API. Once you’ve mastered the examples here, you’ll
probably need to consult Python’s integration manuals for more details
on available calls in this domain. As mentioned previously, Python
offers two standard manuals for C/C++ integration programmers:
Extending and Embedding, an integration tutorial; and Python/C API,
the Python runtime library reference.

You can find the most recent releases of these manuals at
http://www.python.org, and possibly installed on your computer
alongside Python itself. Beyond this chapter, these manuals are likely
to be your best resource for up-to-date and complete Python API tool
information.

What Is Embedded Code?

Before we jump into details, let’s get a handle on some of the
core ideas in the embedding domain. When this book speaks of “embedded”
Python code, it simply means any Python program structure that can be
executed from C with a direct in-process function call interface.
Generally speaking, embedded Python code can take a variety of
forms:

Code strings

C programs can represent Python programs as character strings and
run them as either expressions or statements (much like using the
eval and exec built-in functions in Python).

Callable objects

C programs can load or reference Python callable objects such as
functions, methods, and classes, and call them with argument list
objects (much like func(*pargs, **kargs) Python syntax).

Code files

C programs can execute entire Python program files by importing
modules and running script files through the API or general system
calls (e.g., popen).
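
The three forms can be previewed from the Python side before looking
at any C code. The sketch below is only an analogy: it uses Python
built-ins and the standard runpy module in place of the C API, and the
describe function and script.py file are hypothetical names made up
for the demonstration.

```python
import os
import runpy
import tempfile

# 1. Code strings: run text as an expression or as statements
expr_result = eval("2 ** 10")
namespace = {}
exec("greeting = 'hello'.upper()", namespace)

# 2. Callable objects: fetch an object and call it with argument lists
def describe(name, *pargs, **kargs):
    return (name, pargs, kargs)

pargs = ("widget", 1, 2)
kargs = {"color": "red"}
call_result = describe(*pargs, **kargs)

# 3. Code files: run an entire program file and collect its globals
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "script.py")
    with open(path, "w") as script:
        script.write("answer = 6 * 7\n")
    file_globals = runpy.run_path(path)

print(expr_result, namespace["greeting"], call_result, file_globals["answer"])
```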

The Python binary library is usually what is physically embedded
and linked in the C program. The actual Python code run from C can come
from a wide variety of sources:

  • Code strings might be loaded from files, obtained from an
    interactive user at a console or GUI, fetched from persistent
    databases and shelves, parsed out of HTML or XML files, read over
    sockets, built or hardcoded in a C program, passed to C extension
    functions from Python registration code, and so on.

  • Callable objects might be fetched from Python modules,
    returned from other Python API calls, passed to C extension
    functions from Python registration code, and so on.

  • Code files simply exist as files, modules, and executable
    scripts in the filesystem.

Registration is a technique commonly used in callback scenarios
that we will explore in more detail later in this chapter. But
especially for strings of code, there are as many possible sources as
there are for C character strings in general. For example, C programs
can construct arbitrary Python code dynamically by building and running
strings.

Finally, once you have some Python code to run, you need a way to
communicate with it: the Python code may need to use inputs passed in
from the C layer and may want to generate outputs to communicate results
back to C. In fact, embedding generally becomes interesting only when
the embedded code has access to the enclosing C layer. Usually, the form
of the embedded code suggests its communication media:

  • Code strings that are Python expressions return an expression
    result as their output. In addition, both inputs and outputs can
    take the form of global variables in the namespace in which a code
    string is run; C may set variables to serve as input, run Python
    code, and fetch variables as the code’s result. Inputs and outputs
    can also be passed with exported C extension function calls:
    Python code may use C module or type interfaces that we met
    earlier in this chapter to get or set variables in the enclosing C
    layer. Communication schemes are often combined; for instance, C
    may preassign global names to objects that export both state and
    interface functions for use in the embedded Python code. [72]

  • Callable objects may accept inputs as function arguments and
    produce results as function return values. Passed-in mutable
    arguments (e.g., lists, dictionaries, class instances) can be used
    as both input and output for the embedded code—changes made in
    Python are retained in objects held by C. Objects can also make use
    of the global variable and C extension functions interface
    techniques described for strings to communicate with C.

  • Code files can communicate with most of the same techniques as
    code strings; when run as separate programs, files can also employ
    Inter-Process Communication (IPC) techniques.
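
The communication modes in this list can be sketched in Python alone,
with the host script standing in for the enclosing C layer. This is an
illustration under that assumption, not C API code; HostInterface and
embedded_handler are hypothetical names invented for the example.

```python
# Code strings: inputs and outputs travel as global variables
namespace = {"price": 20.0, "quantity": 3}       # host sets inputs
exec("total = price * quantity", namespace)      # embedded code runs
total = namespace["total"]                       # host fetches output

# Code strings that are expressions return their result directly
discounted = eval("total * 0.5", namespace)


# Exported interfaces: the host preassigns an object that offers both
# state and functions to the embedded code
class HostInterface:
    def __init__(self):
        self.log = []

    def report(self, message):
        self.log.append(message)


host = HostInterface()
exec("host.report('total is %s' % total)", {"host": host, "total": total})


# Callable objects: arguments in, return value out, and mutable
# arguments that carry changes back to the caller
def embedded_handler(record, results):
    results.append(record["name"].title())
    return len(results)


results = []
count = embedded_handler({"name": "bob smith"}, results)

print(total, discounted, host.log, count, results)
```

In a real embedding, each of these steps is performed by the C program
through API calls, but the data flow is the same.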

Naturally, all embedded code forms can also communicate with C
using general system-level tools: files, sockets, pipes, and so on.
These techniques are generally less direct and slower, though. Here, we
are still interested in in-process function
call integration.

[72] For a concrete example, consider the discussion of
server-side templating languages in the Internet part of this
book. Such systems usually fetch Python code embedded in an HTML
web page file, assign global variables in a namespace to objects
that give access to the web browser’s environment, and run the
Python code in the namespace where the objects were assigned. I
worked on a project where we did something similar, but Python
code was embedded in XML documents, and objects that were
preassigned to globals in the code’s namespace represented
widgets in a GUI. At the bottom, it was simply Python code
embedded in and run by C code.
