Programming Python (106 page)

Authors: Mark Lutz

Preventing zombies with signal handlers on Linux

On some systems, it’s also possible to clean up zombie child processes
by resetting the signal handler for the SIGCHLD signal, which the
operating system delivers to a parent process when a child process
stops or exits. If a Python script assigns the SIG_IGN (ignore) action
as the SIGCHLD signal handler, zombies will be removed automatically
and immediately by the operating system as child processes exit; the
parent need not issue wait calls to clean up after them. Because of
that, this scheme is a simpler alternative to manually reaping zombies
on platforms where it is supported.
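
As a minimal sketch of that effect (not code from the book's examples
package), the following forks one child while SIGCHLD is set to
SIG_IGN and then asks the operating system about it. It assumes a
Unix-like platform where the assignment is supported, such as Linux;
the function name child_is_auto_reaped is made up for this
illustration.

```python
import os
import signal

def child_is_auto_reaped():
    """Return True if an exited child leaves nothing behind to wait for."""
    previous = signal.signal(signal.SIGCHLD, signal.SIG_IGN)
    try:
        pid = os.fork()
        if pid == 0:                       # in the child: exit at once
            os._exit(0)
        try:
            os.waitpid(pid, 0)             # in the parent: ask for exit status
        except ChildProcessError:          # the OS already discarded the child
            return True
        return False                       # status was kept: manual reaping applies
    finally:
        if previous is None:               # handler was not installed from Python
            previous = signal.SIG_DFL
        signal.signal(signal.SIGCHLD, previous)

if __name__ == '__main__':
    print('auto-reaped:', child_is_auto_reaped())
```

On a platform that keeps exit statuses despite SIG_IGN, the wait call
would return normally and the function would report False, which is
the case where explicit wait calls are still needed.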

If you’ve already read Chapter 5, you know that Python’s standard
signal module lets scripts install handlers for signals, which are
software-generated events. By way of review, here is a brief bit of
background to show how this pans out for zombies. The program in
Example 12-5 installs a Python-coded signal handler function to
respond to whatever signal number you type on the command line.

Example 12-5. PP4E\Internet\Sockets\signal-demo.py

"""
Demo Python's signal module; pass signal number as a command-line arg, and use
a "kill -N pid" shell command to send this process a signal; on my Linux machine,
SIGUSR1=10, SIGUSR2=12, SIGCHLD=17, and SIGCHLD handler stays in effect even if
not restored: all other handlers are restored by Python after caught, but SIGCHLD
behavior is left to the platform's implementation; signal works on Windows too,
but defines only a few signal types; signals are not very portable in general;
"""
import sys, signal, time

def now():
    return time.asctime()

def onSignal(signum, stackframe):                # Python signal handler
    print('Got signal', signum, 'at', now())     # most handlers stay in effect
    if signum == signal.SIGCHLD:                 # but sigchld handler is not
        print('sigchld caught')
        #signal.signal(signal.SIGCHLD, onSignal)

signum = int(sys.argv[1])
signal.signal(signum, onSignal)                  # install signal handler
while True: signal.pause()                       # sleep waiting for signals

To run this script, simply put it in the background and send it
signals by typing the kill -signal-number process-id shell command
line; this is the shell’s equivalent of Python’s os.kill function
available on Unix-like platforms only. Process IDs are listed in the
PID column of ps command results. Here is this script in action
catching signal number 10 (reserved for general use) and then being
terminated by signal number 9 (the unavoidable terminate signal):

[...]$ python signal-demo.py 10 &
[1] 10141
[...]$ ps -f
UID        PID  PPID  C STIME TTY          TIME CMD
5693094  10141 30778  0 05:00 pts/0    00:00:00 python signal-demo.py 10
5693094  10228 30778  0 05:00 pts/0    00:00:00 ps -f
5693094  30778 30772  0 04:23 pts/0    00:00:00 -bash
[...]$ kill -10 10141
Got signal 10 at Sun Apr 25 05:00:31 2010
[...]$ kill -10 10141
Got signal 10 at Sun Apr 25 05:00:34 2010
[...]$ kill -9 10141
[1]+ Killed python signal-demo.py 10

And in the following, the script catches signal 17, which happens to
be SIGCHLD on my Linux server. Signal numbers vary from machine to
machine, so you should normally use their names, not their numbers.
SIGCHLD behavior may vary per platform as well. On my Cygwin install,
for example, signal 10 can have a different meaning, and signal 20 is
SIGCHLD. On Cygwin, the script works as shown on Linux here for signal
10, but generates an exception if it tries to install a handler for
signal 17 (and Cygwin doesn’t require reaping in any event). See the
signal module’s library manual entry for more details:

[...]$ python signal-demo.py 17 &
[1] 11592
[...]$ ps -f
UID        PID  PPID  C STIME TTY          TIME CMD
5693094  11592 30778  0 05:00 pts/0    00:00:00 python signal-demo.py 17
5693094  11728 30778  0 05:01 pts/0    00:00:00 ps -f
5693094  30778 30772  0 04:23 pts/0    00:00:00 -bash
[...]$ kill -17 11592
Got signal 17 at Sun Apr 25 05:01:28 2010
sigchld caught
[...]$ kill -17 11592
Got signal 17 at Sun Apr 25 05:01:35 2010
sigchld caught
[...]$ kill -9 11592
[1]+ Killed python signal-demo.py 17
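
Since signal numbers differ across machines, a script can refer to
signals by the names the signal module defines and let Python supply
the local number. The sketch below is an illustration only, not part
of the book's examples; it assumes a Unix-like platform that defines
SIGUSR1, and the names on_signal and received are made up here. It
installs a handler by name and then uses os.kill to send the signal to
its own process, much as a shell kill command would.

```python
import os
import signal
import time

received = []                                  # names of signals caught so far

def on_signal(signum, stackframe):             # Python-coded signal handler
    received.append(signal.Signals(signum).name)

signal.signal(signal.SIGUSR1, on_signal)       # refer to the signal by name
os.kill(os.getpid(), signal.SIGUSR1)           # like "kill -USR1 pid" in a shell

deadline = time.time() + 2                     # allow the handler time to run
while not received and time.time() < deadline:
    time.sleep(0.01)

signal.signal(signal.SIGUSR1, signal.SIG_DFL)  # put back the default action
print(received)
```

The same code runs unchanged whether SIGUSR1 is 10, as on the Linux
machine shown here, or some other number elsewhere.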

Now, to apply all of this signal knowledge to killing zombies, simply
set the SIGCHLD signal handler to the SIG_IGN ignore handler action;
on systems where this assignment is supported, child processes will be
cleaned up when they exit. The forking server variant shown in
Example 12-6 uses this trick to manage its children.

Example 12-6. PP4E\Internet\Sockets\fork-server-signal.py

"""
Same as fork-server.py, but use the Python signal module to avoid keeping
child zombie processes after they terminate, instead of an explicit reaper
loop before each new connection; SIG_IGN means ignore, and may not work with
SIGCHLD child exit signal on all platforms; see Linux documentation for more
about the restartability of a socket.accept call interrupted with a signal;
"""
import os, time, sys, signal
from socket import *                             # get socket constructor and constants
myHost = ''                                      # server machine, '' means local host
myPort = 50007                                   # listen on a non-reserved port number

sockobj = socket(AF_INET, SOCK_STREAM)           # make a TCP socket object
sockobj.bind((myHost, myPort))                   # bind it to server port number
sockobj.listen(5)                                # up to 5 pending connects
signal.signal(signal.SIGCHLD, signal.SIG_IGN)    # avoid child zombie processes

def now():                                       # time on server machine
    return time.ctime(time.time())

def handleClient(connection):                    # child process replies, exits
    time.sleep(5)                                # simulate a blocking activity
    while True:                                  # read, write a client socket
        data = connection.recv(1024)
        if not data: break
        reply = 'Echo=>%s at %s' % (data, now())
        connection.send(reply.encode())
    connection.close()
    os._exit(0)

def dispatcher():                                # listen until process killed
    while True:                                  # wait for next connection,
        connection, address = sockobj.accept()   # pass to process for service
        print('Server connected by', address, end=' ')
        print('at', now())
        childPid = os.fork()                     # copy this process
        if childPid == 0:                        # if in child process: handle
            handleClient(connection)             # else: go accept next connect

dispatcher()

Where applicable, this technique is:

  • Much simpler; we don’t need to manually track or reap child
    processes.

  • More accurate; it leaves no zombies temporarily between
    client requests.

In fact, only one line is dedicated to handling zombies here: the
signal.signal call near the top, to set the handler. Unfortunately,
this version is also even less portable than using os.fork in the
first place, because signals may work slightly differently from
platform to platform, even among Unix variants. For instance, some
Unix platforms may not allow SIG_IGN to be used as the SIGCHLD action
at all. On Linux systems, though, this simpler forking server variant
works like a charm:

[...]$ python fork-server-signal.py &
[1] 3837
Server connected by ('72.236.109.185', 58817) at Sun Apr 25 08:11:12 2010
[...]$ ps -f
UID        PID  PPID  C STIME TTY          TIME CMD
5693094   3837 30778  0 08:10 pts/0    00:00:00 python fork-server-signal.py
5693094   4378  3837  0 08:11 pts/0    00:00:00 python fork-server-signal.py
5693094   4413 30778  0 08:11 pts/0    00:00:00 ps -f
5693094  30778 30772  0 04:23 pts/0    00:00:00 -bash
[...]$ ps -f
UID        PID  PPID  C STIME TTY          TIME CMD
5693094   3837 30778  0 08:10 pts/0    00:00:00 python fork-server-signal.py
5693094   4584 30778  0 08:11 pts/0    00:00:00 ps -f
5693094  30778 30772  0 04:23 pts/0    00:00:00 -bash

Notice how in this version the child process’s entry goes away as soon
as it exits, even before a new client request is received; no
“defunct” zombie ever appears. More dramatically, if we now start up
the script we wrote earlier that spawns eight clients in parallel
(testecho.py) to talk to this server remotely, all appear on the
server while running, but are removed immediately as they exit:

[client window]
C:\...\PP4E\Internet\Sockets> testecho.py learning-python.com

[server window]
[...]$
Server connected by ('72.236.109.185', 58829) at Sun Apr 25 08:16:34 2010
Server connected by ('72.236.109.185', 58830) at Sun Apr 25 08:16:34 2010
Server connected by ('72.236.109.185', 58831) at Sun Apr 25 08:16:34 2010
Server connected by ('72.236.109.185', 58832) at Sun Apr 25 08:16:34 2010
Server connected by ('72.236.109.185', 58833) at Sun Apr 25 08:16:34 2010
Server connected by ('72.236.109.185', 58834) at Sun Apr 25 08:16:34 2010
Server connected by ('72.236.109.185', 58835) at Sun Apr 25 08:16:34 2010
Server connected by ('72.236.109.185', 58836) at Sun Apr 25 08:16:34 2010
[...]$ ps -f
UID        PID  PPID  C STIME TTY          TIME CMD
5693094   3837 30778  0 08:10 pts/0    00:00:00 python fork-server-signal.py
5693094   9666  3837  0 08:16 pts/0    00:00:00 python fork-server-signal.py
5693094   9667  3837  0 08:16 pts/0    00:00:00 python fork-server-signal.py
5693094   9668  3837  0 08:16 pts/0    00:00:00 python fork-server-signal.py
5693094   9670  3837  0 08:16 pts/0    00:00:00 python fork-server-signal.py
5693094   9674  3837  0 08:16 pts/0    00:00:00 python fork-server-signal.py
5693094   9678  3837  0 08:16 pts/0    00:00:00 python fork-server-signal.py
5693094   9681  3837  0 08:16 pts/0    00:00:00 python fork-server-signal.py
5693094   9682  3837  0 08:16 pts/0    00:00:00 python fork-server-signal.py
5693094   9722 30778  0 08:16 pts/0    00:00:00 ps -f
5693094  30778 30772  0 04:23 pts/0    00:00:00 -bash
[...]$ ps -f
UID        PID  PPID  C STIME TTY          TIME CMD
5693094   3837 30778  0 08:10 pts/0    00:00:00 python fork-server-signal.py
5693094  10045 30778  0 08:16 pts/0    00:00:00 ps -f
5693094  30778 30772  0 04:23 pts/0    00:00:00 -bash

And now that I’ve shown you how to use signal handling to reap
children automatically on Linux, I should underscore that this
technique is not universally supported across all flavors of Unix. If
you care about portability, manually reaping children as we did in
Example 12-4 may still be desirable.
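
One way to hedge between the two approaches is to try the SIG_IGN
assignment and keep a non-blocking reaper on hand for platforms that
reject it. The following is a rough sketch under that assumption and
is not the text of Example 12-4; the function names try_auto_reaping
and reap_children, and the active list of child process IDs, are
hypothetical. It assumes a Unix-like platform with os.fork.

```python
import os
import signal
import time

def try_auto_reaping():
    """Ask the OS to discard exited children; report whether it accepted."""
    sigchld = getattr(signal, 'SIGCHLD', None)   # not defined on every platform
    if sigchld is None:
        return False
    try:
        signal.signal(sigchld, signal.SIG_IGN)
    except (OSError, ValueError):                # rejected, or not in main thread
        return False
    return True                                  # accepted; effect is still platform-defined

def reap_children(active):
    """Fallback: collect exited children listed in active without blocking."""
    while active:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:                # no children left to wait for
            active.clear()
            break
        if pid == 0:                             # children exist, none has exited yet
            break
        if pid in active:
            active.remove(pid)

if __name__ == '__main__':
    pid = os.fork()
    if pid == 0:
        os._exit(0)                              # child exits immediately
    active = [pid]
    while active:                                # a server would call this per connection
        reap_children(active)
        time.sleep(0.01)
    print('children still tracked:', active)
```

A server using this scheme would call reap_children before each
accept only when try_auto_reaping returns False.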

Why multiprocessing doesn’t help with socket server portability

In Chapter 5, we learned about Python’s new multiprocessing module. As
we saw, it provides a way to start function calls in new processes
that is more portable than the os.fork call used in this section’s
server code, and it runs processes instead of threads to work around
the thread GIL in some scenarios. In particular, multiprocessing works
on standard Windows Python too, unlike direct os.fork calls.

I experimented with a server variant based upon this module to see if
its portability might help for socket servers. Its full source code is
in the examples package in file multi-server.py, but here are its
important bits that differ:

...rest unchanged from fork-server.py...
from multiprocessing import Process

def handleClient(connection):
    print('Child:', os.getpid())                 # child process: reply, exit
    time.sleep(5)                                # simulate a blocking activity
    while True:                                  # read, write a client socket
        data = connection.recv(1024)             # till eof when socket closed
        ...rest unchanged...

def dispatcher():                                # listen until process killed
    while True:                                  # wait for next connection,
        connection, address = sockobj.accept()   # pass to process for service
        print('Server connected by', address, end=' ')
        print('at', now())
        Process(target=handleClient, args=(connection,)).start()

if __name__ == '__main__':
    print('Parent:', os.getpid())
    sockobj = socket(AF_INET, SOCK_STREAM)       # make a TCP socket object
    sockobj.bind((myHost, myPort))               # bind it to server port number
    sockobj.listen(5)                            # allow 5 pending connects
    dispatcher()

This server variant is noticeably simpler too. Like the forking server
it’s derived from, this server works fine under Cygwin Python on
Windows running as localhost, and would probably work on other
Unix-like platforms as well, because multiprocessing forks a process
on such systems, and file and socket descriptors are inherited by
child processes as usual. Hence, the child process uses the same
connected socket as the parent. Here’s the scene in a Cygwin server
window and two Windows client windows:

[server window]
[C:\...\PP4E\Internet\Sockets]$ python multi-server.py
Parent: 8388
Server connected by ('127.0.0.1', 58271) at Sat Apr 24 08:13:27 2010
Child: 8144
Server connected by ('127.0.0.1', 58272) at Sat Apr 24 08:13:29 2010
Child: 8036

[two client windows]
C:\...\PP4E\Internet\Sockets> python echo-client.py
Client received: b"Echo=>b'Hello network world' at Sat Apr 24 08:13:33 2010"

C:\...\PP4E\Internet\Sockets> python echo-client.py localhost Brave Sir Robin
Client received: b"Echo=>b'Brave' at Sat Apr 24 08:13:35 2010"
Client received: b"Echo=>b'Sir' at Sat Apr 24 08:13:35 2010"
Client received: b"Echo=>b'Robin' at Sat Apr 24 08:13:35 2010"
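
Descriptor inheritance across a fork, which is what lets a forked
child talk on a socket its parent opened, can be seen in isolation
with a short sketch. This is an illustration rather than the book's
server code: it assumes a Unix-like platform with os.fork, and it uses
socket.socketpair as a stand-in for an accepted client connection.

```python
import os
import socket

parent_end, child_end = socket.socketpair()    # two connected sockets

pid = os.fork()                                # child gets copies of both descriptors
if pid == 0:
    parent_end.close()                         # child keeps only its own end
    child_end.sendall(b'hello from child %d' % os.getpid())
    child_end.close()
    os._exit(0)

child_end.close()                              # parent keeps only its own end
message = parent_end.recv(1024)                # read what the child wrote
os.waitpid(pid, 0)                             # reap the child explicitly
parent_end.close()
print(message)
```

No pickling is involved here; the child simply uses the socket object
it inherited, which is the property a process started without a fork
cannot rely on.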

However, this server does not work on standard Windows Python (the
whole point of trying to use multiprocessing in this context) because
open sockets are not correctly pickled when passed as arguments into
the new process. Here’s what occurs in the server window on Windows 7
with Python 3.1:

C:\...\PP4E\Internet\Sockets> python multi-server.py
Parent: 9140
Server connected by ('127.0.0.1', 58276) at Sat Apr 24 08:17:41 2010
Child: 9628
Process Process-1:
Traceback (most recent call last):
  File "C:\Python31\lib\multiprocessing\process.py", line 233, in _bootstrap
    self.run()
  File "C:\Python31\lib\multiprocessing\process.py", line 88, in run
    self._target(*self._args, **self._kwargs)
  File "C:\...\PP4E\Internet\Sockets\multi-server.py", line 38, in handleClient
    data = connection.recv(1024) # till eof when socket closed
socket.error: [Errno 10038] An operation was attempted on something that is not a socket

Recall from Chapter 5 that on Windows multiprocessing passes context
to a new Python interpreter process by pickling it, and that Process
arguments must all be pickleable for Windows. Sockets in Python 3.1
don’t trigger errors when pickled thanks to the class they are an
instance of, but they are not really pickled correctly:

>>> from pickle import *
>>> from socket import *
>>> s = socket()
>>> x = dumps(s)
>>> s

>>> loads(x)

>>> x
b'\x80\x03csocket\nsocket\nq\x00)\x81q\x01N}q\x02(X\x08\x00\x00\x00_io_refsq\x03K\x00X\x07\x00\x00\x00_closedq\x04\x89u\x86q\x05b.'
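
What pickle does with a socket depends on the Python release, so it is
worth checking on the interpreter at hand rather than assuming the 3.1
behavior shown above. The sketch below is only a probe, with a made-up
function name (try_pickle_socket); it reports whether dumps succeeded
and, if not, which exception it raised. On releases newer than 3.1 the
expectation is a refusal with TypeError instead of a hollow pickle,
but treat that as an assumption to confirm locally.

```python
import pickle
import socket

def try_pickle_socket():
    """Return (pickled, detail) describing what pickle does with a socket."""
    sock = socket.socket()
    try:
        data = pickle.dumps(sock)
    except Exception as exc:                   # report the refusal, whatever it is
        return False, '%s: %s' % (type(exc).__name__, exc)
    else:
        return True, '%d bytes' % len(data)    # pickled, but possibly not usable
    finally:
        sock.close()

pickled, detail = try_pickle_socket()
print(pickled, detail)
```

Either outcome leads to the same conclusion for this server: a plain
pickle of a connected socket does not hand a working connection to
another process.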

As we saw in Chapter 5, multiprocessing has other IPC tools such as
its own pipes and queues that might be used instead of sockets to work
around this issue, but clients would then have to use them, too; the
resulting server would not be as broadly accessible as one based upon
general Internet sockets.

Even if multiprocessing did work on Windows, though, its need to start
a new Python interpreter would likely make it much slower than the
more traditional technique of spawning threads to talk to clients.
Coincidentally, that brings us to our next topic.
