Programming Python (14 page)

Authors: Mark Lutz

Introducing the os Module

As mentioned, os is the larger of the two core system modules. It contains
all of the usual operating-system calls you use in C programs and shell
scripts. Its calls deal with directories, processes, shell variables, and
the like. Technically, this module provides POSIX tools (a portable
standard for operating-system calls) along with platform-independent
directory processing tools as the nested module os.path. Operationally, os
serves as a largely portable interface to your computer’s system calls:
scripts written with os and os.path can usually be run unchanged on any
platform. On some platforms, os includes extra tools available just for
that platform (e.g., low-level process calls on Unix); by and large,
though, it is as cross-platform as is technically feasible.

Tools in the os Module

Let’s take a quick look at the basic interfaces in os. As a preview,
Table 2-1 summarizes some of the most commonly used tools in the os
module, organized by functional area.

Table 2-1. Commonly used os module tools

Tasks                      Tools

Shell variables            os.environ
Running programs           os.system, os.popen, os.execv, os.spawnv
Spawning processes         os.fork, os.pipe, os.waitpid, os.kill
Descriptor files, locks    os.open, os.read, os.write
File processing            os.remove, os.rename, os.mkfifo, os.mkdir,
                           os.rmdir
Administrative tools       os.getcwd, os.chdir, os.chmod, os.getpid,
                           os.listdir, os.access
Portability tools          os.sep, os.pathsep, os.curdir, os.path.split,
                           os.path.join
Pathname tools             os.path.exists('path'), os.path.isdir('path'),
                           os.path.getsize('path')

If you inspect this module’s attributes interactively, you get a
huge list of names that will vary per Python release, will likely vary
per platform, and isn’t incredibly useful until you’ve learned what each
name means (I’ve let this line-wrap and removed most of this list to
save space—run the command on your own):

>>> import os
>>> dir(os)
['F_OK', 'MutableMapping', 'O_APPEND', 'O_BINARY', 'O_CREAT', 'O_EXCL', 'O_NOINH
ERIT', 'O_RANDOM', 'O_RDONLY', 'O_RDWR', 'O_SEQUENTIAL', 'O_SHORT_LIVED', 'O_TEM
PORARY', 'O_TEXT', 'O_TRUNC', 'O_WRONLY', 'P_DETACH', 'P_NOWAIT', 'P_NOWAITO', '
P_OVERLAY', 'P_WAIT', 'R_OK', 'SEEK_CUR', 'SEEK_END', 'SEEK_SET', 'TMP_MAX',
...9 lines removed here...
'pardir', 'path', 'pathsep', 'pipe', 'popen', 'putenv', 'read', 'remove', 'rem
ovedirs', 'rename', 'renames', 'rmdir', 'sep', 'spawnl', 'spawnle', 'spawnv', 's
pawnve', 'startfile', 'stat', 'stat_float_times', 'stat_result', 'statvfs_result
', 'strerror', 'sys', 'system', 'times', 'umask', 'unlink', 'urandom', 'utime',
'waitpid', 'walk', 'write']

Besides all of these, the nested os.path module exports even more tools,
most of which are related to processing file and directory names portably:

>>> dir(os.path)
['__all__', '__builtins__', '__doc__', '__file__', '__name__', '__package__',
'_get_altsep', '_get_bothseps', '_get_colon', '_get_dot', '_get_empty',
'_get_sep', '_getfullpathname', 'abspath', 'altsep', 'basename', 'commonprefix',
'curdir', 'defpath', 'devnull', 'dirname', 'exists', 'expanduser', 'expandvars',
'extsep', 'genericpath', 'getatime', 'getctime', 'getmtime', 'getsize', 'isabs',
'isdir', 'isfile', 'islink', 'ismount', 'join', 'lexists', 'normcase', 'normpath',
'os', 'pardir', 'pathsep', 'realpath', 'relpath', 'sep', 'split', 'splitdrive',
'splitext', 'splitunc', 'stat', 'supports_unicode_filenames', 'sys']
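
Because the raw listings are long and vary by release and platform, it can
help to filter them in code. The following is only a sketch, not one of the
book’s examples: the list named portable is a hand-picked sample of names
from Table 2-1, and the hasattr test is one common way to check for a
platform-specific tool before relying on it.

```python
import os

# Keep only public names (no leading underscore) to shrink the listing.
public_names = [name for name in dir(os) if not name.startswith('_')]

# A hand-picked sample of names from Table 2-1 (an assumption of this sketch).
portable = ['environ', 'system', 'popen', 'getcwd', 'chdir', 'listdir', 'sep']
missing = [name for name in portable if not hasattr(os, name)]

print(len(public_names), 'public names in os')
print('missing portable names:', missing)

# Platform-specific tools may be absent; test before relying on them.
print('os.fork available:', hasattr(os, 'fork'))
```

The count printed on the first line will differ from one Python release and
platform to the next, which is the point made above.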

Administrative Tools

Just in case those massive listings aren’t quite enough to go on, let’s
experiment interactively with some of the more commonly used os tools.
Like sys, the os module comes with a collection of informational and
administrative tools:

>>> os.getpid()
7980
>>> os.getcwd()
'C:\\PP4thEd\\Examples\\PP4E\\System'
>>> os.chdir(r'C:\Users')
>>> os.getcwd()
'C:\\Users'

As shown here, the os.getpid function gives the calling process’s process
ID (a unique system-defined identifier for a running program, useful for
process control and unique name creation), and os.getcwd returns the
current working directory. The current working directory is where files
opened by your script are assumed to live, unless their names include
explicit directory paths. That’s why earlier I told you to run the
following command in the directory where more.py lives:

C:\...\PP4E\System> python more.py more.py

The input filename argument here is given without an explicit directory
path (though you could add one to page files in another directory). If you
need to run in a different working directory, call the os.chdir function
to change to a new directory; your code will run relative to the new
directory for the rest of the program (or until the next os.chdir call).
The next chapter will have more to say about the notion of a current
working directory, and its relation to module imports when it explores
script execution context.
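
To see the effect of the current working directory without depending on
any particular folder, here is a minimal sketch that works in a throwaway
directory. The tempfile module and the note.txt filename are assumptions
of this sketch, not part of the session above.

```python
import os
import tempfile

original = os.getcwd()                      # remember where we started

with tempfile.TemporaryDirectory() as workdir:
    os.chdir(workdir)                       # relative names now resolve here
    with open('note.txt', 'w') as f:        # no directory path: created in the cwd
        f.write('hello')
    # Check through the full path that the file landed in the new cwd.
    created_in_cwd = os.path.exists(os.path.join(workdir, 'note.txt'))
    os.chdir(original)                      # go back before the directory is removed

print('file created in new cwd:', created_in_cwd)
print('back in original cwd:', os.getcwd() == original)
```

Changing back before the temporary directory is cleaned up matters on
platforms that refuse to delete a directory that is some process’s current
working directory.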

Portability Constants

The os module also exports a set of names designed to make cross-platform
programming simpler. The set includes platform-specific settings for path
and directory separator characters, parent and current directory
indicators, and the characters used to terminate lines on the underlying
computer.

>>> os.pathsep, os.sep, os.pardir, os.curdir, os.linesep
(';', '\\', '..', '.', '\r\n')

os.sep is whatever character is used to separate directory components on
the platform on which Python is running; it is automatically preset to a
backslash (\) on Windows, a forward slash (/) on POSIX machines, and a
colon (:) on some Macs. Similarly, os.pathsep provides the character that
separates directories in directory lists: a colon (:) for POSIX and a
semicolon (;) for DOS and Windows.

By using such attributes when composing and decomposing system-related
strings in our scripts, we make the scripts fully portable. For instance,
a call of the form dirpath.split(os.sep) will correctly split
platform-specific directory names into components, though dirpath may look
like dir\dir on Windows, dir/dir on Linux, and dir:dir on some Macs. As
mentioned, on Windows you can usually use forward slashes rather than
backward slashes when giving filenames to be opened, but these portability
constants allow scripts to be platform neutral in directory processing
code.
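
As a rough illustration of composing and decomposing strings with these
constants, the sketch below builds a relative path and a directory list
and takes them apart again. The component names and dir1, dir2, dir3 are
made up for the example.

```python
import os

# Build a relative path with the platform's own separator, then split it again.
parts = ['PP4E', 'System', 'more.py']
dirpath = os.sep.join(parts)
print(dirpath)                              # separator shown depends on the platform
print(dirpath.split(os.sep) == parts)       # the round trip holds everywhere

# os.pathsep separates the entries of directory lists such as PATH.
search_path = os.pathsep.join(['dir1', 'dir2', 'dir3'])   # made-up directory names
print(search_path)
print(search_path.split(os.pathsep))
```

Only the printed strings differ across platforms; the code itself never
mentions a slash, colon, or semicolon.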

Notice also how os.linesep comes back as \r\n here. This is the escape-code
form of the carriage-return + line-feed line terminator used on Windows,
which you don’t normally notice when processing text files in Python.
We’ll learn more about end-of-line translations in Chapter 4.
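
One way to watch this translation happen is to write a line in text mode
and read the same file back in binary mode. This is a minimal sketch that
assumes Python 3’s default text-mode newline handling; the tempfile usage
and the lines.txt filename are assumptions of the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, 'lines.txt')
    with open(path, 'w', encoding='utf-8') as f:    # text mode: '\n' is mapped on output
        f.write('spam\n')
    with open(path, 'rb') as f:                     # binary mode: the bytes really stored
        raw = f.read()
    with open(path, encoding='utf-8') as f:         # text mode: mapped back to '\n' on input
        text = f.read()

print(repr(os.linesep))     # platform's line terminator
print(raw)                  # ends with os.linesep, encoded as bytes
print(repr(text))           # always ends with a plain '\n'
```

On Windows the stored bytes end in \r\n, while on Unix-like systems they
end in \n; the text read back in text mode is the same in both cases.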

Common os.path Tools

The nested module os.path provides a large set of directory-related tools
of its own. For example, it includes portable functions for tasks such as
checking a file’s type (isdir, isfile, and others); testing file existence
(exists); and fetching the size of a file by name (getsize):

>>> os.path.isdir(r'C:\Users'), os.path.isfile(r'C:\Users')
(True, False)
>>> os.path.isdir(r'C:\config.sys'), os.path.isfile(r'C:\config.sys')
(False, True)
>>> os.path.isdir('nonesuch'), os.path.isfile('nonesuch')
(False, False)
>>> os.path.exists(r'c:\Users\Brian')
False
>>> os.path.exists(r'c:\Users\Default')
True
>>> os.path.getsize(r'C:\autoexec.bat')
24
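
The files named in that session exist only on some Windows machines. As a
platform-neutral sketch of the same tests, the code below creates its own
directory and a 24-byte file first; the tempfile usage and the data.txt
and nonesuch names are assumptions of the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    filepath = os.path.join(workdir, 'data.txt')
    with open(filepath, 'wb') as f:
        f.write(b'x' * 24)                          # exactly 24 bytes

    absent = os.path.join(workdir, 'nonesuch')      # never created

    dir_tests = (os.path.isdir(workdir), os.path.isfile(workdir))
    file_tests = (os.path.isdir(filepath), os.path.isfile(filepath))
    absent_tests = (os.path.isdir(absent), os.path.isfile(absent),
                    os.path.exists(absent))
    size = os.path.getsize(filepath)

print(dir_tests)        # directory: (True, False)
print(file_tests)       # simple file: (False, True)
print(absent_tests)     # missing name: all False
print(size)             # 24
```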

The os.path.isdir and os.path.isfile calls tell us whether a filename is a
directory or a simple file; both return False if the named file does not
exist (that is, nonexistence implies negation). We also get calls for
splitting and joining directory path strings, which automatically use the
directory name conventions on the platform on which Python is running:

>>> os.path.split(r'C:\temp\data.txt')
('C:\\temp', 'data.txt')
>>> os.path.join(r'C:\temp', 'output.txt')
'C:\\temp\\output.txt'
>>> name = r'C:\temp\data.txt'                            # Windows paths
>>> os.path.dirname(name), os.path.basename(name)
('C:\\temp', 'data.txt')
>>> name = '/home/lutz/temp/data.txt'                     # Unix-style paths
>>> os.path.dirname(name), os.path.basename(name)
('/home/lutz/temp', 'data.txt')
>>> os.path.splitext(r'C:\PP4thEd\Examples\PP4E\PyDemos.pyw')
('C:\\PP4thEd\\Examples\\PP4E\\PyDemos', '.pyw')

os.path.split separates a filename from its directory path, and
os.path.join puts them back together, all in entirely portable fashion
using the path conventions of the machine on which they are called. The
dirname and basename calls here return the first and second items returned
by a split simply as a convenience, and splitext strips the file extension
(after the last .). Subtle point: it’s almost equivalent to use string
split and join method calls with the portable os.sep string, but not
exactly:

>>> os.sep
'\\'
>>> pathname = r'C:\PP4thEd\Examples\PP4E\PyDemos.pyw'
>>> os.path.split(pathname)                               # split file from dir
('C:\\PP4thEd\\Examples\\PP4E', 'PyDemos.pyw')
>>> pathname.split(os.sep)                                # split on every slash
['C:', 'PP4thEd', 'Examples', 'PP4E', 'PyDemos.pyw']
>>> os.sep.join(pathname.split(os.sep))
'C:\\PP4thEd\\Examples\\PP4E\\PyDemos.pyw'
>>> os.path.join(*pathname.split(os.sep))
'C:PP4thEd\\Examples\\PP4E\\PyDemos.pyw'

The last join call requires individual arguments (hence the *) but doesn’t
insert a first slash because of the Windows drive syntax; use the
preceding str.join method instead if the difference matters. The normpath
call comes in handy if your paths become a jumble of Unix and Windows
separators:

>>> mixed = 'C:\\temp\\public/files/index.html'
>>> mixed
'C:\\temp\\public/files/index.html'
>>> os.path.normpath(mixed)
'C:\\temp\\public\\files\\index.html'
>>> print(os.path.normpath(r'C:\temp\\sub\.\file.ext'))
C:\temp\sub\file.ext
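
The sessions above were run on Windows, where os.path follows Windows
rules. To try the same calls on any machine, one option is to import the
two flavors of os.path directly: ntpath implements the Windows rules and
posixpath the Unix rules, and both can be imported on every platform. The
sketch below assumes that approach and uses a sample mixed-separator path
of its own.

```python
import ntpath       # the Windows flavor of os.path, importable on any platform
import posixpath    # the Unix flavor of os.path

win = r'C:\PP4thEd\Examples\PP4E\PyDemos.pyw'

# split peels off just the last component; str.split breaks at every separator.
win_split = ntpath.split(win)
win_parts = win.split('\\')

# Rejoining with the separator string restores the original exactly...
rejoined_str = '\\'.join(win_parts)
# ...but ntpath.join treats a bare 'C:' as a drive, so no slash follows it.
rejoined_path = ntpath.join(*win_parts)

# normpath rewrites forward slashes to backslashes under Windows rules.
mixed = 'C:\\temp\\public/files/index.html'         # sample path for this sketch
normalized = ntpath.normpath(mixed)

# The same call under POSIX rules splits on forward slashes instead.
unix_split = posixpath.split('/home/lutz/temp/data.txt')

for result in (win_split, win_parts, rejoined_str, rejoined_path,
               normalized, unix_split):
    print(result)
```

In everyday scripts, plain os.path remains the better choice, since it
picks the right flavor for the host automatically; importing a flavor by
name is mainly useful for processing paths that came from another system.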

This module also has an abspath call that portably returns the full
directory pathname of a file; it accounts for adding the current directory
as a path prefix, .. parent syntax, and more:

>>> os.chdir(r'C:\Users')
>>> os.getcwd()
'C:\\Users'
>>> os.path.abspath('')                         # empty string means the cwd
'C:\\Users'
>>> os.path.abspath('temp')                     # expand to full pathname in cwd
'C:\\Users\\temp'
>>> os.path.abspath(r'PP4E\dev')                # partial paths relative to cwd
'C:\\Users\\PP4E\\dev'
>>> os.path.abspath('.')                        # relative path syntax expanded
'C:\\Users'
>>> os.path.abspath('..')
'C:\\'
>>> os.path.abspath(r'..\examples')
'C:\\examples'
>>> os.path.abspath(r'C:\PP4thEd\chapters')     # absolute paths unchanged
'C:\\PP4thEd\\chapters'
>>> os.path.abspath(r'C:\temp\spam.txt')
'C:\\temp\\spam.txt'

Because filenames are relative to the current working directory when they
aren’t fully specified paths, the os.path.abspath function helps if you
want to show users what directory is truly being used to store a file. On
Windows, for example, when GUI-based programs are launched by clicking on
file explorer icons and desktop shortcuts, the execution directory of the
program is the clicked file’s home directory, but that is not always
obvious to the person doing the clicking; printing a file’s abspath can
help.
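
A script might report where its output will really land along these lines.
This is only a sketch: the output.txt filename is hypothetical, and the
comparisons simply restate abspath’s behavior relative to whatever the
current working directory happens to be.

```python
import os

cwd = os.getcwd()

# Tell the user where a relative filename will actually be stored.
outname = 'output.txt'                              # hypothetical output file
print('Saving to:', os.path.abspath(outname))

# abspath expands relative names against the current working directory.
same_as_cwd = os.path.abspath('') == cwd == os.path.abspath('.')
in_cwd = os.path.abspath(outname) == os.path.join(cwd, outname)
parent = os.path.abspath('..') == os.path.dirname(cwd)

print(same_as_cwd, in_cwd, parent)
```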
